fortran-lang / stdlib

Fortran Standard Library
https://stdlib.fortran-lang.org
MIT License
1.09k stars 167 forks source link

How should stdlib handle single, double, quadruple precision types #25

Closed certik closed 3 years ago

certik commented 4 years ago

When we declare a real variable in our API, we have to use some variable to represent the real kind. Available options for single, double and quadruple precision that people have used in codes:

  1. real(sp), real(dp), real(qp)
  2. real(real32), real(real64), real(real128)
  3. real(r32), real(r64), real(r128)
  4. real(float32), real(float64), real(float128)
  5. real(f32), real(f64), real(f128)
  6. real(wp) (for default real)
  7. real(r4), real(r8), real(r16)

I will keep appending to this list if we find some code that uses different names.

Those variables must be defined somewhere. I will use the case 1. below, for other cases we simply substitute different names. The available options:

a. stdlib_types module that provides sp, dp, qp for single, double and quadruple precision as follows:

integer, parameter :: sp=kind(0.), &             ! single precision
                      dp=kind(0.d0), &           ! double precision
                      qp=selected_real_kind(32)  ! quadruple precision

The main idea behind this option is that there is a module in stdlib that provides the types and every other module uses it. The proposal #13 is similar to it. There are several options how the types are defined inside the module: one can define sp and dp using selected_real_kind also. Another alternative is to define sp, dp and qp using iso_fortran_env as in b. to a. The module stdlib_types can be called differently also.

b. use iso_fortran_env, only: sp=>real32, dp=>real64, qp=>real128 (if the case 2. above is used, then one does not need to rename, so it simplifies to just use iso_fortran_env, only: real32, real64, real128). Unlike a., this option does not introduce a new module in stdlib. One simply uses iso_fortran_env everywhere directly.

c. use iso_c_binding, only: sp=>c_float, dp=>c_double, qp=>c_float128. Unlike a., this option does not introduce a new module in stdlib.

I will keep this list updated if more options become available.

certik commented 4 years ago

My own preference that I have been using is 1. together with a., so 1a.

certik commented 4 years ago

I think the most important is to figure out which of 1. - 7. to use. One criteria is verbosity. These kinds are used all over the place every time you declare a real. So using a two character variable is the shortest, which leads to either 1., 6. or 7. The option 6. only works for a "default" real (whatever it is), and option 7. has three characters for quadruple precision. So the option 1. is the only option that allows to select any precision with just two characters. Using the terms single, double and quadruple are well established terms in Fortran, for example Lapack and other codes use the convention in the subroutine names as s for single and d for double. One could in principle use just single variables s, d and q, but that would clash with user defined variables. So appending p as "precision" is natural. Thus I really like sp, dp and qp, as they are short, thus easy to write and read and follow established Fortran precision terminology.

My next favorite would be 7., as it is two characters for single and double precision, and three characters for quadruple. After that probably 3. and then 2. The 7., 3. and 2. are less clear to read and write, because in Fortran when we talk about precision, most people use the terms single, double and quadruple. It takes time to convert the terms into numbers and to be honest I have to think a bit -- is double precision 8 bytes? How many bytes is quadruple precision? And in Fortran, we typically do not think in terms of how many bits or bytes variables occupy in memory, but rather we think in terms of precision of our numbers. So from this perspective, option 1. is more natural than options 7., 3. or 2.

jvdp1 commented 4 years ago

This should also consider integer (int32,int64,..). My own preference would be to use the definitions of iso_fortran_env directly (so, to use real32, real64, int32,...). It is a bit more verbose, but it is already standart.

So, in your example, it would be 2-.

milancurcic commented 4 years ago

I also like to use built-in type kinds from iso_fortran_env, so my preference is option 2. I like the shorter variants, so 1b, 3b, and 5b are all palatable to me.

There have been concerns that the constants from iso_fortran_env are only portable in terms of byte width and not precision (https://software.intel.com/en-us/blogs/2017/03/27/doctor-fortran-in-it-takes-all-kinds). For example, real128 is not guaranteed to be true quad-precision. I don't know how much is this still an issue.

certik commented 4 years ago

@jvdp1 I think you meant to write 2b. It's a good point that the case 2. uses names that are in the standard.

Note that there is a long discussion about this topic here:

https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=COMP-FORTRAN-90;2d0ccba2.1304

One must click on "Next" by Topic to see the next messages in the thread, so I am posting the links directly here and I'll try to summarize the ideas there later:

https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=18711 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=19277 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=20030 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=20905 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=21662 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=22596 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=23429 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=24008 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=24879 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=25654 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=26570 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=27946 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=28555 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=28767 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=29559 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=30358 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=31147 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=32768 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=33612 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=34260 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=35079 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=35663 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=36599 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=37105 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=37938 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=38784 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=39675 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=40553 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=41513 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=42275 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1304&L=COMP-FORTRAN-90&D=0&P=43064

ivan-pi commented 4 years ago

Personally, I like options 1a and 1c.

certik commented 4 years ago

@milancurcic in that blog post you linked to, Steve Lionel is pretty clear that he recommends not to use the constants from iso_fortran_env.

The safest is to define a module, where we define our our constants (option a.), we can call them real64 if you like that name (so option 2 above --- although as I conclude below, I don't like that name at all), but we ensure that they are the right kind on all compilers and platforms. As indicated, the real128 didn't actually provide the right kind on all compilers. That issue is completely avoided if we introduce a module with our own real64 variable. Because we can then easily ensure that it is declared properly to work in all compilers. How it is defined inside our stdlib_types module is another issue, I don't really care about that.

More generally, tying double and quadruple precision to bytes is artificial. The same with real16 and bfloat16 (also 16bit, but different than real16).

Rather, I recommend we provide our numerical code for single, double and quadruple precision (perhaps also half precision and/or bfloat16 later if those types become available) and that we control those kinds and declare them in some module. So option b.

Regarding that it is already in the standard: we are designing a standard library. So we can use any of the options above I think if we determine that they are better than what is already in the standard.

I think we'll not be able to resolve this quickly. But I do want to move forward. One compromise that would be ok with me right away is: 1b, that is:

use iso_fortran_env, only: sp=>real32, dp=>real64, qp=>real128

and get rid of the stdlib_types module. That still allows us to easily move the definition to a separate module if needed, without having to rewrite all the code --- as I said, I don't care that much how it is defined, as definitions can be changed relatively easily, but I do care a lot how the actual code that defines real variables looks like. @milancurcic indicated he would be ok with it. @jvdp1 what do you think?

rweed commented 4 years ago

This is one of things I was trying to address with issue #13. I commonly use either RK or WP for the default real kind but am moving to use just WP since that is the most commonly used name I see in a lot of other projects (LAPACK90 etc. I think). My position is we should use a multi-"layer" approach where you define (as I do in #13) the equivalent of REAL64 etc as the base names, shorter aliaises for the specific types (DP, QP etc) and then a default type name (WP etc).

marshallward commented 4 years ago

I'd suggest avoiding some of the simpler two-character names like sp or rk for the kind handles, since it would be best to preserve the namespace for users as much as possible. The verbosity does not bother me, and as @rweed suggests one can always define a "working precision" wp for their own code.

One possible solution to management of kinds is to try and identify the available data types which have known hardware implementations, or at least have been defined in some sense (IEEE 754), and then name the kinds after these well-defined formats. The library would then map the kind to the type, and the user could invoke the type unambiguously.

For example, descriptive names like float and double (or float32 and float64) could be explicitly assigned to their IEEE types. float80 could explicitly map to the x86 80-bit float, doubledouble onto the IBM XL quad type, and so on. The build system would determine which ones exist and which kind to assign to them. (Missing ones could be set to a defined MISSING_KIND)

This would seem to me to avoid the issues with real128, since "16-byte float" was not well-defined for a long time. By addressing this at library build time, we would never have to deal with this ambiguity.

This is not something that can be done in the language standard, since it strives to be hardware-agnostic, and a user without stdlib would have to resort to selected_real_kind or kind(0.0) checks. But it may make more sense in a library whose parameters are set at build time.

And if there's any types that I have missed (float8 or a decimal float, for example), then they could be added to a custom implementation of stdlib, with the hope that the label would eventually be integrated into the reference library.

jvdp1 commented 4 years ago

I think we'll not be able to resolve this quickly. But I do want to move forward. One compromise that would be ok with me right away is: 1b, that is:

use iso_fortran_env, only: sp=>real32, dp=>real64, qp=>real128 and get rid of the stdlib_types module. That still allows us to easily move the definition to a separate module if needed, without having to rewrite all the code --- as I said, I don't care that much how it is defined, as definitions can be changed relatively easily, but I do care a lot how the actual code that defines real variables looks like. @milancurcic indicated he would be ok with it. @jvdp1 what do you think?

@certik I am ok with your proposition. It would allow us to move forward in the different modules, while discussing what is the best solution for everyone, and maybe defining a stdlib_types.f90 that could be used across all modules.

certik commented 4 years ago

@jvdp1 thank you. I'll update the PR #23.

(Btw, we should probably call it stdlib_kinds.f90 instead of stdlib_types.f90 if we introduce the module later.)

@marshallward in order to move #23 forward, what do you suggest we do? What names should we use for the kind variables? It seems most people would be ok with sp, dp and qp. Your objection:

I'd suggest avoiding some of the simpler two-character names like sp or rk for the kind handles, since it would be best to preserve the namespace for users as much as possible.

The sp, dp and qp names will be used internally in stdlib. User code is free to use other kinds variables if they wish. So it would not clash. Can you elaborate on this concern a little bit?

marshallward commented 4 years ago

If they're only used internally then there's no issue. I think I misunderstood the issue and had thought that the library would be providing types for users.

Within the library there's no objection from me.

certik commented 4 years ago

Note: I am going to wait until the list at https://github.com/fortran-lang/stdlib/wiki/List-of-popular-open-source-Fortran-projects is reasonably complete, and then I'll update the information below.

I'll use this comment to collect the usage in codes. I put the number of GitHub stars in parentheses.

Codes that I know about off top of my head:

Codes found using this search query at GitHub (for some reason this query does not show any of the highly starred projects above...):

I then tried this query:

For some reason the GitHub queries do not return highly starred projects... So I started #28 to ensure the above analysis captures all the popular projects.

certik commented 4 years ago

@marshallward initially it would provide these types to users, but above we decided on a compromise that for now it would be internal to stdlib, until we can agree on a solution.

scivision commented 4 years ago

another factor to consider is that some compilers including GCC 9 can be buggy with quad-precision. Sometimes compile/link errors, sometimes runtime errors. I would consider making real128 be optionally included perhaps as a submodule for everyplace real128 might be a desired kind. The option to include could be manual or a build system check check_fortran_source_runs()

also consider for long term, growing momentum of half-precision float16 in general, including Fortran e.g. NAG. So having a modular approach to kinds may be useful and not burdensome.

zbeekman commented 4 years ago

This is certainly a complicated issue.

One of the wonderful things about Jin2For is it queries the kind parameter arrays provided by iso_fortran_env so you can implement an interface for all available intrinsic kinds. This is very nice from a client-code usability perspective: Call a procedure in stdlib that takes any intrinsic kind and it will work (baring bugs, such as those mentioned above by @scivision. The downside is that users either need to have and run Jin2For locally to generate code that matches their compiler, or we would need to provide a pre-processed source distribution for each compiler.

Because of these limitations in Jin2For, I'd prefer to see kinds supported that correspond to standardized floating point (and integer) formats with standardized ABIs like IEEE 754. This way they should exist across all compilers and be more interoperable with C and other languages. My inclination is to go with @certik's proposed 1-3 or 7 but use option c: iso_c_binding to declare the types in hopes that they exist and match the standardized floating point model the programmer is thinking of. But this may also be the case with approaches "a" and "b".

I also note that care must be taken with approach "a", i.e., selected_real_kind(32) since kinds like the 80-bit float might be present that you were not expecting. For example, if one knows that double precision (IEEE 754) has a precision() of 15, then one might assume the next available real kind must be quad precision and request selected_real_kind(precision(1.0D0) + 1), which, for GFortran, would return a real kind of 10, which appears to be the x86 80-bit real, and is likely not what the naive programmer intended.

scivision commented 4 years ago

In my opinion option (c) using iso_c_binding would be necessary to work with many real-world projects that interface with C or use C interfaces. To keep relevance and growth of Fortran I think this is necessary. Almost every project I work with needs to have a C-like interface, and that requires iso_c_binding in my opinion. I think iso_fortran_env real32/64 int32/64 is more aesthetically appealing, but I suppose if we're going to be motivated by interfaces to C than maybe it's best to go all with the way with iso_c_binding instead I think there are good and valid reasons for the other methods too in a pure-Fortran environment, but I think there are too many important real-world use cases where there must be an interface to other languages, whether hidden or visible to the user (e.g. libraries that internally use C interfaces, or those whose ABI explicitly uses C interfaces)

milancurcic commented 4 years ago

In my opinion option (c) using iso_c_binding would be necessary to work with many real-world projects that interface with C or use C interfaces. To keep relevance and growth of Fortran I think this is necessary. Almost every project I work with needs to have a C-like interface, and that requires iso_c_binding in my opinion. I think iso_fortran_env real32/64 int32/64 is more aesthetically appealing, but I suppose if we're going to be motivated by interfaces to C than maybe it's best to go all with the way with iso_c_binding instead

Hmm, I haven't considered this but I think it's a good point and I agree with it. If this would help non-Fortran projects to more easily interface Fortran stdlib (I think so), we should seriously consider this.

certik commented 4 years ago

Would it make sense to put the kinds variables (say sp, dp, qp or whatever we decide) into a module stdlib_kinds.f90, and inside the module we can for example use iso_c_binding to define them? That way if the definition had to be changed for some reason, only one module stdlib_kinds.f90 has to change, the rest of the code stays the same.

jvdp1 commented 4 years ago

Would it make sense to put the kinds variables (say sp, dp, qp or whatever we decide) into a module stdlib_kinds.f90, and inside the module we can for example use iso_c_binding to define them? That way if the definition had to be changed for some reason, only one module stdlib_kinds.f90 has to change, the rest of the code stays the same.

If we are going to use iso_c_binding, then I think it would make sense indeed. A module stdlib_kind.f90 would replace iso_fortran_env for the different kinds. Therefore such a module should also define the different kinds for integers, to avoid something like:

use iso_fortran_enc, only: int32,int64
use stdlib_kinds, only: sp, dp, qp
nshaffer commented 4 years ago

Let's not forget complex kinds! (and character kinds, but that warrants separate discussion)

zbeekman commented 4 years ago

Let's not forget complex kinds! (and character kinds, but that warrants separate discussion)

I totally agree. I would guess that most people are implicitly considering complex kinds when discussing reals because the available kinds for complex are the same as for reals, with the same properties (range, precision, epsilon, tiny, huge, etc.)

As for character kinds, that's a trickier issue.

certik commented 4 years ago

I implemented this latest idea of a stdlib_kinds module using iso_c_bindings at #63.