Closed milancurcic closed 3 years ago
Actually I propose 3c).
c) The git repository contains the templated code, depends on a 3rd party tool, and does not contain any autogenerated files. We then create release tarballs automatically on a CI. A release tarball contains all the necessary generated files, and its only dependencies are cmake (or make) and a Fortran compiler. It does not contain the git history, the templated files, any CI files, or other things that are not needed to actually build the library. Users, as well as distributions (Debian, Ubuntu, Homebrew, Conda, Spack, etc.), only use the tarball, not the git repository.
I follow exactly this approach with LFortran, and it seems to work great. The advantage is that the git repository does not have autogenerated files, which greatly simplifies PRs (a simple diff versus hundreds of lines of modified autogenerated files) and makes it obvious how things should be modified --- so that people who want to contribute do not accidentally modify the autogenerated files, or forget to generate the files. Rather, the files are automatically generated using a CI, so they are always generated correctly.
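A minimal sketch of the packaging step in option 3c, assuming a hypothetical layout where templates carry a .t90 suffix and CI files live in a .ci/ directory. The function name, the excluded paths, and the `generated` mapping are illustrative assumptions, not the actual LFortran or stdlib release script.

```python
import io
import tarfile
from pathlib import Path

# Assumed layout (hypothetical): templates end in ".t90", CI files live in ".ci/".
EXCLUDED_SUFFIXES = {".t90"}
EXCLUDED_DIRS = {".git", ".ci"}


def build_release_tarball(repo, generated, out):
    """Pack repo files plus generated sources, skipping templates, git and CI files.

    repo: Path to the working tree; generated: dict of archive name -> source text
    produced by the templating tool; out: Path of the tarball to write.
    Returns the sorted list of archive member names.
    """
    names = []
    with tarfile.open(out, "w:gz") as tar:
        for path in sorted(repo.rglob("*")):
            rel = path.relative_to(repo)
            if not path.is_file():
                continue
            if path.suffix in EXCLUDED_SUFFIXES or EXCLUDED_DIRS & set(rel.parts):
                continue
            tar.add(path, arcname=rel.as_posix())
            names.append(rel.as_posix())
        # Generated files never touch the repository; they go straight into the archive.
        for name, text in generated.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
            names.append(name)
    return sorted(names)
```

In a CI job, `generated` would be filled by running the templating tool over the templates, so the tarball can be built with only cmake (or make) and a Fortran compiler.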
Great! I didn't think of this and indeed it seems to me like the best way to go.
The best templating tool I have found is Jin2For. It uses Jinja2 templating which people may be familiar with from web technology stuff and is the templating back end for FORD. These are Python based, which is likely a language that Fortran developers may be familiar with. It can auto-generate default type aliases, kind info, and declarations by querying the compiler and enumerating ISO_Fortran_ENV's real_kinds, integer_kinds, logical_kinds and character_kinds arrays.
By providing a generic implementation for each intrinsic kind (where it is sensible to do this) the user doesn't need to care about setting dp, sp, rk, or whatever other convention you have for selecting kinds. Things just work™️. The downside is that compilers are not required to support all kinds, so you end up generating code specific to the kinds that a given compiler supports. This is not necessarily a bad thing, but it means that you may want to distribute different source versions tailored to different compilers. Since Fortran doesn't have a standard/interoperable ABI this is not really an issue at all, IMO.
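To make the idea of one generic implementation per kind concrete, here is a rough Python sketch of what a templating tool does under the hood: loop over a list of kinds and emit one specific procedure per kind plus the generic interface. The module name `example_mean`, the procedure `mean`, and the kind list are hypothetical. A real tool such as Jin2For would obtain the kinds by querying the compiler rather than from a hard-coded list.

```python
# Template for one specific procedure; {k} is replaced by a kind name.
SPECIFIC = """\
    function mean_{k}(x) result(res)
        real({k}), intent(in) :: x(:)
        real({k}) :: res
        res = sum(x) / size(x)
    end function mean_{k}
"""


def render_module(kinds):
    """Return Fortran source with one specific per kind and a generic interface."""
    head = [
        "module example_mean",
        "    use iso_fortran_env, only: " + ", ".join(kinds),
        "    implicit none",
        "    interface mean",
    ]
    head += [f"        module procedure mean_{k}" for k in kinds]
    head += ["    end interface mean", "contains"]
    body = "".join(SPECIFIC.format(k=k) for k in kinds)
    return "\n".join(head) + "\n" + body + "end module example_mean\n"


if __name__ == "__main__":
    # Kinds assumed to be supported by the compiler at hand.
    print(render_module(["real32", "real64"]))
```

Because the kind list differs between compilers, the generated module differs too, which is the portability trade-off described above.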
I reviewed jin2for and I like its simplicity (a minimal and to the point tool) and the fact that it uses an existing templating language rather than inventing a new one. I think it's a good candidate.
Let's try to use jin2for and see how it goes.
In general, I think the Fortran language itself should make it easier to write subroutines that operate on different kinds. This is something I would love to experiment with in LFortran in the future, and using jin2for is a solid starting point -- the future goal would be to simplify the syntax using (future) Fortran features.
I gave jin2for a try with loadtxt/savetxt (see https://github.com/jvdp1/stdlib/blob/loadtxt_autogen/src/stdlib_experimental_io.F90 and the other tests/loadtxt/*.F90 files).
I am not sure how it should be done for cases that involve both integer and real kinds. Should the pre-defined templates be used? If yes, how can the kinds defined as sp, dp, qp be used?
Anyway, jin2for seems to be a nice and useful tool, and the option 3c proposed by @certik seems to be a good approach (not implemented in my branch).
What are the disadvantages of using CPP for this? I am worried about deepening the necessity on external tools, which can hinder portability.
CPP is almost always available and often baked into the compiler (I think it's literally a library inside gfortran). Another advantage of CPP is that the compiler is often aware of the step, and debugging can point directly to the template file, rather than a copy placed in some scratch directory of which the user is unaware.
We've used it in FMS for this task without much issue. Readability and debugging are the only major drawbacks, but this would be true for any templating approach.
(I'm on the road right now but can supplement with links when I get a chance.)
As far as I understand CPP is more limited in what can be done with it. I'm surprised that you could do this with CPP alone.
In the scenario 3c, the external tool is required only of stdlib developers and not of end users, so I don't see much of a portability issue.
This post outlines what we can do with FMS with CPP templating:
https://github.com/j3-fortran/fortran_proposals/issues/4#issuecomment-544190322
Thanks @marshallward, I just read the comment and the sources you linked and I agree, it does seem quite bloated and is likely to get more complicated when considering different combinations of argument types and kinds.
I think this illustrates well the downsides -- with CPP we can't loop, but only define/undefine macros and branch. There may be more esoteric stuff to it, but this is what I've seen.
@jvdp1 I looked at your templates and the code is quite clean and readable to my eyes. I like it. We probably shouldn't use the .F90 suffix here -- .F90 is still a valid Fortran source file, whereas these templates aren't. I think jin2for suggests .t90 suffix for templates.
I tried to implement the IO module with a CPP template (see https://github.com/jvdp1/stdlib/tree/loadtxt_cpp/src ). Honestly, it was easier for me to use CPP (it passed the CI) than jin2for. I could also extend the IO module to integers using CPP. For these simple subroutines, CPP is easy to use. But using CPP could become quite difficult when combining multiple options.
@milancurcic I followed the syntax implemented by @zbeekman in one of his libraries. I agree that the .t90 suffix might be better.
@jvdp1 thanks a lot for implementing both approaches. Here they are, side by side:
CPP: https://github.com/jvdp1/stdlib/blob/7f246a2b75ed0e6f584e2f820776cf80530dd8e6/src/stdlib_experimental_io.F90 (165 lines + 42 lines in loadtxt.inc and 25 lines in savetxt.inc, total of 232 lines)
jin2for: https://github.com/jvdp1/stdlib/blob/0ee94604f5b39283f6054d23be00547f4eaec51a/src/stdlib_experimental_io.F90 (149 lines)
It seems the jin2for version is a lot shorter. Am I right? Was it more difficult to implement because it is new, but as we (now) have an example how to use it, it will be perhaps even easier than CPP?
@certik jin2for is indeed less verbose, and I think less error prone in this example.
jin2for was more difficult because it was new (e.g., I couldn't extend the subroutines to support integers (as I did with CPP), but I didn't try hard to find the solution; @zbeekman may have some hints).
Both approaches have pros and cons (e.g., CPP passed the CI without any change to it, while it was not the case for jin2for).
@jvdp1 we have to update our CI to support jin2for obviously. I would not hold it against it. :)
I can see the advantages of jin2for, most notably iteration, and think option 3c addresses my concerns about portability. I also agree that the files should use the t90 suffix, or at least not .[fF]90.
I had resisted Jinja2 integration in another project, because I was concerned that the Jinja2 tokens may clash with the native file's own tokenisation (usually various config files); Jinja2's syntax was designed to safely work with HTML and not much else. But I also wondered if I was being too conservative.
Are there any known limitations to using Jinja2 on Fortran markup, such as token mix ups?
Does Fortran use { for anything? The combination {% is almost certainly safe. And if there is some possibility of a clash, I think we can tackle it on a case by case basis by rewriting things appropriately.
Down the road, I would like to prototype some of this limited templated functionality in LFortran and then propose it for the Fortran language itself. jin2for is a good start, as the code looks pretty nice. If the Fortran language were extended, then the syntax would perhaps get even better. And LFortran could in the future be used instead of jin2for to do the rewrite, until all compilers support it.
I was really hoping that lexical macro processing would get re-introduced into this upcoming Fortran standard. In fact, a fleshed-out macro processing specification was already given in Fortran 2008 drafts (e.g., https://j3-fortran.org/doc/year/07/07-007.pdf) but was dropped then. It was also considered for the upcoming standard as a means of supporting generic programming. But we know how that went.
After templating/macro processing was forgone for F2020, I looked into m4 as a solution for my own generic-interface-producing needs, since it is the tool gfortran uses to generate specific implementations of generic intrinsics. Its strengths are its power and that it is a standard POSIX utility. It has been fun to learn, but I do not think it is a good solution for a standard library. POSIX standardization is not enough to compensate for the fact that it's really hard for a 21st century programmer to grok and I think it will lead to heavy technical debt. The other downside is that it's hard to make m4 programs look like marked-up Fortran source, so that it could be tricky to "port" the m4 workflow to a hypothetical standard macro/templating scheme (assuming J3 ever produces one).
I slightly modified the CPP implementation (https://github.com/jvdp1/stdlib/tree/loadtxt_cpp) to clarify a few things, and renamed the files .F90 to .t90 (https://github.com/jvdp1/stdlib/tree/loadtxt_autogen).
These 2 options seem to be the most acceptable among all proposed. Should we make a choice now?
One thing I don't understand about the cpp approach shown is what it does better than, e.g.,
    module ex
        use iso_fortran_env, only: real32, real64, int32, int64
        implicit none
        interface foo
            module procedure foo_real32
            module procedure foo_real64
            module procedure foo_int32
            module procedure foo_int64
        end interface foo
    contains
        ! One skeleton per type/kind; the shared body lives in foo.inc.
        function foo_real32(x) result(y)
            real(real32), intent(in) :: x
            real(real32) :: y
            include "foo.inc"
        end function foo_real32
        ! etc. (foo_real64, foo_int32 and foo_int64 follow the same pattern)
    end module ex
The main downside of this approach is the repetition of each subroutine "skeleton" and the need to manually populate the interface blocks, but the cpp example has those same issues at the cost of introducing a foreign (albeit well-supported) program. I see cpp as having the worst of both worlds: it's an external tool, but it's not significantly more powerful (in this application) compared with the technique above. Of course, this assessment is invalid if I've overlooked some cpp technique that's not used in the examples posted so far.
With this proposed scenario, 1 file per skeleton would be needed (if I understand well your proposition), while with the CPP approach, all skeletons could be included in 1 same file. CPP is an external tool, but it is well supported by most compilers. However, I don't appreciate the use of additional .inc files in both approaches.
The jin2for approach also requires an external tool. While no additional files were used in my example, jin2for might require them for more complex implementations.
However, if we use a strategy as described by @certik where end users and distributions only use tarballs automatically generated by a CI, using an external tool should not be a problem.
> With this proposed scenario, 1 file per skeleton would be needed (if I understand well your proposition), while with the CPP approach, all skeletons could be included in 1 same file. CPP is an external tool, but it is well supported by most compilers. However, I don't appreciate the use of additional .inc files in both approaches.
I didn't show it, but you would include "foo.inc" for each type you want to implement. This works as long as the contents of "foo.inc" are actually type-generic. I think this is totally equivalent to the cpp approach. I also dislike the disembodied ".inc" files, but it is the most economical approach the standard gives us right now.
> The jin2for approach also requires an external tool. While no additional files were used in my example, jin2for might require them for more complex implementations. However, if we use a strategy as described by @certik where end users and distributions only use tarballs automatically generated by a CI, using an external tool should not be a problem.
Agreed. This is the best-sounding approach, provided we trust the external tool will continue to be maintained until we have proper generics facilities in the standard and in compilers.
I haven't gone through this thread in the detail it deserves yet, but a few broad observations:
The biggest problem with the automatically generated types is that they are very non-portable: They basically interrogate the available kinds from iso_fortran_env and then just blindly use them. So using the default aliases & kinds provided by this and generating them from GFortran may (almost certainly will, but I have yet to confirm) generate code that can't run with Intel's ifort.
My personal preference is to use Jin2For, but don't use the built in type declarations, aliases and kinds that are created from compiler introspection. Instead, for reals at least, attempt to target single, double and quad precision. CMake introspection can be used to confirm which kinds exist for a given compiler and then Jin2For can be used to generate interfaces and implementations for each kind supported by the compiler.
Otherwise, if you use the built in t.decl, t.alias, t.kind macros you will be generating code specific to the compiler being used that won't be portable.
Since the existence of various kinds is not guaranteed by the standard, much less the integer associated with each kind, this is a rather sticky situation. But I would rather loop over a list of kinds (possibly generated from CMake introspection) using Jin2For templates than contend with the awkward square peg in a round hole that is CPP and other non-standardized pre-processing. The advantages of Jin2For (or Jinja2 really...) are its widespread use in other domains (so that it is battle hardened and good enough to be popular) and the fact that it's Python based and extensible.
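The preference stated above (target sp, dp and qp, but only emit the kinds a given compiler supports) can be sketched as a filter followed by a loop. Everything here is an assumption for illustration: the bit widths attached to sp/dp/qp, the function names, and the idea that a configure-time probe (for example CMake introspection) hands over the set of supported storage sizes.

```python
# Target precisions named in the thread. The bit widths are an assumption for
# illustration; the standard does not guarantee that any of them exist.
TARGET_REALS = {"sp": 32, "dp": 64, "qp": 128}


def select_kinds(supported_bits):
    """Keep only the target kinds whose storage size the probe reported as supported."""
    return [name for name, bits in TARGET_REALS.items() if bits in supported_bits]


def render_interface(generic, kinds):
    """Emit a generic interface listing one specific per selected kind."""
    lines = [f"interface {generic}"]
    lines += [f"    module procedure {generic}_{k}" for k in kinds]
    lines.append(f"end interface {generic}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical probe result for a compiler without 128-bit reals.
    kinds = select_kinds({32, 64})
    print(render_interface("loadtxt", kinds))
```

The same selected list would drive the loop in the template itself, so the interface and the specifics stay in sync for each compiler.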
But, by preprocessing the code for the end user so they don't need Jin2For (unless we want to provide different pre-processed code for different compilers) you lose a non-trivial quantity of its utility. Whereas if you can stick to standardized fortran and a subset of cpp/fpp that's implemented in all major compilers then the user can do the code pre-processing themselves at configure/build time.
The ideal solution for this would be https://github.com/j3-fortran/fortran_proposals/issues/128 in my opinion. But we'll have to probably wait some time for that.
As an alternative to Jin2For, you may also consider the Fypp preprocessor for generating templates. (Disclaimer: I am the main author of Fypp). It has similar loops as Jin2For and additionally also offers macros, so it could be also used for the assert macros (#72). It consists of a single (Python) source file and can be, therefore, easily shipped with the library, so that the build only requires a standard Python (2.6, 2.7 or 3.x) installation.
I like fypp a lot. Having the author in the Fortran community is a huge plus IMO.
What do you think about taking a minimal example function and comparing the jin2for and fypp syntax next to each other? For example:
    integer function sum(a, b)
        integer, intent(in) :: a
        integer, intent(in) :: b
        sum = a + b
    end function sum
Requirements:
a and b can be any of real(sp), real(dp), real(qp), integer(int8), integer(int16), integer(int32), integer(int64), as defined in stdlib_experimental_kinds.f90;
a and b .
The preprocessed source code would result in 49 specific functions.
What would the template look like with jin2for and fypp? What would the invocation look like? Let's compare them side by side.
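The count of 49 is 7 x 7: each of the seven type/kind choices for a paired with each of the seven for b. A small Python sketch of that enumeration follows. The naming scheme `sum_<kind of a>_<kind of b>` is hypothetical, and the result type is deliberately left out because it is not pinned down here.

```python
from itertools import product

# The seven type/kind pairs listed in the requirements.
TYPES = [
    ("real", "sp"), ("real", "dp"), ("real", "qp"),
    ("integer", "int8"), ("integer", "int16"),
    ("integer", "int32"), ("integer", "int64"),
]


def specific_names(generic="sum"):
    """Return one specific-procedure name per (a, b) kind combination."""
    return [f"{generic}_{ka}_{kb}" for (_, ka), (_, kb) in product(TYPES, repeat=2)]


def argument_declarations(a, b):
    """Return the two dummy-argument declarations for one combination."""
    (ta, ka), (tb, kb) = a, b
    return [f"{ta}({ka}), intent(in) :: a", f"{tb}({kb}), intent(in) :: b"]


if __name__ == "__main__":
    names = specific_names()
    print(len(names))
    print(argument_declarations(TYPES[1], TYPES[5]))
```

A jin2for or fypp template would express the same nested loop in its own syntax. The Python version only shows where the combinatorial growth comes from.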
Here's a go at using fypp for the task. I used it in some personal projects a couple years back, but I only really know the basics. This is a pretty naive implementation. Gist
The only snag I ran into was fypp not liking constructions like
#:for i, (k, t) in enumerate(zip(KINDS, TYPES))
...
#:endfor
Not sure if that's a bug or if fypp just doesn't do multi-level tuple unpacking. It's easy to work around in this case, at least.
@nshaffer fypp currently does not support multi-level tuple unpacking due to technical reasons. It could be extended if this really made a huge difference in the user experience. (As it is a preprocessor, I try to keep it as simple as possible to prevent people doing something with it, which they should do in Fortran instead :wink:)
Your example is very neat. In some cases we may also need to loop over ranks. It would then need a simple macro and one additional loop:
#:def ranksuffix(rank)
#{if rank > 0}#(${":" + ",:" * (rank - 1)}$)#{endif}#
#:enddef
#:set ranks = range(6)
...
#:for rank in ranks
some_type, ... :: a${ranksuffix(rank)}$
#:endfor
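For readers unfamiliar with fypp, the expression inside the ranksuffix macro is ordinary Python. The sketch below reproduces the same string logic as a plain Python function so the output for each rank can be checked directly. The helper `declarations` and the type string passed to it are hypothetical.

```python
def ranksuffix(rank):
    """Return '' for a scalar, else an assumed-shape suffix such as '(:,:)'."""
    if rank <= 0:
        return ""
    # One ':' for the first dimension, then ',:' for each further dimension.
    return "(" + ":" + ",:" * (rank - 1) + ")"


def declarations(type_spec, name, ranks):
    """Return one declaration line per rank, mirroring the #:for loop above."""
    return [f"{type_spec} :: {name}{ranksuffix(r)}" for r in ranks]


if __name__ == "__main__":
    for line in declarations("real(dp), intent(in)", "a", range(4)):
        print(line)
```

With `range(6)` as in the fypp snippet, this yields a scalar declaration followed by ranks 1 through 5.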
@aradi Cool, thanks for confirming that about tuple unpacking. I think it's not a problem overall, I just let my Python instincts take the reins. The less-fancy version I arrived at is arguably better.
@aradi @nshaffer Thank you for the examples with fypp.
fypp seems quite flexible, and I like @aradi 's example with loops over ranks (we may actually need that for functions like mean(array), variance(array), ... :) ). So I think this feature has to be considered seriously. Such a feature may be tedious to implement with cpp (and I don't know if it would be possible with jin2for).
An additional advantage is that the author @aradi is involved in the Fortran community.
fypp is interesting. Knew it existed but had not tried it. I have used m4 for that type of thing and just never liked the syntax. So I have used the bash shell and "here" documents. I made make(1) rules for the suffix ".shf" that say to execute the file as a bash script and use the standard output to make a *.f90 file. Works great for me, and bash has all the looping and access to output from commands and can query the system type and everything itself. Since bash is readily available and I am quite familiar with it, that has worked great for me, although I doubt I will get any converts by mentioning it. Anyway, maybe I will quit doing that after giving fypp a better look. Thanks for the enticing examples.
Here fypp is used to generate loadtxt for all kinds, based on @nshaffer 's example.
Earlier I also did it with cpp and jin2for. Among the three implementations, fypp is the most complete (i.e., implemented for all kinds for loadtxt).
Without any research (I could have spent more time in jin2for to extend it to integers, but it was not straightforward for me), fypp was also the easiest one to use for me. cpp may become tedious for more complex cases (e.g., for loops over ranks).
We've been using fypp for this and it's been working well enough.
This question comes up in #34 and elsewhere. How do we implement specific procedures that work on different kinds (sp, dp, qp, int8, int16, int32, int64) as well as characters, where the body of the procedure is the same (can be copy/pasted entirely without breaking it)? Let's first just focus on this scenario, and we can consider more complex cases later.
I know of a few approaches:
1. Repeat the code, that is, implement all specific procedures explicitly. That's what I did in functional-fortran, see https://github.com/wavebitscientific/functional-fortran/blob/master/src/lib/mod_functional.f90. Repeating is fine if you do it once and forget about it. The upside is that you can see the specific code and it needs no extra tooling. The downside is combinatorial explosion if you have procedures that are to handle all combinations of types and kinds. Most procedures are rather simple (one or two arguments), and I ended up with > 3K lines of code for 23 generic procedures. Most work was in editing the argument types to specific procedures, and less work was in copy/pasting of the repeatable content. I don't recommend this approach for stdlib.
2. Approach 1 can be somewhat eased by explicitly typing out the interfaces, and using #include 'procedure_body.inc', defined in a separate file. Then your procedure body collapses to one line. This reduces the total amount of code, but not so much the amount of work needed, as most work is in spelling out the interfaces. This approach still doesn't need extra tooling as a C preprocessor comes with all compilers that I'm aware of.
3. Use a custom preprocessor or templating tool. For example, a function that returns a set of an array:
A template could look like this:
or similar, where the custom preprocessor would spit out specific procedures for all integer and real kinds. Some additional or alternative syntax would be needed if you wanted all combinations of type kinds between arguments.
There may be tools that do this already, and I think @zbeekman mentioned one that he uses. In general, for stdlib I think this is the way to go because we are likely to see many procedures that support multiple arguments with inter-compatible type kinds. The downside (strong downside IMO) is that we're likely to introduce a tool dependency that also depends on another language. If the community agrees, we can use this thread to review existing tools and which would be most fitting for stdlib.
Let's say we pick a tool to do the templating for us, we have two choices:
a) Have user build specifics from templates. In this scenario, the user must install the templating tool in order to build stdlib. I think we should avoid this.
b) Use the templating tool as developers only, and maintain the pre-built specifics in the repo. This means that when we're adding new code that will work on many type kinds, we use the tool on our end to generate the source, and commit that source to the repo (alongside the templates in a separate, "for developers" directory).
Assuming we can find a fitting tool, I'm in favor of the 3b approach here. There may be other approaches I'm not aware of or forgot about. What do you think and any other ideas?