Preprocessor support - Githubissues

LKedward commented 4 years ago

This issue is to ask whether fpm will have any built-in support for preprocessing and how this might look.

I bring this up since I noticed that stdlib is listed in #17 under 'Pure Fortran', however it requires the fypp preprocessor to build from repo source.

everythingfunctional commented 4 years ago

We've had discussions about it. I believe we decided we want to settle on a preprocessor, and just always use it. I'd have to go find that discussion, but I believe we settled on fypp.

certik commented 4 years ago

Yes, we definitely want fpm to apply a preprocessor. We also talked about file extensions, and it seems most people would prefer to just stick with .f90, and fpm would apply the preprocessor appropriately (via a compiler option or otherwise).

We probably should support both cpp and fypp. For fypp we should further provide a fast C++ implementation, so that we don't have to depend on Python.

milancurcic commented 4 years ago

:+1: on both cpp and fypp. cpp is the de facto standard and many Fortran projects rely on it. We established earlier that fypp is more powerful than cpp, and thus useful for generating specific procedures like those in stdlib.

> For fypp we should further provide a fast C++ implementation, so that we don't have to depend on Python.

I agree, although this is a non-issue until fpm itself is Fortran or C++. Python ships out of the box on most systems.

certik commented 4 years ago

> I agree, although this is a non-issue until fpm itself is Fortran or C++.

Actually it's an issue for distributing fpm, as we cannot easily integrate fypp into the fpm single binary, so we now have to ship it alongside fpm somehow, etc.

However, since we will eventually use Conda for the non-Fortran dependencies, fypp can just be installed using Conda / Mamba, and then indeed it should become a non-issue.

urbanjost commented 3 years ago

So is there any interest in a Fortran-based version of fpp? Without macro support I think it would be simple to convert pfpp to look like fpp(1), albeit there are variants of fpp(1) to be sure. Adding macros would be considerably more effort. Adding basic templating based on looping or caching and substitution would probably be easier than macros. It would not be as powerful as some, but I am not looking to recreate m4(1). But for basic if/else/elseif/endif preprocessing it works well. Combined with fpm setting a few variables in a standard way like OSTYPE and COMPILER and COMPILER_VERSION it could ship with fpm and only require a Fortran compiler. I am thinking it would just handle user-end preprocessing, not expand the code on the backend, so it would handle "cpp"-like functions and assume the files were already expanded via fypp or whatever beforehand, so that all that remained were .F and .F90 files with #if/#else/#elif/#endif/#ifdef/#ifndef directives.
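To make the "basic conditional preprocessing" idea concrete, here is a minimal sketch in Python (not pfpp, and not anything fpm ships). It handles only #ifdef/#ifndef/#else/#endif, with no macros and no #if expressions. The function name `preprocess` and the define name `LINUX` are hypothetical, chosen for illustration only.

```python
def preprocess(source, defines):
    """Keep or drop lines according to #ifdef/#ifndef/#else/#endif.

    `defines` is a set of names treated as defined. The names are
    hypothetical stand-ins for variables a build tool might set.
    """
    output = []
    stack = []      # one (parent_active, condition) pair per open block
    active = True   # whether the current line should be emitted
    for line in source.splitlines():
        words = line.split()
        directive = words[0] if words else ""
        if directive in ("#ifdef", "#ifndef"):
            if len(words) != 2:
                raise ValueError("malformed directive: " + line)
            condition = words[1] in defines
            if directive == "#ifndef":
                condition = not condition
            stack.append((active, condition))
            active = active and condition
        elif directive == "#else":
            if not stack:
                raise ValueError("#else without matching #ifdef")
            parent, condition = stack[-1]
            active = parent and not condition
        elif directive == "#endif":
            if not stack:
                raise ValueError("#endif without matching #ifdef")
            parent, _ = stack.pop()
            active = parent
        elif active:
            output.append(line)
    if stack:
        raise ValueError("unterminated conditional block")
    return "\n".join(output)
```

Calling `preprocess(text, {"LINUX"})` keeps the `#ifdef LINUX` branch and drops the `#else` branch; nested blocks are emitted only when every enclosing condition holds.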

ivan-pi commented 3 years ago

I definitely think having a Fortran based preprocessor that can be tweaked as desired is appealing. Have you compared your preprocessor to the list of behaviors compiled by the flang developers: https://github.com/llvm/llvm-project/blob/master/flang/docs/Preprocessing.md?

If anyone is interested I also uploaded the Sun Microsystems Fortran preprocessor to GitHub: https://github.com/ivan-pi/fdfpp The original code can be found on netlib. I have made no attempts to run it.

urbanjost commented 3 years ago

Making pfpp support macros and some of the other issues there is more effort than it is worth without stronger interest I suppose. Interesting list. The main problem seemed to be to emulate cpp without the issues that cpp has, and then some added a few features to fix other issues with pre-processing Fortran like resultant line length and others did not and so on. I had run across a few issues with fpp commands not being the same but did not know it varied that much. Looks like I would have hit more if I had not started using my own.

LKedward commented 3 years ago

+1 for a Fortran-based preprocessor; however, as Ivan pointed out, portability is notoriously poor because Fortran preprocessor implementations vary so much. This is the primary reason I avoid preprocessors with Fortran. With that said, I am quite impressed by what can be achieved with fypp.

I wonder whether there is interest in developing a community-agreed standard for preprocessing Fortran. Failure of previous standardisation efforts shouldn't preclude this.

ivan-pi commented 3 years ago

I am also quite satisfied by what can be done with fypp (even if the syntax is quite verbose). However, I am not sure if Ondrej's suggestion to have a Fortran / C++ version of fypp is feasible. At least some elements of the fypp preprocessor language are tightly coupled to Python, e.g. one can call Python functions directly.

Certainly, fpm is an excellent place to experiment with new preprocessing constructs. As long as the intermediate (standard-conforming) ".f90" files are recoverable, one can use it portably across compilers. On a personal level, however, I would rather direct my efforts to other areas first.

ivan-pi commented 3 years ago

Could a plugin-based approach like the one suggested in #211 also be used to implement support for various preprocessors (this issue, #308, #469)?

A list of Fortran preprocessors can be found at the Fortran Wiki. My feeling is that overall the opinions are too divided to settle on any single one of them.

The cpp/fpp preprocessors which are currently the de-facto standard are too limited for some tasks, but still used heavily in practice. The younger "tech-savvy" Fortran users prefer fypp as a more complete solution. On the other hand, in recent discussions at Discourse several users expressed their irritation about having to install Python to use fypp.

Preprocessor plugins could be offered as fpm dependency packages. Then everyone could just use the preprocessor they like best.

The Intel Fortran compiler offers something similar with the -fpp-name compiler flag. This allows users to supply their own custom preprocessors, by using a command like this (I haven't tested it):

ifort -fpp-name=fypp -Qoption,fpp,"--line-length=132" [..]

where the quoted text is the set of options passed to fypp. The -D (define) and -I (include directory) options are forwarded automatically. The compiler then spawns a sub-command with the following signature:

fypp [[-D<define>]..] [[-I<include>]..] --line-length=132 [..]

Output from the preprocessor goes to stdout and is captured for any further processing.
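As a rough illustration of how a build tool could do the same forwarding, the Python sketch below assembles a command with the shape shown above and captures stdout. The helper names `build_command` and `run_preprocessor` are hypothetical, and appending the source path as the last argument is an assumption, since the signature above elides the trailing arguments.

```python
import subprocess


def build_command(preprocessor, defines, include_dirs, extra_options):
    """Assemble: <preprocessor> -D<define>.. -I<include>.. <extra options>."""
    command = [preprocessor]
    command += ["-D" + name for name in defines]
    command += ["-I" + path for path in include_dirs]
    command += list(extra_options)
    return command


def run_preprocessor(command, source_path):
    """Run the command on one file and return what it wrote to stdout.

    Assumption: the tool takes the input file as its last argument and
    writes the preprocessed text to stdout.
    """
    result = subprocess.run(
        command + [source_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

For example, `build_command("fypp", ["DEBUG"], ["include"], ["--line-length=132"])` yields the argument list `["fypp", "-DDEBUG", "-Iinclude", "--line-length=132"]`.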

I know @certik has suggested in a few places that we embed a preprocessor with fpm, but I was wondering lately whether an extensible plugin model fits the diversity of Fortran preprocessing needs better. One negative aspect of this approach is that instead of joining forces to build a single reliable preprocessor that covers most needs, we'd essentially remain with the current pool of (flawed) tools (no offense meant to Fortran preprocessor developers).

awvwgk commented 3 years ago

We should implement a general preprocessing stage in fpm; this would allow native support of custom preprocessors as well. This requires a separate target processing stage before the source parsing to ensure we correctly generate the preprocessed source files as build artifacts. Otherwise our module dependencies might be inaccurate.

There is some demand for this beyond supporting fypp or other custom preprocessors. I recently talked with @robertrueger about supporting the FTL with fpm; there we have to deal with module names that are generated by the C preprocessor, and therefore it currently can't be integrated with fpm.
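A small Python sketch shows why the dependency scan has to run on preprocessed text when module names come from macros. Everything here is hypothetical: `expand_macros` is a toy whole-word substitution standing in for a real preprocessor, `scan` is a simplified regex scan rather than fpm's actual source parser, and the names `MODNAME` and `list_int` are made up.

```python
import re

# Matches "module <name>" on its own line, but not "module procedure <name>".
MODULE_RE = re.compile(r"^[ \t]*module[ \t]+(\w+)[ \t]*$",
                       re.IGNORECASE | re.MULTILINE)
# Matches "use <name>", "use :: <name>" and "use, intrinsic :: <name>".
USE_RE = re.compile(r"^[ \t]*use\b[ \t]*(?:,[ \t]*\w+[ \t]*)?(?:::[ \t]*)?(\w+)",
                    re.IGNORECASE | re.MULTILINE)


def expand_macros(source, macros):
    """Toy stand-in for a preprocessor: whole-word token substitution."""
    if not macros:
        return source
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, macros)) + r")\b")
    return pattern.sub(lambda match: macros[match.group(1)], source)


def scan(source):
    """Return (modules defined, modules used), lower-cased."""
    defined = {name.lower() for name in MODULE_RE.findall(source)}
    used = {name.lower() for name in USE_RE.findall(source)}
    return defined, used
```

Scanning the raw text `module MODNAME` reports a module called `modname`, which no other file will ever `use`; scanning after `expand_macros(text, {"MODNAME": "list_int"})` reports `list_int`, the name the dependency graph actually needs.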

Having a fypp-like preprocessor implemented in a compiled language is a secondary concern IMO, first we have to ensure that fpm is ready to use any preprocessor.

ivan-pi commented 3 years ago

I agree with the need for a general preprocessing stage. The recent post by @urbanjost at Discourse also supports a preprocessing stage which is agnostic to the actual tool used for the source transformation.

A few comments about preprocessing were left in #191. The two options discussed briefly were:

Should preprocessor invocation be tied to the source file extension (e.g. .F or .F90 will use the built in cpp/fpp preprocessor, .fypp gets preprocessed by fypp)?
Should preprocessor usage be mutually exclusive? (one preprocessor per project)

My own answer to both questions would be no; support of different file extensions and their meaning varies greatly between compilers (see #250). I can imagine cases where different preprocessors could be useful for different purposes.

What are the other dimensions/design questions that need to be considered?

In #250 (comment) separation between "built-in" and "external" preprocessors was suggested.
Should it be possible to chain preprocessors? (this doesn't work if the preprocessors share tokens)
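One way to picture the "not tied to the extension, not mutually exclusive, possibly chained" combination is a per-file list of named stages. The Python sketch below is an assumption-laden illustration, not a proposal for fpm's manifest: the registry, the stage names, and the two toy stages are all hypothetical stand-ins for real preprocessors.

```python
from typing import Callable, Dict, List

Stage = Callable[[str], str]


def substitute_version(source: str) -> str:
    """Toy stage: replace a placeholder token."""
    return source.replace("@VERSION@", "1.0")


def strip_debug_lines(source: str) -> str:
    """Toy stage: drop lines that start with a '!DEBUG' marker."""
    kept = [line for line in source.splitlines()
            if not line.lstrip().startswith("!DEBUG")]
    return "\n".join(kept)


# Hypothetical registry of available preprocessors, looked up by name.
REGISTRY: Dict[str, Stage] = {
    "substitute": substitute_version,
    "strip-debug": strip_debug_lines,
}


def preprocess_file(path: str, source: str,
                    config: Dict[str, List[str]],
                    registry: Dict[str, Stage] = REGISTRY) -> str:
    """Apply the stages configured for `path`, in order.

    The choice of stages comes from explicit configuration, not from
    the file extension; files without an entry pass through unchanged.
    """
    for name in config.get(path, []):
        source = registry[name](source)
    return source
```

The two toy stages use disjoint tokens, which is what makes chaining them safe here; with preprocessors that share tokens, the result would depend on the order and could be wrong, as noted above.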

certik commented 3 years ago

I agree not to tie it to the extension. Also, I saw one code on GitHub that uses both fypp and the C preprocessor, so no to the second question as well.

The only issue is with location information and good error messages. I figured out that you can separate the preprocessor but you need to provide mappings, which are just a few 1D arrays that designate intervals in the original and preprocessed code, and you can have a few interval types.

The question then becomes: if fpm does the preprocessing, how can it communicate with the compiler so that the compiler returns good error messages? The #line directives are a good start, but the mapping above is better. We should hand over such a mapping to compilers that support it (e.g., I am happy to accept such a mapping in LFortran), and then the compiler can give good error messages.
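To illustrate what such a mapping could look like, here is a Python sketch using three parallel 1D arrays and two interval types. The layout is an assumption made for illustration (it is not LFortran's actual format), it works on line numbers rather than character offsets, and the names `COPY`, `EXPANDED` and `to_original_line` are hypothetical.

```python
from bisect import bisect_right

COPY = 0      # output lines map one-to-one onto original lines
EXPANDED = 1  # every output line in the interval came from one original line


def to_original_line(out_line, out_start, in_start, kind):
    """Map a line number in preprocessed output back to the original file.

    out_start: sorted first output line of each interval
    in_start:  original line where each interval begins
    kind:      interval type (COPY or EXPANDED) of each interval
    """
    index = bisect_right(out_start, out_line) - 1
    if index < 0:
        raise ValueError("line precedes the first interval")
    if kind[index] == COPY:
        return in_start[index] + (out_line - out_start[index])
    return in_start[index]


# Example: original line 3 is a macro call that expands to output lines 3-6.
OUT_START = [1, 3, 7]
IN_START = [1, 3, 4]
KIND = [COPY, EXPANDED, COPY]
```

With these arrays, an error reported on output line 5 maps back to original line 3 (the macro call), and output line 9 maps back to original line 6.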

Alternatively, the compiler can hand over the error messages to fpm for fpm to display, after remapping it itself. I don't know how much of this stuff fpm should be doing. It seems this is best left to the compiler to display error messages.

ivan-pi commented 3 years ago

Are you still able to find the code that uses both fypp and the built-in preprocessor? Maybe @aradi can comment on this? I kind of doubt there is something the C preprocessor can do that can't be achieved with fypp. But I can imagine some convoluted use of preprocessing preprocessor directives... My second guess would be a scientific code which grows organically over years/decades, and the developers don't have the resources (or need) to go back and refactor it to use a newer preprocessor.

I had not thought about the error issue so far. I kind of like the first idea, of passing the interval mapping to the compiler (maybe a scratch file is more suitable for large files). If the response to the LFortran prototype is good we can start lobbying other compilers to include a similar mechanism (or submit patches). Maybe we can brainstorm further in an issue at the LFortran repo?

certik commented 3 years ago

Sure, here is one example that uses both fypp and cpp: https://github.com/cp2k/dbcsr/blob/cfa17ba0f1c7c9c2ba83f5fa4358f2c9728aaa73/src/acc/dbcsr_acc_hostmem.F#L137

We should brainstorm how the compiler and fpm should cooperate. It's not just the preprocessor, but also the overall structure of the project and how exactly to compile it.

Currently fpm calls the compiler on a file by file basis. A better way is to give the compiler a list of files + compiler flags for each file. For small and medium size projects, the compiler can keep all code in memory and then provide a language server for an IDE, and then only recompile what was changed in the most efficient manner possible. Fpm is best placed to keep track of which files are built with which options (and which C preprocessor defines, to tie it to the present issue...), and to change these when, for example, the user selects a Release mode that triggers a different C preprocessor option, or enables an optional dependency. Fpm should handle all that.

The actual compilation, on the other hand, should be handled by fpm only if the compiler does not want to do it. If the compiler wants to do it, fpm should just hand the information over.
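The "list of files + compiler flags for each file" could be as simple as one record per source file, serialized for the compiler to read. The Python sketch below is only an illustration under stated assumptions: the record layout, the profile names, and the flags and defines in `PROFILES` are hypothetical placeholders, not fpm's or any compiler's actual interface.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List

# Hypothetical build profiles; flags and defines are placeholders only.
PROFILES = {
    "debug": {"flags": ["-g"], "defines": {"DEBUG": "1"}},
    "release": {"flags": ["-O2"], "defines": {}},
}


@dataclass
class CompileUnit:
    source: str              # path of the source file
    flags: List[str]         # compiler flags for this file
    defines: Dict[str, str]  # preprocessor defines for this file


def build_plan(sources: List[str], profile: str) -> List[CompileUnit]:
    """Produce one record per file for the selected profile."""
    settings = PROFILES[profile]
    return [
        CompileUnit(source=path,
                    flags=list(settings["flags"]),
                    defines=dict(settings["defines"]))
        for path in sources
    ]


def plan_to_json(plan: List[CompileUnit]) -> str:
    """Serialize the plan so another tool could consume it."""
    return json.dumps([asdict(unit) for unit in plan], indent=2)
```

Switching the profile from "debug" to "release" regenerates the whole plan with different flags and defines, which is the kind of configuration change the comment above assigns to fpm.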

Regarding preprocessing. A character changes in an IDE (e.g., VSCode). The file must be preprocessed and then the compiler must recompile and give code suggestions or type information back to the IDE.

It makes a lot of sense if the compiler (language server) just handles both, and fpm is only involved if the user explicitly calls it to change some configuration of the project. Thinking about it further, it seems the language server must handle both fpm and the compiler. If fpm handles preprocessing, the language server can first call fpm to preprocess, and hand over the code mapping to the compiler to take it from there. The compiler gives back info, typically with correct location information (using the mapping from fpm), say where to find the definition of a given symbol. The user must communicate with the language server somehow, I would imagine simply by calling fpm in the terminal, and the language server must then be notified somehow. If a cpp option changes, then syntax highlighting must change for the newly enabled/disabled ifdef branches, as must the location information of symbols, etc.

fortran-lang / fpm

Preprocessor support #78