ParaToolsInc / taucmdr

Performance engineering for the rest of us.
http://www.taucommander.com

Fall through PDT parsers #203

Open jlinford opened 7 years ago

jlinford commented 7 years ago

The TAU Performance System is supposed to try all source code parsers before falling back to compiler-based instrumentation, but in practice it usually segfaults before it makes it through the list. TAU Commander could take over this behavior by setting export TAU_OPTIONS='-optPdtF90Parser=PARSER' once for each possible PARSER.
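A minimal sketch of what "for all possible PARSERs" could look like on the TAU Commander side, assuming the candidate list is known up front. The function name and the candidate list are hypothetical; only gfparse and gfparse48 are named in this thread, and the only TAU option used is the -optPdtF90Parser= one quoted above.

```python
import os

# Assumption: the real candidate list would come from the PDT installation.
# Only these two parser names appear in this thread.
CANDIDATE_PARSERS = ["gfparse", "gfparse48"]


def parser_environments(parsers=CANDIDATE_PARSERS, base_env=None):
    """Yield (parser, env) pairs, one environment per candidate parser.

    Any TAU_OPTIONS already present in the base environment are kept and
    the parser selection is appended to them.
    """
    base = dict(os.environ if base_env is None else base_env)
    for parser in parsers:
        env = dict(base)
        option = "-optPdtF90Parser=" + parser
        existing = env.get("TAU_OPTIONS", "").strip()
        env["TAU_OPTIONS"] = (existing + " " + option).strip()
        yield parser, env
```

Each yielded environment can be handed to whatever launches the build, so the parser choice never leaks into the user's own shell.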

zbeekman commented 6 years ago

General comments on parsers and correct handling of Fortran

Some comments on my experience so far with F90 parsers, and considerations for correctly handling parsing of Fortran code

  1. Modern Fortran almost always includes the use of modules. For whatever reason (I do not believe that this is imposed by the standard) source files defining a module produce "binary" .mod files corresponding to each module.
  2. .mod files are NOT portable, even between different versions of the same compiler vendor!!! .mod files produced with GFortran 4.4 likely will not work with GFortran 4.8 etc.
  3. .mod files corresponding to any used modules must be present at compile/parse time!
  4. use mpi (provided by MPI) and use iso_c_binding (intrinsic) can cause the parser to choke, either because the parser is too old to know iso_c_binding or because the parser in question has never compiled/parsed MPI to produce its own .mod file.
  5. gfparse seems much more robust than gfparse48.
  6. The whole notion of fallback parsers is potentially flawed thanks to the lack of portability of .mod files, unless you can try a parser for the entire project and then, if it fails, try the next parser for the entire project. There is little hope of mixing parsers when .mod files are in play. It will likely be OK, or at least more likely to work, for older Fortran (Fortran 77 and earlier), which has no modules.
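The "one parser for the entire project, then the next" idea in point 6 can be sketched roughly as follows. This is an illustration under assumptions, not how TAU or TAU Commander behaves today: the function names are hypothetical, the build and clean commands are supplied by the caller, and the sketch overwrites TAU_OPTIONS rather than merging with existing options.

```python
import os
import subprocess


def _run(cmd, env):
    """Run a command with the given environment and return its exit code."""
    return subprocess.call(cmd, env=env)


def build_whole_project(parsers, build_cmd, clean_cmd, run=_run):
    """Return the first parser that builds the whole project, else None.

    The project is cleaned before every attempt so that .mod files left
    behind by one parser are never read by the next one.
    """
    for parser in parsers:
        # Assumption: overwriting TAU_OPTIONS is acceptable for this sketch.
        env = dict(os.environ, TAU_OPTIONS="-optPdtF90Parser=" + parser)
        run(clean_cmd, env)
        if run(build_cmd, env) == 0:
            return parser
    return None
```

A caller might invoke it as build_whole_project(["gfparse", "gfparse48"], ["make", "all"], ["make", "clean"]), assuming the project's build system provides those targets.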

Partial fixes I have undertaken upstream

Improvements I have undertaken upstream in TAU & PDT:

  1. To address the lack of portability, and the requirement that dependency .mod files are present for parsing to succeed, the PDT_MOD_DIR needs to be namespaced by parser. I.e. the Fortran parsers should put .mod files in a private subdirectory of the temporary directory. This way they are not clobbered by a fallback parser, and the parsers won't try to mix .mod files created by different parsers.
  2. When a selective instrumentation file uses a source file whitelist via BEGIN_FILE_INCLUDE_LIST, TAU must still run the parser over any file declaring a module, because the parser may need the corresponding .mod file to parse a file declared in the FILE_INCLUDE_LIST.
  3. If a user passes -optPdtF90Parser= to TAU then it probably should not try to fall back on other parsers, since the user explicitly requested a certain parser.
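Points 1 and 3 can be illustrated with a small sketch. It is not the upstream TAU/PDT change itself; the function names and the "pdt_mod" directory name are hypothetical and only show the shape of the idea.

```python
import os


def mod_dir_for(parser, tmp_root):
    """Return a private .mod directory for one parser, creating it if needed.

    Each parser gets its own subdirectory, so a fallback parser can
    neither clobber nor read .mod files written by another parser.
    """
    # "pdt_mod" is an illustrative directory name, not one used by PDT.
    path = os.path.join(tmp_root, "pdt_mod", parser)
    os.makedirs(path, exist_ok=True)
    return path


def parsers_to_try(default_order, explicit=None):
    """An explicitly requested parser disables fallback entirely."""
    return [explicit] if explicit else list(default_order)
```

Keying the directory on the parser name is the simplest namespacing scheme; since .mod files are not portable across compiler versions either, a real implementation might also want the version in the key.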

TAU Commander specific discussion/thoughts/recommendations

For fallback parsing to succeed, we really need to have all the parsers run on all the source files defining a module. Otherwise it is extremely likely that the fallback parser won't have the prerequisite .mod files it requires, or the original parser won't have .mod files it needs since some were generated by a fallback parser.

Unless you run make all once for each parser on your entire project, and can somehow mix the correct .pdb files generated by each pass, I don't see a good way to implement fallback parsing other than running every parser on every Fortran source file that declares a module.
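Running every parser on every module-declaring file first requires finding those files. A rough heuristic, assuming free-form source and ignoring preprocessing, continuation lines, and more exotic prefixes such as "module pure function"; the function names are hypothetical:

```python
import re

# Matches "module <name>" at the start of a line. Lines such as
# "end module foo", "module procedure bar", "module function f(x)" and
# "module subroutine s()" are deliberately not treated as module definitions.
_MODULE_RE = re.compile(
    r"^[ \t]*module[ \t]+(?!procedure\b)(?!function\b)(?!subroutine\b)([a-z]\w*)",
    re.IGNORECASE | re.MULTILINE,
)


def declared_modules(source_text):
    """Return the lower-cased names of modules defined in the given source."""
    return [m.group(1).lower() for m in _MODULE_RE.finditer(source_text)]


def module_sources(paths):
    """Return the subset of paths whose contents define at least one module."""
    selected = []
    for path in paths:
        with open(path, "r", errors="replace") as handle:
            if declared_modules(handle.read()):
                selected.append(path)
    return selected
```

The files returned by module_sources are the ones each candidate parser would need to process, whether or not they appear in a selective instrumentation include list.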

I have had pretty good luck with gfparse and I think it is worth considering making this PDT/TAU's default go-to parser over gfparse48.

zbeekman commented 6 years ago

The more I think about this, the more I think that one or both of the following steps are required to resolve this in a robust way:

  1. Significant robustness improvements need to be made upstream (TAU & PDT)
  2. The parser should be an application property

In general, because Fortran 90 .mod files introduce compile/parse time dependencies, you can't simply hot-swap parsers for different source files that are part of the same code. I'd be curious what others think. (CC: @jlinford @khsa1 @nchaimov)