Add script to check fortran vs metadata without a host model

peverwhee commented 7 months ago

Summary

Adds a new script that will compare all fortran and metadata files within a directory you provide (recursively).

Description

scripts/fortran_tools/offline_check_fortran_vs_metadata.py: the script in question. Surprisingly simple script that finds the metadata files in the directory provided and calls existing capgen functions to parse and compare to the relevant fortran files.
scripts/ccpp_capgen.py: updates to (1) also return an error if the fortran specifies that a variable is optional, but there is no corresponding "optional = true" metadata field for that variable and (2) allow a user to provide known_ddts to parse_scheme_files() so I don't have to worry about the constituent DDT in my offline script
test/capgen_test/temp_adjust.F90: above change broke test, removed optional attribute from errflg variable

User interface changes?: No

BUT: here's how you run the new script:

./offline_check_fortran_vs_metadata.py --directory <path_to_scheme_file_directory> (--debug)

Caveat(s)/Notes

One thing is that, as it stands, the fortran/metadata parsing and comparison functions raise exceptions within them, so you'll only get errors for one file (and often one issue) at a time if you're running this on a large swath of files. It's already on my to-do list to improve/streamline error handling during parsing/analysis, so that will help this in the future!

The script currently returns "All checks passed" if it gets through the parsing/comparison. This can be changed if needed!

climbfuji commented 7 months ago

Is this test supposed to pass on our test files? If I try it in capgen_test or advection_test, I get something like:
Unknown DDT type, physics_state, at test/advection_test/test_host_mod.meta:52
This seems suboptimal since that DDT is defined in the tested metadata files.

The idea was to be able to run this on any ccpp scheme without relying on host model information. Simply compare scheme metadata with Fortran code. Therefore it's not surprising to me that the DDT isn't found and this complains. As long as these expected "errors" don't halt the execution of the script I think this is fine (maybe we can suppress that - I haven't tried the script myself yet, but I will today because I need it for the ccpp_prebuild optional argument implementation PRs).

gold2718 commented 7 months ago

Is this test supposed to pass on our test files? If I try it in capgen_test or advection_test, I get something like:
Unknown DDT type, physics_state, at test/advection_test/test_host_mod.meta:52
This seems suboptimal since that DDT is defined in the tested metadata files.
The idea was to be able to run this on any ccpp scheme without relying on host model information. Simply compare scheme metadata with Fortran code. Therefore it's not surprising to me that the DDT isn't found and this complains. As long as these expected "errors" don't halt the execution of the script I think this is fine (maybe we can suppress that - I haven't tried the script myself yet, but I will today because I need it for the ccpp_prebuild optional argument implementation PRs).

I agree with the sentiment, however, in this case, the DDT is defined in the metadata that is being scanned, it was just not scanned before a file that used it. And after all, isn't host model metadata something that can be scanned with this tool? Given that this is not part of the mission of this script, I think it would be great to suppress warnings that are not relevant to the checks being performed.

dustinswales commented 7 months ago

@peverwhee This is great. It goes beyond comparing just the attributes in .Fortran/.meta files and does all of the "necessary Capgen compliance testing" (e.g. argument order checking, routines defined in both .F90/.meta, etc...). You could say this script is Capgen-lite, using just the scheme files. This would be a great thing to add to the ccpp-physics repo(s).

I ran this on the files in https://github.com/climbfuji/ccpp-physics/pull/19, and couldn't get to the part comparing the attributes w/o commenting out a couple lines in _check_fortran_againstmetadata and _compare_fheader_to_mheader, which is just because these scheme files still need some cleanup for Capgen.

peverwhee commented 7 months ago

Is this test supposed to pass on our test files? If I try it in capgen_test or advection_test, I get something like:
Unknown DDT type, physics_state, at test/advection_test/test_host_mod.meta:52
This seems suboptimal since that DDT is defined in the tested metadata files.
The idea was to be able to run this on any ccpp scheme without relying on host model information. Simply compare scheme metadata with Fortran code. Therefore it's not surprising to me that the DDT isn't found and this complains. As long as these expected "errors" don't halt the execution of the script I think this is fine (maybe we can suppress that - I haven't tried the script myself yet, but I will today because I need it for the ccpp_prebuild optional argument implementation PRs).
I agree with the sentiment, however, in this case, the DDT is defined in the metadata that is being scanned, it was just not scanned before a file that used it. And after all, isn't host model metadata something that can be scanned with this tool? Given that this is not part of the mission of this script, I think it would be great to suppress warnings that are not relevant to the checks being performed.

Ooh an interesting question. I think there's some complexity here because the script is meant to only parse "scheme" files and assumes there aren't "host" files present (as would be the case in a physics directory like ccpp-physics or atmospheric_physics). But there's no way I can think of to tell which files are host files (in order to skip or parse them). I could add an argument so the user could specify which files are host files and then deal with them somehow that way?

peverwhee commented 7 months ago

Is this test supposed to pass on our test files? If I try it in capgen_test or advection_test, I get something like:
Unknown DDT type, physics_state, at test/advection_test/test_host_mod.meta:52
This seems suboptimal since that DDT is defined in the tested metadata files.
The idea was to be able to run this on any ccpp scheme without relying on host model information. Simply compare scheme metadata with Fortran code. Therefore it's not surprising to me that the DDT isn't found and this complains. As long as these expected "errors" don't halt the execution of the script I think this is fine (maybe we can suppress that - I haven't tried the script myself yet, but I will today because I need it for the ccpp_prebuild optional argument implementation PRs).
I agree with the sentiment, however, in this case, the DDT is defined in the metadata that is being scanned, it was just not scanned before a file that used it. And after all, isn't host model metadata something that can be scanned with this tool? Given that this is not part of the mission of this script, I think it would be great to suppress warnings that are not relevant to the checks being performed.
Ooh an interesting question. I think there's some complexity here because the script is meant to only parse "scheme" files and assumes there aren't "host" files present (as would be the case in a physics directory like ccpp-physics or atmospheric_physics). But there's no way I can think of to tell which files are host files (in order to skip or parse them). I could add an argument so the user could specify which files are host files and then deal with them somehow that way?

I have now read the most recent reply from @gold2718 more thoroughly! I can indeed update the script/capgen to allow for ignoring the ddt check since those can come from the host (or I guess a scheme that has yet to be parsed).

gold2718 commented 7 months ago

I agree with the sentiment, however, in this case, the DDT is defined in the metadata that is being scanned, it was just not scanned before a file that used it. And after all, isn't host model metadata something that can be scanned with this tool? Given that this is not part of the mission of this script, I think it would be great to suppress warnings that are not relevant to the checks being performed.

Ooh an interesting question. I think there's some complexity here because the script is meant to only parse "scheme" files and assumes there aren't "host" files present (as would be the case in a physics directory like ccpp-physics or atmospheric_physics). But there's no way I can think of to tell which files are host files (in order to skip or parse them). I could add an argument so the user could specify which files are host files and then deal with them somehow that way?

I have now read the most recent reply from @gold2718 more thoroughly! I can indeed update the script/capgen to allow for ignoring the ddt check since those can come from the host (or I guess a scheme that has yet to be parsed).

Right. This has nothing to do with host vs. scheme. For instance, in capgen_test, there is the vmr_type ddt which is part of ddt_suite.xml (so a DDT in a scheme file).

peverwhee commented 7 months ago

@gold2718 Updated to skip the ddt checking from the offline script

peverwhee commented 7 months ago

@peverwhee This is great. It goes beyond comparing just the attributes in .Fortran/.meta files and does all of the "necessary Capgen compliance testing" (e.g. argument order checking, routines defined in both .F90/.meta, etc...). You could say this script is Capgen-lite, using just the scheme files. This would be a great thing to add to the ccpp-physics repo(s).

I ran this on the files in climbfuji/ccpp-physics#19, and couldn't get to the part comparing the attributes w/o commenting out a couple lines in _check_fortran_againstmetadata and _compare_fheader_to_mheader, which is just because these scheme files still need some cleanup for Capgen.

Yeah, I was worried about that. I can add a flag to skip order-checking?

mkavulich commented 6 months ago

@gold2718 Did you still have unresolved concerns on this PR?

climbfuji commented 6 months ago

@gold2718 Did you still have unresolved concerns on this PR?

@mkavulich I would also like to review this - please give me a few more days. Apologies for the delay!

peverwhee commented 6 months ago

Merged into #549

NCAR / ccpp-framework