Call MDTF from ADF, and allow list of hist_str to be processed for timeseries

bitterbark commented 5 months ago

Call MDTF from ADF

This adds a section in the config and run files so the ADF can directly call the MDTF (Model Diagnostics Task Force (MDTF)-Diagnostics package) .

This is the simplest possible implementation that calls an existing, installed version of the MDTF using existing, installed conda envs. For nowl, timeseries files will be copied to the requested MDTF directory and file names. Imminent data catalogue functionality in MDTF will change this.

Basic documentation of new functionality: in config.yaml, there is a new section:

diag_mdtf_info:

Run the MDTF on the model cases
mdtf_run: false

If mdtf_run:true:

run_adf_diag

Call the MDTF:
if diag.get_mdtf_info('mdtf_run'):
   diag.setup_run_mdtf()

[Detailed documentation of implementation/call tree] (https://www2.cgd.ucar.edu/cms/bundy/Projects/diagnostics/mdtf/notes/run_mdtf_from_adf.html#Documentation) More documentation for MDTF

Allow hist_str to be a list The config.yaml file variable can now also be a list: hist_str: [cam.h2,cam.h0]

These timeseries files are all placed in the same ts/ directory, which may need to be modified for multiple component hist strings. (The MDTF checks file types for time frequency for its own purposes).

bitterbark commented 4 months ago

@nusbaume I'm trying to fix the errors/warnings but I'm confused by the very first

lib/adf_diag.py:373:8: W0601: Global variable 'call_ncrcat' undefined at the module level (global-variable-undefined)

There is no between mine and the upstream repo.

def create_time_series(self, baseline=False):
    """                                                                                                                             
    Generate time series versions of the CAM history file data.                                                                     
    """

    global call_ncrcat

    def call_ncrcat(cmd):

Do I need to do something?

bitterbark commented 2 months ago

Github and I did not get along very well through this process, so in order to remove the final conflicts, I had to use the web version of the file. This might have been better if it hadn't involved indentations, but I did what I could. However, I was unable to test it. (I merged the code over on my branch but again, something went wonky, so I couldn't even just throw in a new commit). So the problem is: I literally haven't been able to run this code to test it before hitting commit.

Another issue: the reformatting I had done in order to pass the linting tests was sometimes thrown out in the merge. I guess Justin's code is so clean that he can get away with these problems? I am reluctant to delay getting this in any longer as re-merging can take a week of my time. Can I get a pass on those?

nusbaume commented 2 months ago

Hi @bitterbark, it looks like the Github test failure is due to an actual python syntax error:

lib/adf_diag.py:491:20: E0001: Parsing failed: 'unindent does not match any outer indentation level (adf_diag, line 491)' (syntax-error)

So we'll have to fix it in order for this PR to be accepted (otherwise the ADF itself would break).

After that do you think this PR is ready for a re-review? If so I'll check it over and hopefully have @justin-richling run a test to make sure it runs as expected. If everything goes well we can hopefully get this PR merged in soon!

bitterbark commented 2 months ago

Unfortnately the linter found an indent error. I am going back and merging by hand. Will update with a fresh pull request.

bitterbark commented 2 months ago

Closing this in lieu of new PR

NCAR / ADF

Call MDTF from ADF, and allow list of hist_str to be processed for timeseries #293

Run the MDTF on the model cases

Call the MDTF: