dtcenter / MET

Model Evaluation Tools
https://dtcenter.org/community-code/model-evaluation-tools-met
Apache License 2.0
77 stars 24 forks source link

Feature 2717 convert unit.pl to unit.py #2871

Closed natalieb-noaa closed 3 months ago

natalieb-noaa commented 5 months ago

Expected Differences

Pull Request Testing

Pull Request Checklist

See the METplus Workflow for details.

georgemccabe commented 5 months ago

I made a few fixes in your branch to fix a few errors I encountered. There were a few minor inconsistencies in a few of the XML files (exists used instead of exist, test_dir not included) that were just ignored in the perl version but caused a crash in the python version. This is good because it caught some bugs where tests weren't doing what they were supposed to do!

UPDATE: I found/fixed the issue. There was a tab at the end of the line before the output file line in the XML file.

I ran into a weird issue that I can't figure out what is happening. When I run:

./python/unit.py ./xml/unit_ref_config_lead_12.xml

TEST: pcp_combine_wrf_3hr_09_12 fails. I see that the command is output correctly, but the log from MET shows that the output file path is not included.

natalieb-noaa commented 5 months ago

This looks great! I made a few suggestions in-line. I have not yet run this locally, but I plan to do that shortly. Also, I think it would be good to update the automated tests to run using this new version so we can review any differences through the GitHub Actions workflow run for this PR. I can make those changes to this branch.

Thanks for the feedback, updates, and fixes! I'll take a look at your comments and commits...

And if it'll be quick for you to update the automated tests to use this, please go ahead and do that! (EDIT: just saw that you already did, thanks!)

JohnHalleyGotway commented 4 months ago

I'm sorry for the very long delay on this.

I tried out these commands today on seneca in /d1/projects/MET/MET_pull_requests/met-12.0.0/beta5/MET-feature_2717_convert_unit.pl_to_unit.py.

  1. There's a very minor difference in logging: OLD with unit.pl reports status on the same line:

    TEST: ascii2nc_TRMM_3hr            - pass -  13.912 sec

    NEW with unit.py puts them on separate lines:

    TEST: ascii2nc_TRMM_3hr
    - pass -    13.983 sec

    This seldom used regression_runtimes.ksh script assumes the former. I looked into the logger terminator to omit the newline, but it's probably better to just update the logic of that runtimes script at some point or replace it with a Python version.

  2. The PERL version accepts multiple input xml files to run:

    perl/unit.pl xml/unit_ascii2nc.xml xml/unit_pb2nc.xml

    The Python version fails with a parsing error:

    python/unit.py -log py.log xml/unit_ascii2nc.xml xml/unit_pb2nc.xml
    usage: unit.py [-h] [-log log_file] [-cmd] [-memchk] [-callchk] [-noexit] test_xml
    unit.py: error: unrecognized arguments: xml/unit_pb2nc.xml

    Can this logic be updated to support multiple XML input files in a single call?

  3. I tested the -cmd, -memchk, -callchk, and -noexit options and they all work as I'd expect.

  4. The -log option also work. However, I note that if the log file already exists, it is appended rather than clobbered. I can't really decide what I prefer. I'm used to MET clobbering it's output files, including log files, rather than appending to them.

How do you think this should behave? Should we be appending or replacing?

natalieb-noaa commented 4 months ago

Thanks for the feedback, @JohnHalleyGotway !

  1. Right. I could use print statements instead of logging if we wanted the status output to look the same. Using logging feels like better programming practice to me (more robust and flexible) as this library continues to get updated, but I wasn't sure if the format change would be a problem for anyone. If this change only causes a problem for a seldom-used script, then I'd be inclined to keep it as is and just update that script, as you suggested.

  2. Good catch! That didn't even occur to me. I should be able to update it to allow more than one input xml file.

  3. I think overwriting the log files would be reasonable. I can't think of a reason to continually append, especially if that's not what you're used to. Also, if we keep appending, then we'll have to eventually handle the files getting too large. I'll make the change to overwrite the log files.

natalieb-noaa commented 3 months ago

@JohnHalleyGotway I made the changes you suggested: