ESCOMP / CTSM

Community Terrestrial Systems Model (includes the Community Land Model of CESM)
http://www.cesm.ucar.edu/models/cesm2.0/land/
Other
308 stars 311 forks source link

Move critical toolchain scripts out of tools/contrib and their guts into the python subdirectory and unit tests added #1441

Closed ekluzek closed 7 months ago

ekluzek commented 3 years ago

We've introduced some critical single-point toolchain scripts at this point, and the top level scripts should be moved as a skeleton wrapper out of the tools/contrib directory to the main tools directory in a single_point subdirectory. The NEON scripts should be in the same directory with a _neon ending in the name. The guts should go into python modules that are under the python subdirectory. Since the guts are outside in the python subdirectory, there seems to be no reason to have subdirectories for each of these tools, and it's OK to have some duplication between the small top level skeleton scripts≥

There should also be unit tests added to important logic for the python modules in the python subdirectory. The unit tests should be connected to the python subdirectory unit-testing structure and be triggered with the other unit-tests that are there.

The top level tools/README file should also be updated to talk about how these new scripts fit into the workflow.

The system tools/tests should also continue to work for these tools in their new directory.

The first scripts that need to be moved are:

We talked about this at out July 22nd/2021 software meeting.

Definition of Done:

ekluzek commented 3 years ago

The other important script in tools/contrib that I think we need some level of support for is this one:

modify_singlept_site

But, I think it probably needs to be joined to the python module created for modify_singlept_site_neon.py and reworked for that to make sense.

ekluzek commented 3 years ago

And the users-guide should be updated to talk about these new scripts. I think @swensosc added something on this for his first version of the script, but it should be updated for the new one, and added as an alternative workflow.

wwieder commented 3 years ago

@ekluzek I agree, if how you're suggesting we move these files and directories around.
Ideally the modify_singlept_site and modify_singlept_site_neon would be more similar (or merged), but this is currently not a high priority activity for the NEON project or the work @negin513 is leading.

@negin513, @danicalombardozzi and I will all work to improve the user guide related to the new scripts that have been created for NEON in the next few months.

wwieder commented 3 years ago

To clarify my comments today.

danicalombardozzi commented 3 years ago

Yes, I agree that having the single point functionality in a single directory is a nice feature, and I think the naming convention @wwieder proposed seems good. It would be great to make sure that the location & name of the directory is clear to users.

ekluzek commented 3 years ago

I updated to the top level description to be what we've decided here. I'm proposing that "tools/single_point" be the name of the sub-directory for these scripts.

wwieder commented 3 years ago

This seems appropriate, @ekluzek . Out of curiosity, where are we putting subset_data? It looks like you're suggesting in _tools/singlepoint?

To be clear, I'm supportive of this option because it's the first part of creating a surface dataset for a single_point script... BUT I but I wondered if others agreed since subset_data can also be used for regional cases and it's going to be an important part of the tool_chain?

ekluzek commented 3 years ago

Yes, I'm suggesting subset_data goes into tools/single_point. But, maybe the directory should be called single_point_and_regional? And everything that revolves around regional or single-point goes into there. There are two critical paths that have to do with creation of inputdatasets, one is the regional workflow (that we are talking about here) that's based on starting with existing datasets, and the other is the creation of surface datasets from scratch for global (or also in some case regional as well). Both of those paths are important for the support of all the things that people need to do.

danicalombardozzi commented 3 years ago

I agree @ekluzek that perhaps the directory name should be changed to reflect that it also contains scripts for regional cases. Would it be more expedient to call it "site_and_regional"? It might also help to have the script names specify whether they are for site or regional. Ultimately, I think we should include a ReadMe document in this directory can help to explain the purpose of each script.

ekluzek commented 3 years ago

@danicalombardozzi "site_and_regional" works for me.

As a first step to do this I'll move some of the existing scripts around (including PTCLM which is deprecated) and get a start to the documentation. @negin513 will add to that with the new scripts and functionality. At some point we'll also completely remove PTCLM.

billsacks commented 3 years ago

@ekluzek - I think this was mistakenly closed. You referenced this issue in 6957418774f31c1061f5cbff8e729c432045556f, but I think that was a mistake. Maybe you meant to mention a different issue?

ekluzek commented 3 years ago

Yes, you are correct @billsacks. I referenced this to say I did partial work on it, but I didn't think it would auto-close unless you said "fixed" in front of the issue number. It looks like it will auto-close in that case, so I need to find a different way to reference partial work on an issue.

billsacks commented 3 years ago

@ekluzek I think the problem was just that you referenced the wrong issue in 6957418774f31c1061f5cbff8e729c432045556f (referencing it in the PR was fine, I think).

ekluzek commented 2 years ago

1461 does this for subset_data. We should also do this for modify_singlept_site_neon.py and run_neon.py for sure. run_neon.py, especially because we would like to have a base_class for run scripts, and then a NEON version and a generic tower site as well as run-regional version.

neon_s3_upload and neon_surf_wrapper.py are both small, but should likely be done as well (moving it inside our python directory makes pylint, black, and unit-testing available).

ekluzek commented 1 year ago

I added a definition of done about this one. I think that we shouldn't worry about modify_singlept_site_neon, but it would be best to do this for run_neon. @wwieder and @TeaganKing and I talked about this some yesterday. We think this could be a good project for @TeaganKing to learn about python unit and system testing.

wwieder commented 1 year ago

Now duplicated in #1906, I'm closing this issue

ekluzek commented 1 year ago

Reopening this for the last task of moving run_neon.py into the python directory. Issue #1906 really doesn't duplicate this one, it's a different concern.

ekluzek commented 1 year ago

From our planning meeting we decided that @TeaganKing and @slevis-lmwg would work on this for the hackathon. From reading the above, and our discussion today, it seems like both run_neon.py, modify_singlept_site_neon.py, and neon_surf_wrapper.py should have the same treatment done. So I'm adding them to the definition of done above.

@TeaganKing and @slevis-lmwg if you want I think you could get started on this before Wednesday. But, you should just decide if and how you want to do that. Part of the purpose of the Hackathon is to have all of us in the same room so we can work with each other as questions come up. But, it still might make sense to do some prep work, or a little bit of work to get your feet wet, so you can start off right away on Wednesday.

TeaganKing commented 1 year ago

In working on this issue, we also noticed that the help comment for --inputdata-dir within modify_singlept_site_neon may be incorrect. It suggests that this directory is for writing to rather than reading from. This is used to pull standard input files from CESM input data such as the surf_soildepth_file.

TeaganKing commented 1 year ago

During the hackathon today, we moved toolchain scripts (run_neon, neon_surf_wrapper, and modify_singlept_site_neon) and created wrappers for them. We updated the code formatting with black. We also ensured that the scripts were working in the same manner in which they were previously working. More thorough testing routines are still needed.

TeaganKing commented 1 year ago

We will also need to move the new script from Will and Keith's PR into the python subdirectory.

ekluzek commented 7 months ago

This has already been done by the work that @TeaganKing and @slevis-lmwg did. I also marked off the PLUMBER2 part of this as that is coming off as an expansion of that work.

So closing...