Closed ekluzek closed 7 months ago
The other important script in tools/contrib that I think we need some level of support for is this one:
modify_singlept_site
But, I think it probably needs to be joined to the python module created for modify_singlept_site_neon.py and reworked for that to make sense.
And the users-guide should be updated to talk about these new scripts. I think @swensosc added something on this for his first version of the script, but it should be updated for the new one, and added as an alternative workflow.
@ekluzek I agree, if how you're suggesting we move these files and directories around.
Ideally the modify_singlept_site
and modify_singlept_site_neon
would be more similar (or merged), but this is currently not a high priority activity for the NEON project or the work @negin513 is leading.
@negin513, @danicalombardozzi and I will all work to improve the user guide related to the new scripts that have been created for NEON in the next few months.
To clarify my comments today.
modify_singlept_site
and modify_singlept_site_neon
), but keep them in the same directoryrun_singlept
or run_singlept_neon
if it's specific to NEON.Yes, I agree that having the single point functionality in a single directory is a nice feature, and I think the naming convention @wwieder proposed seems good. It would be great to make sure that the location & name of the directory is clear to users.
I updated to the top level description to be what we've decided here. I'm proposing that "tools/single_point" be the name of the sub-directory for these scripts.
This seems appropriate, @ekluzek . Out of curiosity, where are we putting subset_data
? It looks like you're suggesting in _tools/singlepoint?
To be clear, I'm supportive of this option because it's the first part of creating a surface dataset for a single_point script... BUT I but I wondered if others agreed since subset_data
can also be used for regional cases and it's going to be an important part of the tool_chain?
Yes, I'm suggesting subset_data goes into tools/single_point. But, maybe the directory should be called single_point_and_regional? And everything that revolves around regional or single-point goes into there. There are two critical paths that have to do with creation of inputdatasets, one is the regional workflow (that we are talking about here) that's based on starting with existing datasets, and the other is the creation of surface datasets from scratch for global (or also in some case regional as well). Both of those paths are important for the support of all the things that people need to do.
I agree @ekluzek that perhaps the directory name should be changed to reflect that it also contains scripts for regional cases. Would it be more expedient to call it "site_and_regional"? It might also help to have the script names specify whether they are for site or regional. Ultimately, I think we should include a ReadMe document in this directory can help to explain the purpose of each script.
@danicalombardozzi "site_and_regional" works for me.
As a first step to do this I'll move some of the existing scripts around (including PTCLM which is deprecated) and get a start to the documentation. @negin513 will add to that with the new scripts and functionality. At some point we'll also completely remove PTCLM.
@ekluzek - I think this was mistakenly closed. You referenced this issue in 6957418774f31c1061f5cbff8e729c432045556f, but I think that was a mistake. Maybe you meant to mention a different issue?
Yes, you are correct @billsacks. I referenced this to say I did partial work on it, but I didn't think it would auto-close unless you said "fixed" in front of the issue number. It looks like it will auto-close in that case, so I need to find a different way to reference partial work on an issue.
@ekluzek I think the problem was just that you referenced the wrong issue in 6957418774f31c1061f5cbff8e729c432045556f (referencing it in the PR was fine, I think).
neon_s3_upload and neon_surf_wrapper.py are both small, but should likely be done as well (moving it inside our python directory makes pylint, black, and unit-testing available).
I added a definition of done about this one. I think that we shouldn't worry about modify_singlept_site_neon, but it would be best to do this for run_neon. @wwieder and @TeaganKing and I talked about this some yesterday. We think this could be a good project for @TeaganKing to learn about python unit and system testing.
Now duplicated in #1906, I'm closing this issue
Reopening this for the last task of moving run_neon.py into the python directory. Issue #1906 really doesn't duplicate this one, it's a different concern.
From our planning meeting we decided that @TeaganKing and @slevis-lmwg would work on this for the hackathon. From reading the above, and our discussion today, it seems like both run_neon.py, modify_singlept_site_neon.py, and neon_surf_wrapper.py should have the same treatment done. So I'm adding them to the definition of done above.
@TeaganKing and @slevis-lmwg if you want I think you could get started on this before Wednesday. But, you should just decide if and how you want to do that. Part of the purpose of the Hackathon is to have all of us in the same room so we can work with each other as questions come up. But, it still might make sense to do some prep work, or a little bit of work to get your feet wet, so you can start off right away on Wednesday.
In working on this issue, we also noticed that the help comment for --inputdata-dir within modify_singlept_site_neon may be incorrect. It suggests that this directory is for writing to rather than reading from. This is used to pull standard input files from CESM input data such as the surf_soildepth_file.
During the hackathon today, we moved toolchain scripts (run_neon, neon_surf_wrapper, and modify_singlept_site_neon) and created wrappers for them. We updated the code formatting with black. We also ensured that the scripts were working in the same manner in which they were previously working. More thorough testing routines are still needed.
We will also need to move the new script from Will and Keith's PR into the python subdirectory.
This has already been done by the work that @TeaganKing and @slevis-lmwg did. I also marked off the PLUMBER2 part of this as that is coming off as an expansion of that work.
So closing...
We've introduced some critical single-point toolchain scripts at this point, and the top level scripts should be moved as a skeleton wrapper out of the tools/contrib directory to the main tools directory in a single_point subdirectory. The NEON scripts should be in the same directory with a _neon ending in the name. The guts should go into python modules that are under the python subdirectory. Since the guts are outside in the python subdirectory, there seems to be no reason to have subdirectories for each of these tools, and it's OK to have some duplication between the small top level skeleton scripts≥
There should also be unit tests added to important logic for the python modules in the python subdirectory. The unit tests should be connected to the python subdirectory unit-testing structure and be triggered with the other unit-tests that are there.
The top level tools/README file should also be updated to talk about how these new scripts fit into the workflow.
The system tools/tests should also continue to work for these tools in their new directory.
The first scripts that need to be moved are:
We talked about this at out July 22nd/2021 software meeting.
Definition of Done: