NOAA-PSL / observation-archive

Tools to assemble observational archive record
Apache License 2.0

add yaml config and cycling through dates and sources #7

Closed frolovsa closed 2 years ago

frolovsa commented 2 years ago

This adds a simple script driver that allows cycling through times and sources specified in the yaml file.
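For readers without the diff at hand, here is a minimal sketch of what cycling through times and sources could look like. It is not the code from this PR: the config keys (`dates`, `sources`, `step_hours`), the source names, and the helper functions are all hypothetical, and the dictionary stands in for what `yaml.safe_load` would return from the yaml file.

```python
from datetime import datetime, timedelta
from itertools import product

# Hypothetical config, shaped like what yaml.safe_load() might return.
# The key names and source names are assumptions, not taken from the PR.
CONFIG = {
    "dates": {"start": "2020-01-01T00", "end": "2020-01-02T00", "step_hours": 6},
    "sources": ["source_a", "source_b"],
}


def iter_cycles(dates):
    """Yield datetimes from start to end (inclusive) every step_hours hours."""
    fmt = "%Y-%m-%dT%H"
    current = datetime.strptime(dates["start"], fmt)
    end = datetime.strptime(dates["end"], fmt)
    step = timedelta(hours=dates["step_hours"])
    while current <= end:
        yield current
        current += step


def iter_tasks(config):
    """Yield every (cycle time, source) pair described by the config."""
    return product(iter_cycles(config["dates"]), config["sources"])


if __name__ == "__main__":
    for when, source in iter_tasks(CONFIG):
        print(f"{when:%Y%m%d%H} {source}")
```

Keeping the iteration in small functions like these would also make it easy to call from a test later.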

Question to reviewers: this is written in very simple Python. Since this is my first medium-size Python project, I decided to keep it simple. We can add classes or a CLI later as needed. However, if you have suggestions on what would be appropriate at this time, I would be happy to incorporate them.

Also, this code uses utils from one of Steve Lawrence's projects. It has a nice interface that allows specifying date ranges.

closes #2

jswhit commented 2 years ago

@frolovsa this looks good - not bad for your first stab at python! I guess I'm still a bit unclear on how we will actually use this though - but that will come once we have some data there. Maybe we can add some github actions to exercise the features? (don't know if github actions can access s3)

frolovsa commented 2 years ago

@jswhit thank you. Ideally we would add a pytest that would copy a few files and then check that that was successful. At that point it would make sense to re-write my main script as a set of functions that can be called by the pytest.
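A rough sketch of that idea, under loud assumptions: `copy_files` is a hypothetical function name, and a temporary local directory stands in for the S3 locations, so this only illustrates the "copy a few files, then check" pattern rather than the real transfer code.

```python
import shutil
import tempfile
from pathlib import Path


def copy_files(src_dir, dst_dir, names):
    """Copy the named files from src_dir to dst_dir and return the new paths.

    Hypothetical helper: local directories stand in for the real archive.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(Path(src_dir) / name, dst / name)) for name in names]


def test_copy_files():
    """pytest-discoverable check that a few files arrive intact."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        src.mkdir()
        names = ["a.txt", "b.txt"]
        for name in names:
            (src / name).write_text(f"payload {name}")

        copied = copy_files(src, Path(tmp) / "dst", names)

        assert [path.name for path in copied] == names
        for path in copied:
            assert path.read_text() == f"payload {path.name}"
```

Because the test only calls a function, the main script can stay a thin driver that imports and calls the same function.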

HenryRWinterbottom commented 2 years ago

> @jswhit thank you. Ideally we would add a pytest that would copy a few files and then check that that was successful. At that point it would make sense to re-write my main script as a set of functions that can be called by the pytest.

@frolovsa A simple unit test that I have implemented for the UFS-RNR YAML reader uses a static YAML file -- something with several nested levels of foo and bar type entries -- and a Python dictionary matching what pyyaml produces from it. Then you can just assert that the two dictionaries are identical. Plain dict comparison ignores key order; if you also want the ordering checked, use collections.OrderedDict, in which case the assert fails when the orderings differ.

Otherwise this looks good. Approving.
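On the ordering point in the comment above, a small stdlib-only sketch may help. The nested foo/bar dictionary is a hypothetical stand-in for what a YAML reader would return from the static test file; no actual YAML is parsed here.

```python
from collections import OrderedDict

# Hypothetical stand-in for what the YAML reader returns for the static file.
parsed = {"foo": {"bar": 1, "baz": [1, 2]}, "qux": "x"}

# Expected dictionary, written with the keys in a different order.
expected = {"qux": "x", "foo": {"baz": [1, 2], "bar": 1}}

# Plain dict equality ignores key order, so this assertion passes.
assert parsed == expected

# OrderedDict equality is order-sensitive for the top-level keys.
assert OrderedDict(parsed) != OrderedDict(expected)

# With matching top-level order the OrderedDicts compare equal; the nested
# values are still plain dicts, so their key order is not checked.
assert OrderedDict(parsed) == OrderedDict([("foo", expected["foo"]), ("qux", "x")])
```

So a plain `assert parsed == expected` is enough when only the contents matter, and OrderedDict is worth using only if the test should also pin down key order.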

HenryRWinterbottom commented 2 years ago

Sorry -- I wasn't a reviewer. Regardless, it looks good to me. Welcome to the world of Python.

jswhit commented 2 years ago

let's see if a simple github actions workflow can be set up to list some directories and/or copy some files on s3

HenryRWinterbottom commented 2 years ago

@frolovsa @jswhit https://github.com/marketplace/actions/s3-sync

jswhit commented 2 years ago

@HenryWinterbottom-NOAA not sure that's what is needed, since that action does all the uploading and copying itself. It seems like we want to use the python modules in this repo to do that from within a github action.

HenryRWinterbottom commented 2 years ago

@jswhit Ah, OK. My fault. I misunderstood.

HenryRWinterbottom commented 2 years ago

@frolovsa One more comment and I promise the last from me.

If you want to stay consistent with the UFS-RNR code base, the module https://github.com/noaa-psd/UFS-RNR/blob/feature/ush_tools/ush/tools/datetime_interface.py does a lot of the heavy lifting with respect to time and date attributes.

You can absorb this into your environment by defining the PYTHONPATH environment variable.

frolovsa commented 2 years ago

@HenryWinterbottom-NOAA I am not too concerned about consistency across packages yet. I see this as more of a scripting aid with a limited shelf life rather than a big package we will maintain for a while. I took a look at your date utils, and I will definitely borrow them if the need arises.