dtcenter / METplus

Python scripting infrastructure for MET tools.
https://metplus.readthedocs.io
Apache License 2.0

Update IODA use cases to automatically obtain IODA file from JCSDA #2393

Open DanielAdriaansen opened 11 months ago

DanielAdriaansen commented 11 months ago

Describe the Enhancement

There are two current IODA use cases: https://metplus.readthedocs.io/en/latest/generated/model_applications/data_assimilation/StatAnalysis_fcstHAFS_obsPrepBufr_JEDI_IODA_interface.html#sphx-glr-generated-model-applications-data-assimilation-statanalysis-fcsthafs-obsprepbufr-jedi-ioda-interface-py

https://metplus.readthedocs.io/en/latest/generated/model_applications/data_assimilation/StatAnalysis_fcstGFS_HofX_obsIODAv2_PyEmbed.html#sphx-glr-generated-model-applications-data-assimilation-statanalysis-fcstgfs-hofx-obsiodav2-pyembed-py

This issue is to:
1) Determine whether multiple "versions" of IODA need to be supported, or whether a single use case for the latest/current version of IODA is sufficient.
2) Modify the IODAv2 use case to ingest the latest IODA file from JCSDA automatically during CI, so that CI catches any errors caused by the new IODA file and prompts the necessary changes.

Time Estimate

TBD

jeromebarre commented 11 months ago

@climbfuji As discussed internally before, we need to generate an example IODA file from our CI testing and either store it in an accessible location or push it directly to DTC.

jeromebarre commented 11 months ago

The FTP push option is described here: https://github.com/dtcenter/METplus/discussions/954

climbfuji commented 11 months ago

I think I can make ftp work, although S3 is a nicer option. @jeromebarre Please provide me with the name of an output file and the experiment that generates it (ideally one that already runs in CI nightly, see https://github.com/JCSDA-internal/skylab/blob/develop/.github/workflows/run_ec2_pcluster.yaml). Thanks!

DanielAdriaansen commented 11 months ago

Here is the current IODAv2 file we are using: https://dtcenter.ucar.edu/dfiles/code/METplus/METplus_Data/v6.0/model_applications/data_assimilation/StatAnalysis_fcstGFS_HofX_obsIODAv2_PyEmbed/sample_hofx_output_sondes.nc4

jeromebarre commented 11 months ago

We would need to provide the fb IODA files (there should normally be 3) that are produced by gfs-3dfgat-c12 once the variational task is complete.

jeromebarre commented 11 months ago

> I think I can make ftp work, although S3 is a nicer option. @jeromebarre Please provide me with the name of an output file and the experiment that generates it (ideally one that already runs in CI nightly, see https://github.com/JCSDA-internal/skylab/blob/develop/.github/workflows/run_ec2_pcluster.yaml). Thanks!

We just had a meeting and ruled out the FTP option; an S3 bucket is the better option.

climbfuji commented 11 months ago

@jeromebarre I need to know the exact names of the files and their locations within the experiment. Is it these? Which one of those do you want to upload to an S3 bucket (easiest is all of them)?

(venv) ubuntu@ip-10-0-1-189:~/skylab/manual_run$ ls -lart workdir/f72620/20201215T000000Z/obs/
total 19736
-rw-r--r-- 1 ubuntu ubuntu    68495 Nov  6 16:36 obs.AMSUA_N19.20201215T000000Z.nc4
-rw-r--r-- 1 ubuntu ubuntu   314639 Nov  6 16:36 obs.Sondes.20201215T000000Z.nc4
-rw-r--r-- 1 ubuntu ubuntu   379411 Nov  6 16:36 obs.Aircraft.20201215T000000Z.nc4
-rw-rw-r-- 1 ubuntu ubuntu  4262627 Nov  6 16:38 fb.Aircraft.20201215T000000Z.nc4
-rw-rw-r-- 1 ubuntu ubuntu 10238866 Nov  6 16:38 fb.AMSUA_N19.20201215T000000Z.nc4
drwxrwxr-x 2 ubuntu ubuntu     4096 Nov  6 16:38 .
-rw-rw-r-- 1 ubuntu ubuntu  4900380 Nov  6 16:38 fb.Sondes.20201215T000000Z.nc4
drwxrwxr-x 4 ubuntu ubuntu    28672 Nov  6 16:38 ..

climbfuji commented 11 months ago

@jeromebarre Can you please point me to the corresponding issue in JCSDA/JCSDA-internal? Thanks.

climbfuji commented 11 months ago

This has been added; please see https://github.com/JCSDA-internal/skylab/pull/191 for the PR (if you can access it). The data is available at s3://jedi-test-files/ci/met-ioda

DanielAdriaansen commented 11 months ago

@climbfuji I cannot view the PR, and I am also having difficulty accessing the S3 bucket. Is it public? If not, let me know how you want to handle that.

climbfuji commented 11 months ago

> @climbfuji I cannot view the PR, and I am also having difficulty accessing the S3 bucket. Is it public? If not, let me know how you want to handle that.

It's public, but I think the easiest way for you to access the data is via http (curl, wget):

https://jedi-test-files.s3.amazonaws.com/ci/met-ioda/fb.Aircraft.20201215T000000Z.nc4
https://jedi-test-files.s3.amazonaws.com/ci/met-ioda/fb.AMSUA_N19.20201215T000000Z.nc4
https://jedi-test-files.s3.amazonaws.com/ci/met-ioda/fb.Sondes.20201215T000000Z.nc4

DanielAdriaansen commented 11 months ago

Thanks! We were able to access and open the files.
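For reference, the HTTP access suggested above could be scripted for a CI step roughly as follows. This is only a minimal sketch using the Python standard library: the base URL and the three file names are copied from the links posted in this thread, while the helper names `build_urls` and `fetch_ioda_files` and the destination directory are hypothetical and not part of METplus or any JCSDA tooling.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen

# Base URL and file names are taken from the links posted in this thread.
BASE_URL = "https://jedi-test-files.s3.amazonaws.com/ci/met-ioda"
FILE_NAMES = [
    "fb.Aircraft.20201215T000000Z.nc4",
    "fb.AMSUA_N19.20201215T000000Z.nc4",
    "fb.Sondes.20201215T000000Z.nc4",
]


def build_urls(base_url=BASE_URL, file_names=FILE_NAMES):
    """Return the full HTTP URL for each file name."""
    return [f"{base_url.rstrip('/')}/{name}" for name in file_names]


def fetch_ioda_files(dest_dir, urls=None, timeout=60):
    """Download each URL into dest_dir and return the local paths.

    Hypothetical helper for illustration; not an existing METplus function.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    local_paths = []
    for url in urls or build_urls():
        target = dest / url.rsplit("/", 1)[-1]
        # urlopen raises HTTPError/URLError on failure, so a missing or
        # inaccessible file would stop the script and fail the CI step.
        with urlopen(url, timeout=timeout) as response, open(target, "wb") as out:
            shutil.copyfileobj(response, out)
        local_paths.append(target)
    return local_paths
```

Calling `fetch_ioda_files("ioda_input")` (the directory name is an arbitrary choice) would place the three files in that directory. Whether the use case should pull all three files or only the sondes file is still open per the discussion above.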