JCSDA-internal / ioda-converters

Various converters for getting obs data in and out of IODA
8 stars 2 forks source link

Make IMSFV3 converter more robust #1497

Closed CoryMartin-NOAA closed 1 month ago

CoryMartin-NOAA commented 2 months ago

https://github.com/JCSDA-internal/ioda-converters/blame/0bc00802515fe5834fb70311bfcac3f4cf12d349/src/land/imsfv3_scf2ioda.py#L101

This tries to get the date from the filename, and works fine if:

/path/to/file/20240416_sample.nc4 but if one is doing something like:

/path/to/jenny/86753099/work/20240416_sample.nc4 it will find the wrong 8 digit integer.

First found by @aerorahul as part of global-workflow CI, as the hash happened to be an 8 digit integer.

jiaruidong2017 commented 2 months ago

We can add a line to find the basename (20240416_sample.nc4) first and then get the date from the basename.

YoulongXia-NOAA commented 2 months ago

The best way is IMS processing code to output time into nc4 file, the ioda converter read time directly from there. This also happened for SMAP ioda converter. We finally solve it by reading date and time from hd5 file directly. @jiaruidong2017, if you have experience to add a line to find the basename and then get the date, that is great. It should not be hard to add time into the repocessed nc4 file from ims processing code. Here I include @ClaraDraper-NOAA here.

CoryMartin-NOAA commented 2 months ago

I agree @YoulongXia-NOAA it's dangerous to infer from filenames and if we can put it in the file itself that would be the best solution.

CoryMartin-NOAA commented 2 months ago

@WalterKolczynski-NOAA suggested here: https://github.com/NOAA-EMC/global-workflow/pull/2484#issuecomment-2059851873 some regex that would be a quick fix.

WalterKolczynski-NOAA commented 2 months ago

My solution there only works if the date is always delimited by periods. If that isn't the case (as in the example above), it won't work as-is. I agree with @YoulongXia-NOAA reading the date directly from the file is the best long-term solution.

YoulongXia-NOAA commented 2 months ago

I am working on this issue. Example data has be produced and the PR (https://github.com/NOAA-PSL/land-IMS_proc/pull/5) is under review. Offline modified imsfv3_scf2ioda.py which reads time directly from data file has been checked and it works. Now I am planning to work on iodacov within ioda-bundle to update imsfv3_scf2ioda.py.