JCSDA-internal / ioda-converters

Various converters for getting obs data in and out of IODA
9 stars 2 forks source link

GMAO Converter Update #1528

Closed Dooruk closed 5 days ago

Dooruk commented 1 month ago

Previous GMAO converter code was ineffective and slow, this update resolves that issue.

Issue(s) addressed

Resolves #1218

Checklist

Dooruk commented 1 month ago

the main improvements besides readability is the replacement of an explicit loop with use of numpy where, any other nugget you'd like to pass on that gave significant performance improvement

Yes, preallocating full arrays with the _FillValue helped a lot on the performance. I tested it with a SMAP file containing upwards of 100k observations and conversion was instant.

I added @srherbener as a reviewer but i wasn't able to add @PatNichols for some reason, not sure who would be more appropriate for the approval.

srherbener commented 1 month ago

Note that Pat Nichols has left JCSDA (which is why you can't add him to the reviewer list).

BenjaminRuston commented 1 month ago

I'll add @travissluka as another set of eyes

BenjaminRuston commented 1 month ago

@srherbener didn't this get cleaned up, can you review your change request and remove if now obsolete

Dooruk commented 1 month ago

Following the comments on the QC flags, I will make some additional changes to this. Also I will add some fixes to the time dimension after we made some changes in the ODAS outputs. Sorry, didn't want to bother folks with more notifications until I made those changes and can't seem to change this to draft.

srherbener commented 1 month ago

@Dooruk under the Reviewers list (right sidebar) there is a link you can click on to convert this to a draft PR. The link reads "Still in progress? Convert to draft". Or is the issue that you tried the link and you don't have permission to change the PR status? If it's the latter let me know and I can change this PR to a draft. Thanks!

Dooruk commented 1 month ago

@Dooruk under the Reviewers list (right sidebar) there is a link you can click on to convert this to a draft PR. The link reads "Still in progress? Convert to draft". Or is the issue that you tried the link and you don't have permission to change the PR status? If it's the latter let me know and I can change this PR to a draft. Thanks!

Didn't see that my bad, now it's merely a draft.

srherbener commented 1 month ago

No worries, I always have to search around to find that link. They should make it more obvious.

Dooruk commented 2 weeks ago

This is ready for review again. One change I introduced is that now it uses the datetime information from the input data, which wasn't the case before.

Our GMAO/ODAS pre-processing is not as sophisticated to apply @gthompsnJCSDA's suggestions but maybe in the future we could once we improve upon it:)

https://github.com/JCSDA-internal/ioda-converters/issues/1218#issuecomment-2229239684

huishao-r commented 1 week ago

@srherbener would you please take another look given you commented before?

BenjaminRuston commented 1 week ago

really @gthompsnJCSDA 2 as default that very counter intuitive to me all observations even those that are 0 undergo QC anything non-zero is consider a fail imho, the use of 2 is non-generic and non-standard and refers to a NOAA-EMC table? never heard of this table and it sounds again very specific and do not wish to put in agency specifics

@Dooruk do you know any more on this. is my thinking off, am I missing something?

gthompsnJCSDA commented 1 week ago

really @gthompsnJCSDA 2 as default that very counter intuitive to me all observations even those that are 0 undergo QC anything non-zero is consider a fail imho, the use of 2 is non-generic and non-standard and refers to a NOAA-EMC table? never heard of this table and it sounds again very specific and do not wish to put in agency specifics

@Dooruk do you know any more on this. is my thinking off, am I missing something?

In the "big picture" view, it really doesn't matter what anyone picks for the starting value, but very typical use in files converted from both EMC and NASA have used the value 2 since they are anchored mostly in prepBUFR use. I think I explained it pretty well. No one has to agree, but typical usage/convention of these two groups would seem better than switching things arbitrarily even if we don't agree on the rather ancient Table 7. Nothing makes JEDI's QCflags.h anything special either.

BenjaminRuston commented 1 week ago

i've re-read the comments and agree with @Dooruk on this one.

a PreQC should not be limited to those from the data provider (often in the form of quality bits) but can also be physical reality checks such as those Dooruk mentioned.

further the PreQC with a value of 0 should NOT be assumed good and subsequent checks in the UFO should always be performed.

Dooruk commented 5 days ago

If 0 still has to go through UFO machinery that is good enough for me, at least for now. Appreciate the discussion!

Thanks again @gthompsnJCSDA. I shared this discussion and the NCEP table with our obs groups to benefit people like me who weren't aware of this standard.