JCSDA-internal / ioda-converters

Various converters for getting obs data in and out of IODA
9 stars 2 forks source link

Changes to converters for VarBC files #1401

Closed CoryMartin-NOAA closed 9 months ago

CoryMartin-NOAA commented 10 months ago

This PR is intended to serve as a discussion for the new format of the VarBC files based on what was originally discussed in https://github.com/orgs/JCSDA-internal/discussions/70

This PR will be merged into the VarBC sprint branch and then, once the file format is agreed upon, final I/O changes will be made in UFO to be consistent.

CoryMartin-NOAA commented 10 months ago

It looks like we cannot have Variable or Record as both a dimension and a string? I think we can only put integers into the dimension variables, unless there is an IODA trick I don't know about.

mikecooke77 commented 10 months ago

Changes look good to me. The only point I was unsure about was the use of "Variable" as a name with the aircraft files. I understand Channels doesn't make sense. Is there an alternative name that is more descriptive?

srherbener commented 10 months ago

It looks like we cannot have Variable or Record as both a dimension and a string? I think we can only put integers into the dimension variables, unless there is an IODA trick I don't know about.

Ioda should be okay with a dimension being int, int64 or float. Technically you can create a dimension of type string, but I think you might run into issues downstream with that. I'm saying this mostly because we haven't explored using a string type for a dimension. I would recommend avoiding string for a dimension for now, but if a dimension of type string is valuable we can work on supporting that. For now would it work to do something like make "Variable" and "Record" int, and then provide string metadata variables that hold the mapping from the int values to the string values?

CoryMartin-NOAA commented 10 months ago

Thanks @srherbener, I currently have what you suggest, a dimension of type int and a string variable of size of the dimension. @mikecooke77 it is currently str Variables(Variable). I'm not too crazy about this as it seems confusing, but I don't know what to use. Would simulatedVariables be better or is that too verbose?