JohnLCaron / cdm-kotlin


Variable and fixed length Strings #109

Closed JohnLCaron closed 1 year ago

JohnLCaron commented 1 year ago

Consider making CHAR mean fixed-length string. Use ArrayUByte.convertToStrings(charset?) to convert. Does ArrayUByte then need to know the default charset when it's read from a Variable of type CHAR? Or do we keep the charset on the Variable and Attribute?

CHAR is converted to String for Attributes. Can we keep the underlying UBytes, or is that too much trouble?

Then String means variable-length string from the user's point of view.

Compound types need to distinguish between the two, mostly for storage purposes.
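
A minimal sketch of the conversion idea above, assuming the CHAR data arrives as a flat byte array of equal-width slots padded with zero bytes. The function name `fixedLengthStrings` and its parameters are hypothetical, and it uses a plain `ByteArray` rather than the library's `ArrayUByte`. The charset is passed in by the caller, which is one possible answer to the question of where the charset should live.

```kotlin
import java.nio.charset.Charset

// Hypothetical helper for illustration; not the cdm-kotlin ArrayUByte API.
// Splits a flat byte array into fixed-width slots and decodes each slot.
fun fixedLengthStrings(
    bytes: ByteArray,
    strlen: Int,
    charset: Charset = Charsets.UTF_8,
): List<String> {
    require(strlen > 0) { "strlen must be positive" }
    require(bytes.size % strlen == 0) { "byte count must be a multiple of strlen" }
    return (bytes.indices step strlen).map { start ->
        val slot = bytes.copyOfRange(start, start + strlen)
        // Assumption: slots are padded with zero bytes, so stop at the first one.
        val end = slot.indexOf(0.toByte()).let { if (it < 0) strlen else it }
        String(slot, 0, end, charset)
    }
}
```

Keeping the charset as a call-site parameter means the byte array itself does not need to know it, at the cost of the caller having to supply it.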

JohnLCaron commented 1 year ago

Keep the API datatype and storage datatype separate.

JohnLCaron commented 1 year ago
  1. char attributes are always assumed to be Strings.
  2. Netcdf-3 often uses char to mean ubyte, since there is no unsigned byte type. Therefore, char variables return ArrayUByte. If the user wants strings, they call fun ArrayUByte.makeStringsFromBytes(): ArrayString.
  3. Netcdf-4 also covers Netcdf-3, so we need to allow for CHAR variables that aren't STRING. Apparently it encodes CHARs as fixed-length strings with elemSize = 1, so we use that to detect CHAR. There is a possible ambiguity with a String of size 1. So far we haven't been tripped up by that.
  4. HDF5 doesn't have Char, but it does have both vlen and fixed-length String types. The user shouldn't care which; the library has to track that internally.
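
Points 3 and 4 could be sketched roughly as follows. All names here (`UserType`, `StoredString`, `userTypeOf`) are hypothetical and are not taken from cdm-kotlin. The sketch only illustrates keeping the storage form internal while exposing a single user-facing type.

```kotlin
// Hypothetical types for illustration only; not the cdm-kotlin API.
enum class UserType { CHAR, STRING }

// How the string is stored in the file; tracked internally by the library.
sealed class StoredString {
    data class Fixed(val elemSize: Int) : StoredString()
    object Vlen : StoredString()
}

// Maps the storage form to what the user sees.
fun userTypeOf(stored: StoredString): UserType = when (stored) {
    // Assumption from point 3: a fixed-length string with elemSize = 1 is treated as CHAR.
    // A genuine String of size 1 would be misclassified; that is the ambiguity noted above.
    is StoredString.Fixed -> if (stored.elemSize == 1) UserType.CHAR else UserType.STRING
    StoredString.Vlen -> UserType.STRING
}
```

With this shape, both vlen and fixed-length storage with elemSize greater than 1 surface as STRING, so the user never has to know which one the file used.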
JohnLCaron commented 1 year ago

PR #112 fixes this.