ADAPT / Standard

ADAPT Standard data model issue management
https://adaptstandard.org
MIT License

Floating Point Type Precision and Variable Units of Measure #128

Open knelson-farmbeltnorth opened 11 months ago

knelson-farmbeltnorth commented 11 months ago

Up to this time, the ADAPT Standard committee has agreed that

  1. All units of measure on ADAPT Standard Data Type Representations will be in SI base units. E.g., m3/m2 for volume/area.
  2. We will only support Integer and Double data types in GeoTIFF & GeoParquet.

In working to create examples, I mocked up the attached example field with a significant number of attributes, simulating seed rate data from a planter with independent sensors per row. The difference in file size between the seed rate data as doubles vs. floats (singles) is significant.

So long as we require SI-based units, we are obliged to manage data in doubles: values may be very large or very small, and single precision will almost certainly be inadequate. However, if we curate the units of measure as we have curated the known data types, can we not choose a unit that will result in values in expected ranges? That is, can we not choose units that will make singles/floats viable?
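To make the trade-off concrete, here is a rough sketch (not taken from the attached example) that measures how much a value changes when it is stored as a single instead of a double. The sample values and the helper names `to_float32` and `relative_error` are my own assumptions for illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def relative_error(x: float) -> float:
    """Relative change introduced by storing x as a single."""
    return abs(to_float32(x) - x) / abs(x)


# Bytes needed per stored value.
bytes_single = struct.calcsize("<f")
bytes_double = struct.calcsize("<d")

# Hypothetical sample values (assumptions, not real planter data):
seeds_per_m2 = 8.0123456789          # a seed rate expressed per square metre
seeds_per_ha = seeds_per_m2 * 10_000  # the same rate expressed per hectare
volume_per_area = 1.87e-5             # a liquid rate in m3/m2

for name, value in [
    ("seeds/m2", seeds_per_m2),
    ("seeds/ha", seeds_per_ha),
    ("m3/m2", volume_per_area),
]:
    print(f"{name}: double={value!r} single={to_float32(value)!r} "
          f"rel_err={relative_error(value):.2e}")

print(f"bytes per value: single={bytes_single}, double={bytes_double}")
```

A single keeps roughly seven significant decimal digits, so whether that is adequate depends on how many digits a given data type representation actually needs in the chosen unit.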

float_v_double.zip

knelson-farmbeltnorth commented 10 months ago

Rather than storing doubles in base SI units, which may not be particularly understandable without conversion, we could use 32-bit integers with a defined offset per data type representation (as ISO 11783-10 binary files do). This may be a practical way to avoid the precision loss that comes with singles.
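As a minimal sketch of that idea, assuming each data type representation carries a fixed scale and offset: the `SCALE` and `OFFSET` values and the `encode`/`decode` helpers below are hypothetical and are not the ISO 11783-10 layout.

```python
import struct

# Hypothetical per-representation parameters (assumptions for illustration).
SCALE = 0.001   # physical units represented by one integer step
OFFSET = 0.0    # physical value represented by the integer 0


def encode(value: float) -> bytes:
    """Quantize a physical value to a little-endian signed 32-bit integer."""
    raw = round((value - OFFSET) / SCALE)
    # struct.pack raises struct.error if raw does not fit in 32 bits.
    return struct.pack("<i", raw)


def decode(data: bytes) -> float:
    """Recover the physical value from its 32-bit integer encoding."""
    (raw,) = struct.unpack("<i", data)
    return raw * SCALE + OFFSET


sample = 8.0123  # hypothetical seed rate
stored = encode(sample)
restored = decode(stored)
print(len(stored), restored)
```

With this scheme the error is bounded by half a step (`SCALE / 2`) across the whole representable range, at the cost of having to agree on a scale and offset for every representation.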

knelson-farmbeltnorth commented 10 months ago

Agreed in the 8 November 2023 meeting that we will keep all floating-point values as doubles. The storage optimization from GeoParquet alone is significant enough, and keeping things as doubles eliminates several issues.