kaiiam / mifc

A minimum information standard checklist formalizing the description of food composition data and related metadata.
MIT License

Float constraint for longitude and latitude runs into problems with degree sign and N/S E/W #14

Open GregLyonJIFSAN opened 2 weeks ago

GregLyonJIFSAN commented 2 weeks ago

It looks like the PTFI metadata values for longitude and latitude are formatted in a way that is not compatible with the range constraint in mifc.yaml. The Foundation Foods QC platform logged the following error when trying to convert the TSV sample data to JSON using the linkml-convert utility:

ValueError: could not convert string to float: '37.2296° N'

Maybe we could make this field a string and validate it using a regex? Alternatively, it could be a tuple with a float component for degrees and a character for N/S/E/W. I don't know if we would expect values expressed in degrees, minutes, seconds, but that might be another possibility to keep in mind.
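As a rough illustration of the regex and tuple ideas above, here is a minimal Python sketch. The pattern and the function name `parse_coordinate` are hypothetical and not part of mifc or linkml. It assumes unsigned decimal degrees, an optional degree sign, and a single hemisphere letter, and it does not handle degrees, minutes, seconds.

```python
import re
from typing import Tuple

# Hypothetical pattern: unsigned decimal degrees, an optional degree sign,
# then exactly one hemisphere letter (N, S, E, or W).
COORD_PATTERN = re.compile(
    r"^\s*(?P<degrees>\d+(?:\.\d+)?)\s*°?\s*(?P<hemisphere>[NSEW])\s*$"
)


def parse_coordinate(text: str) -> Tuple[float, str]:
    """Return (degrees, hemisphere), or raise ValueError if the text does not match."""
    match = COORD_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized coordinate format: {text!r}")
    return float(match.group("degrees")), match.group("hemisphere")


print(parse_coordinate("37.2296° N"))  # (37.2296, 'N')
```

The same pattern could serve purely as a validator if the field became a string, while the returned tuple corresponds to the float-plus-character alternative.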

These are the relevant fields in mifc.yaml:

  food_acquisition_latitude: #TODO add numeric range constraint
    description: A float representing the latitude of the place from which the primary_food_type was acquired.
    range: float
    exact_mappings:
      - PTFI:Collection_Latitude
  food_acquisition_longitude: #TODO add numeric range constraint
    description: A float representing the longitude of the place from which the primary_food_type was acquired.
    range: float
    exact_mappings:
      - PTFI:Collection_Longitude

kaiiam commented 2 weeks ago

Thanks for raising the issue, @GregLyonJIFSAN. It's true that latitude and longitude can be expressed in several different ways, for example as plain decimals or as decimals with a degree sign and a cardinal direction (North, South, East, West). The current implementation expects the plain decimal degree version.

One potential solution is to require everyone to use that format. Perhaps a better solution would be to have separate attributes for the lat/long units. This would mirror the pattern of component_measurement_unit, which pairs with the slot component_recorded_value.

The data item above, '37.2296° N', would need to be split into its value and its unit. I'd propose these two slots:

food_acquisition_latitude: 37.2296
food_acquisition_latitude_unit: deg{N}

or similar if using the UCUM system, which the standard already uses.
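For illustration only, a preprocessing step that produces the value and unit pair might look like the sketch below. The function name `split_latitude` is hypothetical, `food_acquisition_latitude_unit` is only the slot proposed in this comment (it is not in mifc.yaml), and the `deg{N}` unit string is the tentative form suggested above rather than a settled choice.

```python
def split_latitude(raw: str) -> dict:
    """Split a value such as '37.2296° N' into a numeric slot and a unit slot."""
    # Everything before the degree sign is the number, everything after it
    # is the direction. If there is no degree sign, direction is empty.
    number, _, direction = raw.partition("°")
    direction = direction.strip().upper()
    if direction not in ("N", "S"):
        raise ValueError(f"expected a latitude ending in N or S, got {raw!r}")
    return {
        "food_acquisition_latitude": float(number),
        # Tentative UCUM-style unit string; the exact form is an assumption.
        "food_acquisition_latitude_unit": f"deg{{{direction}}}",
    }


print(split_latitude("37.2296° N"))
# {'food_acquisition_latitude': 37.2296, 'food_acquisition_latitude_unit': 'deg{N}'}
```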

Thoughts @GregLyonJIFSAN ?

GregLyonJIFSAN commented 2 weeks ago

Hello @kaiiam. I did a little research this morning, and it looks like one advantage of sticking with decimal degrees is that they are used in GIS software and also by the Google Maps and OpenStreetMap APIs. For example, JIFSAN has a map of training locations, and these are stored as decimal degrees in our database with the +/- sign representing N/S and E/W:

https://jifsan.umd.edu/monitor/map

If we are planning on working with the food sample data on an automated map or with GIS, it might be more convenient to continue using decimal degrees. It would be easy to convert N/S/E/W to the appropriate sign in the QC application. Do you have a sense of how FDC will be using the location data?
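A minimal sketch of that sign conversion is shown here, assuming the value has already been separated into an unsigned number and a hemisphere letter. The function name `to_signed_degrees` is hypothetical and this is not code from the QC application. It follows the usual convention that north and east are positive, and the longitude value in the usage line is made up for illustration.

```python
def to_signed_degrees(degrees: float, hemisphere: str) -> float:
    """Convert unsigned degrees plus N/S/E/W into signed decimal degrees."""
    # Maximum magnitude per hemisphere: latitude up to 90, longitude up to 180.
    limits = {"N": 90.0, "S": 90.0, "E": 180.0, "W": 180.0}
    hemisphere = hemisphere.strip().upper()
    if hemisphere not in limits:
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    if not 0.0 <= degrees <= limits[hemisphere]:
        raise ValueError(f"{degrees} is out of range for hemisphere {hemisphere}")
    # South and west are negative; north and east stay positive.
    return -degrees if hemisphere in ("S", "W") else degrees


print(to_signed_degrees(37.2296, "N"))  # 37.2296
print(to_signed_degrees(80.4139, "W"))  # -80.4139 (illustrative value)
```

The range check doubles as the numeric constraint that the TODO comments in mifc.yaml ask for: signed latitude stays within -90 to 90 and signed longitude within -180 to 180.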

Alternatively, if making the unit explicit is more important, we can use the two slots you proposed above.

I think the location of the Prime Meridian is standard, but we could mention it in mifc.yaml. It looks like this is the current one used by GPS:

https://en.wikipedia.org/wiki/IERS_Reference_Meridian

Please let me know what you think.