tdwg / gbwg

Genomic Biodiversity Interest Group
Apache License 2.0

DwC Mapping - MIXS:0000009 lat_lon #12

Closed tucotuco closed 3 years ago

tucotuco commented 3 years ago
Field Value
subject_id http://rs.tdwg.org/dwc/terms/verbatimCoordinates
subject_definition The verbatim original spatial coordinates of the Location. The coordinate ellipsoid, geodeticDatum, or full Spatial Reference System (SRS) for these coordinates should be stored in verbatimSRS and the coordinate system should be stored in verbatimCoordinateSystem.
subject_usage_notes
subject_examples 41 05 54S 121 05 34W, 17T 630000 4833400
predicate_id skos:broadMatch
object_id MIXS:0000009
object_label lat_lon
object definition The geographical origin of the sample as defined by latitude and longitude. The values should be reported in decimal degrees and in WGS84 system
object source https://github.com/GenomicsStandardsConsortium/mixs-legacy/blob/master/mixs5/mixs_v5.xlsx
comment The Coordinate Reference System "epsg:4326" must be provided explicitly in dwc:verbatimSRS in order for the mapping to be complete.

ymgan commented 3 years ago

Related ticket of this field for MIxS v6: https://github.com/GenomicsStandardsConsortium/mixs/issues/62

raissameyer commented 3 years ago

Note for future self: For a reverse mapping (MIxS-DwC) in the future, we should also capture dwc:decimalLatitude and dwc:decimalLongitude.

msweetlove commented 3 years ago

Is it at all possible to map MIxS:lat_lon to DwC:decimalLatitude and DwC:decimalLongitude? Reasoning: the value for lat_lon is restricted to decimal degrees, as are decimalLatitude and decimalLongitude, while verbatimCoordinates can (and does) contain all kinds of different notations for coordinates. This makes indexing or working with any of the Darwin Core verbatim fields extremely difficult, so they are best avoided when a more standardized alternative is available. Since in MIxS:lat_lon the order of the values (first lat, then lon) and the separator (a space) are fixed, I think a mapping to DwC:decimalLatitude and DwC:decimalLongitude could be made possible. It would result in a narrower match and would remove the need to take DwC:verbatimSRS into account, as noted in the comment field.
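
To illustrate the split described above, here is a minimal sketch in Python. It assumes lat_lon is exactly two space-separated floats (lat first, then lon), as stated in this thread. The function name `split_lat_lon` and the choice to write "epsg:4326" into geodeticDatum are assumptions made for illustration, not part of either standard.

```python
def split_lat_lon(lat_lon: str) -> dict:
    """Split a MIxS lat_lon string into Darwin Core style fields.

    Assumes the '{float} {float}' syntax discussed in this thread:
    latitude first, then longitude, separated by whitespace.
    """
    parts = lat_lon.split()
    if len(parts) != 2:
        raise ValueError(f"expected two values, got {len(parts)}: {lat_lon!r}")
    lat, lon = float(parts[0]), float(parts[1])
    # Legal ranges as given in the Darwin Core definitions quoted in this issue.
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    return {
        "decimalLatitude": lat,
        "decimalLongitude": lon,
        # Assumption: MIxS defines lat_lon as WGS84, so the datum is set explicitly.
        "geodeticDatum": "epsg:4326",
    }
```

For example, `split_lat_lon("-41.0983423 -121.1761111")` yields one latitude, one longitude, and the datum label, which is the narrower pair of fields proposed here.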

tucotuco commented 3 years ago

It seems like we need to be careful about the direction of the mapping. Anything that is not an exactMatch will have very different concerns in mapping based on the direction. The mappings I gave are all from MIxS to Darwin Core. As with many of the mappings, I chose the exactMatch where there was one, because all others are indirect and will require manipulation of the data. Going from DwC to MIxS, the decimalLatitude and decimalLongitude can be combined to populate lat_lon, but only if the dwc:geodeticDatum is epsg:4326 (WGS84). Any other mapping would require a datum transformation.
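
The DwC-to-MIxS direction described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `to_lat_lon`, the plain-dict record, and the set of strings accepted as WGS84 labels are all hypothetical, and records with any other datum are simply skipped rather than transformed.

```python
from typing import Optional

# Assumption: these spellings are treated as WGS84; real data may use others.
WGS84_LABELS = {"epsg:4326", "wgs84"}


def to_lat_lon(record: dict) -> Optional[str]:
    """Combine decimalLatitude and decimalLongitude into a MIxS lat_lon string.

    Returns None when geodeticDatum is missing or is not WGS84, because any
    other datum would need a datum transformation first (not done here).
    """
    datum = str(record.get("geodeticDatum", "")).replace(" ", "").lower()
    if datum not in WGS84_LABELS:
        return None
    lat = float(record["decimalLatitude"])
    lon = float(record["decimalLongitude"])
    # MIxS syntax from this thread: '{float} {float}', lat first, space-separated.
    return f"{lat} {lon}"
```

Returning None instead of guessing keeps the datum check explicit, so a caller can decide whether to run a transformation or leave lat_lon empty.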

raissameyer commented 3 years ago

Picking up on @msweetlove's comment, and introducing syntactic predicates as well, additional mapping options would be:

Field Value
subject_id http://rs.tdwg.org/dwc/terms/decimalLatitude
subject_definition The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. Positive values are north of the Equator, negative values are south of it. Legal values lie between -90 and 90, inclusive.
subject_value_syntax - expected_value - unit {float} - decimal degrees
subject_examples -41.0983423
predicate_id skos:broadMatch
syntax_predicate_id skos:broadMatch
object_id MIXS:0000009
object_label lat_lon
object definition The geographical origin of the sample as defined by latitude and longitude. The values should be reported in decimal degrees and in WGS84 system
object_value_syntax - expected_value - unit {float} {float} - decimal degrees
object source https://github.com/GenomicsStandardsConsortium/mixs-legacy/blob/master/mixs5/mixs_v5.xlsx
comment The DwC term only refers to half of the MIxS term
syntax_comment The DwC term expects decimal degrees, and so does MIxS - however, MIxS expects two input values, while DwC only expects one

and

Field Value
subject_id http://rs.tdwg.org/dwc/terms/decimalLongitude
subject_definition The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. Positive values are east of the Greenwich Meridian, negative values are west of it. Legal values lie between -180 and 180, inclusive.
subject_value_syntax - expected_value - unit {float} - decimal degrees
subject_examples -121.1761111
predicate_id skos:narrowMatch
syntax_predicate_id skos:narrowMatch
object_id MIXS:0000009
object_label lat_lon
object definition The geographical origin of the sample as defined by latitude and longitude. The values should be reported in decimal degrees and in WGS84 system
object_value_syntax - expected_value - unit {float} {float} - decimal degrees
object source https://github.com/GenomicsStandardsConsortium/mixs-legacy/blob/master/mixs5/mixs_v5.xlsx
comment The DwC term only refers to half of the MIxS term
syntax_comment The DwC term expects decimal degrees, and so does MIxS - however, MIxS expects two input values, while DwC only expects one

For completeness, dwc:verbatimLatitude and dwc:verbatimLongitude would also work:

Field Value
subject_id http://rs.tdwg.org/dwc/terms/verbatimLatitude
subject_definition The verbatim original latitude of the Location. The coordinate ellipsoid, geodeticDatum, or full Spatial Reference System (SRS) for these coordinates should be stored in verbatimSRS and the coordinate system should be stored in verbatimCoordinateSystem.
subject_value_syntax - expected_value - unit verbatim
subject_examples 41 05 54.03S
predicate_id skos:narrowMatch
syntax_predicate_id skos:relatedMatch
object_id MIXS:0000009
object_label lat_lon
object definition The geographical origin of the sample as defined by latitude and longitude. The values should be reported in decimal degrees and in WGS84 system
object_value_syntax - expected_value - unit {float} {float} - decimal degrees
object source https://github.com/GenomicsStandardsConsortium/mixs-legacy/blob/master/mixs5/mixs_v5.xlsx
comment The DwC term only refers to half of the MIxS term
syntax_comment MIxS expects two input values in decimal degrees, while DwC only expects one as verbatim (so anything really)

and

Field Value
subject_id http://rs.tdwg.org/dwc/terms/verbatimLongitude
subject_definition The verbatim original longitude of the Location. The coordinate ellipsoid, geodeticDatum, or full Spatial Reference System (SRS) for these coordinates should be stored in verbatimSRS and the coordinate system should be stored in verbatimCoordinateSystem.
subject_value_syntax - expected_value - unit verbatim
subject_examples 121d 10' 34" W
predicate_id skos:narrowMatch
syntax_predicate_id skos:relatedMatch
object_id MIXS:0000009
object_label lat_lon
object definition The geographical origin of the sample as defined by latitude and longitude. The values should be reported in decimal degrees and in WGS84 system
object_value_syntax - expected_value - unit {float} {float} - decimal degrees
object source https://github.com/GenomicsStandardsConsortium/mixs-legacy/blob/master/mixs5/mixs_v5.xlsx
comment The DwC term only refers to half of the MIxS term
syntax_comment MIxS expects two input values in decimal degrees, while DwC only expects one as verbatim (so anything really)
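
To make the syntax gap concrete, here is a rough Python sketch that converts the two verbatim examples above into decimal degrees. It only handles degree/minute/second notations with a hemisphere letter, like the examples in these tables. The function name `dms_to_decimal` is hypothetical, other verbatim notations are out of scope, and no datum transformation is attempted.

```python
import re


def dms_to_decimal(verbatim: str) -> float:
    """Convert a degrees/minutes/seconds string with a hemisphere letter
    (N, S, E or W) into signed decimal degrees.

    Covers forms such as '41 05 54.03S' and '121d 10\' 34" W' only.
    """
    text = verbatim.upper()
    numbers = re.findall(r"\d+(?:\.\d+)?", text)
    hemisphere = re.search(r"[NSEW]", text)
    if not 1 <= len(numbers) <= 3 or hemisphere is None:
        raise ValueError(f"unsupported verbatim coordinate: {verbatim!r}")
    # Pad missing minutes/seconds with zeros.
    parts = [float(n) for n in numbers] + [0.0, 0.0]
    degrees, minutes, seconds = parts[:3]
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative, as in decimalLatitude/decimalLongitude.
    return -value if hemisphere.group() in "SW" else value
```

Anything this sketch cannot parse raises an error, which reflects why the verbatim terms only get a relatedMatch on syntax: the values need case-by-case interpretation before they fit {float} {float}.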

raissameyer commented 3 years ago

Suggested syntax_predicates for dwc:verbatimCoordinates and MIXS:0000009:

Field Value
subject_id http://rs.tdwg.org/dwc/terms/verbatimCoordinates
subject_value_syntax - expected_value - unit verbatim
syntax_predicate_id skos:relatedMatch
object_id MIXS:0000009
object_value_syntax - expected_value - unit {float} {float} - decimal degrees
syntax_comment The DwC term expects a verbatim input (so anything really), whereas MIxS expects decimal degrees input as {float} {float}