tdwg / dwc-qa

Public question and answer site for discussions about Darwin Core
Apache License 2.0

Specific label for utm coordinates information #158

Open marcelooyaneder opened 4 years ago

marcelooyaneder commented 4 years ago

Hi, I'm writing code to convert a dataset that uses the label verbatimCoordinates into standard DwC (decimalLatitude, decimalLongitude). I've done some testing and it works great, but I need to give the code more information, e.g. the UTM zone and whether it is north or south.

I've read the label definitions on the page, but I couldn't find a term that suits this need. Any advice for this problem?

qgroom commented 4 years ago

I'm not sure I understand your question, but decimalLatitude is positive or negative depending on whether it is north or south. Details of the coordinate systems should be entered into the verbatimCoordinateSystem and/or geodeticDatum terms.

tucotuco commented 4 years ago

Hi Marcelo,

Apologies, but I am going to try to be thorough.

As I understand it you would like to know in which field(s) to capture the UTM coordinate information. Because UTM coordinates require the Zone in order to be unambiguously specified, the option to put these data in verbatimLatitude and verbatimLongitude is insufficient. Thus, the field verbatimCoordinates is the best option, accompanied by "UTM" in the field verbatimCoordinateSystem.

But that is not all you should do. This is a case where you might very easily make the data very precise or lose all that precision simply by not specifying three additional fields that might be trivial to populate in your case. The three fields are verbatimSRS, geodeticDatum, and coordinateUncertaintyInMeters.

The term verbatimSRS is meant to capture the coordinate reference system (CRS, also called a spatial reference system or SRS) of the verbatim coordinates, i.e., the UTM coordinates in this case. The best way to populate that is to get the EPSG code for the UTM coordinates, for example, EPSG:23032 for UTM Zone 32N using the European Datum of 1950. This is a CRS in common use with UTM coordinates in much of Europe and nearby regions. Note on the epsg.io reference page for that code that the CRS has an accuracy of 10m. Keep that in mind for the moment.

The term geodeticDatum is meant to capture the CRS of the decimalLatitude and decimalLongitude after the transformation is made from the UTM coordinates in their CRS. Here again it is best to use the EPSG code for the transformed coordinates. Unless you have a particular reason not to, it is best to use EPSG:4326 (WGS84) for the geodeticDatum and make the appropriate transformation for the verbatimSRS and UTM coordinates.

The term coordinateUncertaintyInMeters is meant to capture the maximum distance from the coordinates defined by decimalLatitude/decimalLongitude/geodeticDatum within which the actual Location might be found.
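
As a rough illustration of the field mapping described above, here is a minimal Python sketch. It assumes the UTM coordinates are referenced to WGS84, in which case the EPSG codes follow the pattern 326xx for northern zones and 327xx for southern zones; coordinates on another datum (such as the ED50 example, EPSG:23032) need their own code looked up. The function names and the sample coordinate string are hypothetical, and the actual transformation to decimalLatitude/decimalLongitude is left to a projection library and is not shown.

```python
def utm_wgs84_epsg(zone: int, hemisphere: str) -> str:
    """Return the EPSG code for a WGS84 UTM zone.

    ASSUMPTION: the UTM coordinates use the WGS84 datum.
    `hemisphere` is "N" or "S" (the hemisphere, not a latitude band letter).
    """
    if not 1 <= zone <= 60:
        raise ValueError(f"UTM zone must be between 1 and 60, got {zone}")
    hemi = hemisphere.strip().upper()
    if hemi not in ("N", "S"):
        raise ValueError(f"hemisphere must be 'N' or 'S', got {hemisphere!r}")
    base = 32600 if hemi == "N" else 32700
    return f"EPSG:{base + zone}"


def dwc_verbatim_fields(verbatim: str, zone: int, hemisphere: str) -> dict:
    """Build the Darwin Core fields that describe the verbatim UTM data.

    The verbatim string is passed through unchanged; the zone and
    hemisphere are captured unambiguously in verbatimSRS.
    """
    return {
        "verbatimCoordinates": verbatim,
        "verbatimCoordinateSystem": "UTM",
        "verbatimSRS": utm_wgs84_epsg(zone, hemisphere),
        # CRS of decimalLatitude/decimalLongitude after transformation
        "geodeticDatum": "EPSG:4326",
    }


# Hypothetical record: a zone 19 coordinate in the southern hemisphere
record = dwc_verbatim_fields("19 350000 6300000", 19, "S")
print(record["verbatimSRS"])
```

Keeping the zone and hemisphere in verbatimSRS rather than in a separate field means a converter can read everything it needs from existing Darwin Core terms.
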
The uncertainty is the combination of all sources that can contribute to inaccuracy or ambiguity of the final coordinates. The potential sources for uncertainty are many. In your case, you begin with UTM coordinates, presumably in a known CRS. If the CRS for the UTM coordinates is not known, that complicates matters immensely - the contribution of the uncertainty due to an unknown verbatimSRS if the geodeticDatum is set to WGS84 (EPSG:4326) is location-dependent and ranges from 1554m to 5358m. The only tool I know of to get that uncertainty is the Georeferencing Calculator (http://georeferencing.org/georefcalculator/gc.html), and a data set for this source of uncertainty by location can be found in the accompanying file https://github.com/VertNet/georefcalculator/blob/master/datumerrordata.js. If you know, or can find, the original CRS, you can avoid this complication. Instead, the contribution from the accuracy of the CRS would be the same for every record, and given by the characteristics of the CRS as found on epsg.io (e.g., for the EPSG:23032 example above, the accuracy of the CRS is given as 10m).

Another source of uncertainty for your case is the size of the UTM grid being used. If it is fully specified to the nearest meter, then the uncertainty due to the coordinate precision is 0.7m (half the diagonal of the 1m x 1m grid). That's not much, but if the precision is less, the associated uncertainty is proportionately greater (7m for a 10m grid, 71m for a 100m grid, 707m for a 1km grid, etc.).

Finally, the source of the UTM coordinates is also a source of uncertainty. If they were measured from a map, the scale of the map is a contributing factor, as is the ability of the person measuring to pinpoint the location. If the coordinates came from a GPS, then the accuracy of each coordinate taken is dependent on the conditions at the time they were recorded.
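
The grid-precision figures above (half the diagonal of a square grid cell) can be reproduced with a short Python sketch. The function names are hypothetical, and the simple addition of contributions is an assumption made here for illustration only; the Georeferencing Calculator remains the tool to rely on for real records.

```python
import math


def grid_precision_uncertainty_m(grid_size_m: float) -> float:
    """Half the diagonal of a square grid cell, in meters."""
    return grid_size_m * math.sqrt(2) / 2


def combined_uncertainty_m(*contributions_m: float) -> float:
    """ASSUMPTION: contributions are simply added (a conservative choice)."""
    return sum(contributions_m)


# Reproduce the figures quoted for 1 m, 10 m, 100 m and 1 km grids
for size in (1, 10, 100, 1000):
    print(f"{size} m grid: {grid_precision_uncertainty_m(size):.1f} m")

# Example: CRS accuracy of 10 m (as given for EPSG:23032) plus a 100 m grid
total = combined_uncertainty_m(10, grid_precision_uncertainty_m(100))
print(f"combined: {math.ceil(total)} m")
```

Rounding the combined value up rather than to the nearest meter keeps coordinateUncertaintyInMeters from understating the true uncertainty.
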
If the GPS accuracy was not recorded in the moment, then a default conservative value of 30m can be used, unless the readings were taken before 1 May 2000, in which case 100m is a better default (Zermoglio et al. 2020).

That's a lot of information, and may be way more than you sought (or not what you sought at all, in which case let me know), but we try to use these questions as the basis of further documentation of Darwin Core wherever needed.

The good thing about your question is that there is now extensive documentation solicited by GBIF going out very soon for public review. The most comprehensive is Georeferencing Best Practices (Chapman and Wieczorek 2020), and this is accompanied by the Georeferencing Quick Reference Guide (Zermoglio et al. 2020), and the Georeferencing Calculator Manual (Bloom et al. 2020). Everyone is invited to participate in commenting on those documents when the call comes out, to improve them and make them best represent community opinion. If anyone needs any of these documents before it is ready for public review, let me know by email and I can point you at a pre-review draft.

References

Bloom DA, Wieczorek JR, Zermoglio PF. 2020. Georeferencing Calculator Manual. Copenhagen: GBIF Secretariat. https://doi.org/10.35035/gdwq-3v93

Chapman AD, Wieczorek JR. 2020. Georeferencing Best Practices. Copenhagen: GBIF Secretariat. https://doi.org/10.15468/doc-gg7h-s853

Zermoglio PF, Chapman AD, Wieczorek JR, Luna MC, Bloom DA. 2020. Georeferencing Quick Reference Guide. Copenhagen: GBIF Secretariat. https://doi.org/10.35035/e09p-h128