AtlasOfLivingAustralia / DataQuality

Data Quality

Consider raising the Coordinate uncertainty threshold from 10km to 30km #244

Closed timhicks-ala closed 1 year ago

timhicks-ala commented 3 years ago

The ALA General data profile is currently configured to exclude records where coordinateUncertaintyInMeters exceeds 10000 (10km). This excludes records such as this one: https://biocache.ala.org.au/occurrences/search?q=qid%3A1629692518956&qualityProfile=ALA

That record's coordinate uncertainty is 28.7km because Geoprivacy was enabled for it on iNaturalist. Geoprivacy creates a 0.2 by 0.2 degree area around the sighting, leading to (anecdotally) just under 30km of uncertainty. There are over 200k iNaturalist records in the ALA currently excluded on this criterion: https://biocache.ala.org.au/occurrences/search?q=data_resource_uid%3Adr1411&qualityProfile=ALA&disableAllQualityFilters=true&fq=coordinateUncertaintyInMeters%3A%5B10001+TO+*%5D#tab_recordsView
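The "just under 30km" figure can be sanity-checked with a rough calculation. The sketch below assumes the reported uncertainty is the diagonal of the 0.2 by 0.2 degree cell; that is an assumption, not confirmed iNaturalist behaviour. The function name `cell_diagonal_km`, the constant `KM_PER_DEGREE_LAT`, and the sample latitudes are all hypothetical and chosen only for illustration.

```python
import math

# Approximate length of one degree of latitude in km (spherical approximation).
KM_PER_DEGREE_LAT = 111.32


def cell_diagonal_km(latitude_deg: float, cell_size_deg: float = 0.2) -> float:
    """Approximate diagonal of a square lat/lon cell centred at latitude_deg.

    Assumption: the uncertainty is taken to be the cell diagonal. This is a
    guess at how the value is derived, not a documented rule.
    """
    height_km = cell_size_deg * KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    width_km = height_km * math.cos(math.radians(latitude_deg))
    return math.hypot(height_km, width_km)


# Sample latitudes spanning roughly the north-south extent of Australia.
for lat in (-12, -20, -28, -35, -43):
    print(f"{lat:>4} deg: {cell_diagonal_km(lat):.1f} km")
```

Under these assumptions a cell at about 35 degrees south has a diagonal of roughly 28.8km, close to the 28.7km reported above. The value grows towards the equator, so the diagonal would depend on where the sighting was made.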

I would suggest raising the coordinateUncertaintyInMeters threshold in the General data profile to 30km. In a sample of ~1300 records, 95% had coordinate uncertainty below 30km, so this value would cover the majority of these records. It's not immediately clear why some iNat records have values higher than this (some in my sample went to 200,000km). However, on the assumption that records that opt in to Geoprivacy are under 30km, it should resolve 100% of the cases where Geoprivacy-related records are hidden by the default profile.
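To compare candidate thresholds, the search link above can be rebuilt with a different lower bound. This is a minimal sketch: the query parameters and the `dr1411` data resource ID are copied from that link, while the helper name `excluded_records_url` is hypothetical, and the `threshold_m + 1` lower bound assumes whole-metre values, as in the original `[10001 TO *]` range.

```python
from urllib.parse import urlencode

BASE_URL = "https://biocache.ala.org.au/occurrences/search"


def excluded_records_url(threshold_m: int, data_resource_uid: str = "dr1411") -> str:
    """Build a search URL for records whose uncertainty exceeds threshold_m.

    The parameters mirror the link in the issue text; quality filters are
    disabled so that the otherwise excluded records show up in the results.
    """
    params = {
        "q": f"data_resource_uid:{data_resource_uid}",
        "qualityProfile": "ALA",
        "disableAllQualityFilters": "true",
        # Open-ended range: everything strictly above the threshold.
        "fq": f"coordinateUncertaintyInMeters:[{threshold_m + 1} TO *]",
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(excluded_records_url(10_000))  # current threshold
print(excluded_records_url(30_000))  # proposed threshold
```

Opening both links and comparing the record counts would show how many records a 30km threshold still excludes.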

Raising this threshold would of course have implications in other areas, so I will leave that discussion to others.

This question was raised in https://support.ehelp.edu.au/a/tickets/114050

peggynewman commented 1 year ago

I've decided not to pursue this. It's sensitive species that are obfuscated in iNaturalist to a 0.2 degree grid, and we want users to understand that those records are missing because something has happened to them, and to have to click through 'Excluded due to spatial uncertainty'.