Open jhnwllr opened 3 years ago
Hi John,
thanks for the excellent suggestion. I'll implement this for the next version. Two questions:
Thanks!!
I don't have any opinions about individualCount right now.
My assumption would be that there might be some default values there. GBIF has recently done a good job of trying to clean up that column, since GBIF now has the occurrence_status field: https://www.gbif.org/occurrence/search?taxon_key=4689&occurrence_status=present
What do you suggest as the default name for the column with the uncertainty in meters, since this will be user provided?
I would name the issue or column something like "known_default_coordinate_uncertainty"
There are several known default values for coordinate uncertainty in meters.
301 : Geolocate Default (often a country centroid)
3036 : Geolocate Default
999 : Default found in a few datasets (observations.org)
9999 : Large default
Occurrence counts:
630 353 -- 3036m
401 507 -- 301m
370 553 -- 999m
14 242 -- 9999m
I think CoordinateCleaner could have a function for filtering these known defaults. I would be happy to make a PR for such a function...
https://github.com/gbif/pipelines/issues/417
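To make the proposal concrete, here is a minimal sketch of what flagging these known defaults could look like. It is written in Python for illustration only and is not CoordinateCleaner's API. The function name `flag_known_default_uncertainty` is hypothetical, and the column name `coordinateUncertaintyInMeters` is assumed to follow the Darwin Core term. The default values are the four listed above.

```python
# Known default coordinate uncertainty values (meters) listed in this thread.
KNOWN_DEFAULT_UNCERTAINTIES_M = frozenset({301, 3036, 999, 9999})


def flag_known_default_uncertainty(
    records,
    column="coordinateUncertaintyInMeters",  # assumed Darwin Core column name
    defaults=KNOWN_DEFAULT_UNCERTAINTIES_M,
):
    """Return one boolean per record: True if its uncertainty is a known default.

    Hypothetical helper for illustration. `records` is an iterable of dicts.
    Missing or non-numeric values are never flagged.
    """
    flags = []
    for record in records:
        value = record.get(column)
        try:
            flags.append(value is not None and float(value) in defaults)
        except (TypeError, ValueError):
            # Value could not be read as a number, so it is not a known default.
            flags.append(False)
    return flags


records = [
    {"coordinateUncertaintyInMeters": 301},
    {"coordinateUncertaintyInMeters": 250},
    {"coordinateUncertaintyInMeters": "9999"},
    {},  # no uncertainty recorded
]
flags = flag_known_default_uncertainty(records)
```

The sketch returns flags rather than dropping records, so the caller can decide whether to filter or just report them. The `defaults` parameter allows a user-provided set of values.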