ropensci / CoordinateCleaner

Automated flagging of common spatial and temporal errors in biological and palaeontological collection data, for the use in conservation, ecology and palaeontology.
https://docs.ropensci.org/CoordinateCleaner/

Number of records detected: difference between the number for each test and the summary #62

Open luroy opened 2 years ago

luroy commented 2 years ago

Dear all,

I’ve just started using CoordinateCleaner to flag and remove problematic records from the GBIF package, so I apologize in advance if my question is due to my unfamiliarity with this package.

From a given input (GBIF records of species occurrence), I ran the clean_coordinates function as follows:

flags <- clean_coordinates(gbif_data_df,
                           lon = "decimalLongitude",
                           lat = "decimalLatitude",
                           countries = "countryCode2",
                           species = "species",
                           tests = c("capitals", "centroids", "duplicated", "equal",
                                     "gbif", "institutions", "outliers", "zeros"))

This produced the following output in my R console:

Testing coordinate validity
Flagged 0 records.
Testing equal lat/lon
Flagged 0 records.
Testing zero coordinates
Flagged 0 records.
Testing country capitals
Flagged 10 records.
Testing country centroids
Flagged 3 records.
Testing geographic outliers
Flagged 44 records.
Testing GBIF headquarters, flagging records around Copenhagen
Flagged 0 records.
Testing biodiversity institutions
Flagged 0 records.
Flagged 10 of 12187 records, EQ = 0

summary(flags)
    .val     .equ     .zer     .cap     .cen     .otl     .gbf    .inst .summary
       0        0        0       10        3        0        0        0       10

The 3 records flagged by the "centroids" test seem to be included among the 10 records flagged by the "capitals" test. However, I don't understand why the 44 records flagged by the "geographic outliers" test appear neither in the summary (where .otl is 0) nor in the "flags" object.

Thank you for your attention to this matter,

Léa