ropensci / CoordinateCleaner

Automated flagging of common spatial and temporal errors in biological and palaeontological collection data, for the use in conservation, ecology and palaeontology.
https://docs.ropensci.org/CoordinateCleaner/

Error in clean_coordinates functions #43

Closed KunalArekar closed 4 years ago

KunalArekar commented 4 years ago

Hi Alex, I am using the following code to clean GBIF data. It follows the tutorial (https://ropensci.github.io/CoordinateCleaner/articles/Cleaning_GBIF_data_with_CoordinateCleaner.html):

library(devtools)
library(dplyr)
library(ggplot2)
library(rgbif)
library(sp)
library(countrycode)
library(CoordinateCleaner)
dat <- occ_search(scientificName = "Macaca radiata", limit = 5000,
                  return = "data", hasCoordinate = T)

dat <- dat %>%
  dplyr::select(species, decimalLongitude, decimalLatitude, countryCode, individualCount,
                gbifID, family, taxonRank, coordinateUncertaintyInMeters, year,
                basisOfRecord, institutionCode, datasetName)

dat <- dat %>%
  filter(!is.na(decimalLongitude)) %>%
  filter(!is.na(decimalLatitude))

wm <- borders("world", colour="gray50", fill="gray50")
ggplot()+ coord_fixed()+ wm +
  geom_point(data = dat, aes(x = decimalLongitude, y = decimalLatitude),
             colour = "darkred", size = 0.5)+
  theme_bw()

dat$countryCode <- countrycode(dat$countryCode, origin = 'iso2c', destination = 'iso3c')

dat <- data.frame(dat)
flags <- clean_coordinates(x = dat, lon = "decimalLongitude", lat = "decimalLatitude",
                           countries = "countryCode",
                           species = "species",
                           tests = c("capitals", "centroids", "equal","gbif", "institutions",
                                     "zeros"))

After running the last call (clean_coordinates, above), I get the following error:

"Testing country identity
x[i, ] is invalid
Error in RGEOSBinTopoFunc(spgeom1, spgeom2, byid, id, drop_lower_td, unaryUnion_if_byid_false, :
  TopologyException: Input geom 0 is invalid: Ring Self-intersection at or near point 78.719726559999998 31.887646480000001 at 78.719726559999998 31.887646480000001"

Can you please help me resolve this?

Thanks,
Kunal

azizka commented 4 years ago

Hi Kunal,

Thanks for reporting. This was due to invalid polygons resulting from internal cropping in cc_coun. It should be fixed as of version 2.0-17. This is working for me now:

library(devtools)
library(dplyr)
library(ggplot2)
library(rgbif)
library(sp)
library(countrycode)
library(CoordinateCleaner)
dat <- occ_search(scientificName = "Macaca radiata", limit = 5000,
                  return = "data", hasCoordinate = TRUE)$data

dat <- dat %>%
  dplyr::select(species, decimalLongitude, decimalLatitude, countryCode, individualCount,
                gbifID, family, taxonRank, coordinateUncertaintyInMeters, year,
                basisOfRecord, institutionCode, datasetName)

dat <- dat %>%
  filter(!is.na(decimalLongitude)) %>%
  filter(!is.na(decimalLatitude))

wm <- borders("world", colour="gray50", fill="gray50")
ggplot()+ coord_fixed()+ wm +
  geom_point(data = dat, aes(x = decimalLongitude, y = decimalLatitude),
             colour = "darkred", size = 0.5)+
  theme_bw()

dat$countryCode <- countrycode(dat$countryCode, origin = 'iso2c', destination = 'iso3c')

dat <- data.frame(dat)
flags <- clean_coordinates(x = dat, lon = "decimalLongitude", lat = "decimalLatitude",
                           countries = "countryCode",
                           species = "species",
                           tests = c("capitals", "centroids", "equal","gbif", "institutions",
                                     "zeros"))
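
If the error persists, it may be worth confirming that the installed CoordinateCleaner is at least version 2.0-17. The following is only a minimal sketch: `has_fix()` is a hypothetical helper written for this example and is not part of the package.

```r
# Minimal sketch: check whether the installed CoordinateCleaner is at least 2.0-17.
# `has_fix()` is a hypothetical helper for illustration, not a package function.
has_fix <- function(installed, required = "2.0-17") {
  # package_version() treats "-" and "." as equivalent separators,
  # so "2.0-17" and "2.0.17" compare as the same version.
  package_version(installed) >= package_version(required)
}

if (requireNamespace("CoordinateCleaner", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("CoordinateCleaner"))
  if (has_fix(installed)) {
    message("CoordinateCleaner ", installed, " should include the fix.")
  } else {
    message("CoordinateCleaner ", installed, " predates 2.0-17; consider updating.")
  }
} else {
  message("CoordinateCleaner is not installed.")
}
```

If the installed version is older, reinstalling the package and restarting the R session before re-running the code above should pick up the newer version.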

Please let me know if it is working for you.
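
For reference, the "Ring Self-intersection" part of the message is GEOS rejecting a polygon whose boundary crosses itself. The sketch below only illustrates that kind of invalid geometry on a toy polygon. It is not the fix applied inside cc_coun, and it assumes the sf package is installed (the code in this thread uses sp and rgeos instead).

```r
library(sf)  # assumption: sf is installed; the thread itself relies on sp/rgeos

# A "bowtie" ring: two of its edges cross at (0.5, 0.5), so the ring self-intersects.
bowtie <- st_polygon(list(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0))))
geom <- st_sfc(bowtie)

st_is_valid(geom)                 # validity flag for the bowtie
st_is_valid(geom, reason = TRUE)  # GEOS explanation of what is wrong

# st_make_valid() rebuilds the geometry so that topological operations accept it.
repaired <- st_make_valid(geom)
st_is_valid(repaired)
```

Checking validity before an intersection or cropping step is one way to see which polygon triggers a TopologyException like the one reported here.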

Cheers,

Alex

KunalArekar commented 4 years ago

Hi Alex

Thank you for letting me know. I will try this code; hopefully it won't give an error this time.

Cheers!

Kunal Arekar
Prof. Karanth's lab
CES, IISc Bangalore

