ropensci / CoordinateCleaner

Automated flagging of common spatial and temporal errors in biological and palaeontological collection data, for the use in conservation, ecology and palaeontology.
https://docs.ropensci.org/CoordinateCleaner/

Error in replacing data #35

Closed JuliaNiemeyer closed 4 years ago

JuliaNiemeyer commented 4 years ago

This only happens to one of the species I am working with (so far). Apparently there is nothing wrong with the input data.

This is the code I am using:

geo.clean <- clean_coordinates(x = merge_coord, lon = "decimalLongitude", lat = "decimalLatitude", species = "species", value = "clean")

And the error:

Flagged 120 records.
Testing geographic outliers
Flagged 121 records.
Error in $<-.data.frame(*tmp*, "otl", value = c(TRUE, TRUE, TRUE, : replacement has 552 rows, data has 542

Don't know how to proceed.

azizka commented 4 years ago

Hi,

hard to tell. Could you provide the example data?

Cheers,

Alex

JuliaNiemeyer commented 4 years ago

Hi Alex, thank you for your reply. You can clone my repo (https://github.com/JuliaNiemeyer/Data_cleaning). I used an exercise from a former teacher as a base, but I'm creating SpeciesLink_Clean.R, where I aim to get occurrence points from SPLink and GBIF as well and clean all the points automatically for all species.

You can find the call to clean_coordinates inside ./SpeciesLink/SpeciesLink_Clean.R, and an example dataset (species_ex.csv) inside the ./data/ folder.

Thank you again. Julia

-- Julia de Niemeyer Caldas

Laboratório de Vertebrados - Depto Ecologia - IB - UFRJ Rede Brasileira de Pesquisas sobre Mudanças Climáticas Globais - Rede Clima/INPE/MCTIC Lattes: http://lattes.cnpq.br/7166082149089652

azizka commented 4 years ago

So, the problem is the filtering step that removes NAs in line 107 of your script. clean_coordinates uses row names to match the results from the different tests.

Your filtering step removes rows but keeps the original row names, so there is a mismatch, which causes this error. This is not ideal. For now, you can fix it either by using the individual cc_* functions or by resetting the row names after you have filtered out the NAs. To do so, add rownames(merge_coord) <- NULL after the filtering step. Hope that helps for now.
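
A minimal sketch of the row-name reset described above, in base R only. The toy data frame stands in for merge_coord and its values are made up for illustration; they are not taken from the data in this issue, and the NA filter shown is an assumption about what the filtering step in the script looks like.

```r
# Hypothetical toy data standing in for merge_coord (values are invented).
merge_coord <- data.frame(
  species = c("a", "a", "a", "a", "a"),
  decimalLongitude = c(-43.2, NA, -44.1, -42.9, NA),
  decimalLatitude = c(-22.9, -23.0, NA, -22.5, -21.8)
)

# Filtering out NAs drops rows but keeps the original row names,
# so the remaining names no longer run from 1 to nrow().
merge_coord <- merge_coord[!is.na(merge_coord$decimalLongitude) &
                             !is.na(merge_coord$decimalLatitude), ]
names_before <- rownames(merge_coord)  # "1" "4"

# Resetting the row names renumbers them consecutively again.
rownames(merge_coord) <- NULL
names_after <- rownames(merge_coord)   # "1" "2"
```

With the row names renumbered, they line up with row positions again, which is what the row-name matching described above relies on.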

JuliaNiemeyer commented 4 years ago

Hello Alex, yes!! Thank you very, very much. As a matter of fact, I realized today that this was the problem, although I hadn't gotten as far as trying to fix it. You really helped me. I was just learning how to make a reprex to send you and only got it done now. I'm so sorry if I gave you too much work, but I didn't know what a reprex was or how to make one. Now I know.

Thank you for everything, my script works fine now (until the next problem hehe). Take care, Julia

azizka commented 4 years ago

No worries, thanks for reporting! The dependencies on the row names should be removed from clean_coordinates. Closing this issue now.