LimaRAF / plantR

An R Package for Managing Species Records from Biological Collections
GNU General Public License v3.0

FormatDwc_CoordinatePrecision!? #64

Closed herisonmedeiros closed 3 years ago

herisonmedeiros commented 3 years ago

Cardiospermum (Sapindaceae) triggered another error in the formatDwc function, both with and without the option "drop = TRUE", which discards columns that are not congruent between the databases:

occs_splink <- rspeciesLink(filename = "Cardiospermum_teste_splink.txt",
                            save = TRUE,
                            basisOfRecord = 'PreservedSpecimen',
                            species = "Cardiospermum")
occs_gbif <- rgbif2(filename = "Cardiospermum_teste_gbif.txt",
                    species = "Cardiospermum",
                    n.records = 110000,
                    force = TRUE,
                    save = TRUE)
occs <- formatDwc(splink_data = occs_splink, gbif_data = occs_gbif,
                  drop = TRUE)
occs <- formatDwc(splink_data = occs_splink, gbif_data = occs_gbif)

Error: Can't combine gbif$coordinatePrecision and speciesLink$coordinatePrecision .
Run rlang::last_error() to see where the error occurred.
In addition: Warning messages:
1: some columns in splink_data do not follow the speciesLink pattern
2: some columns in gbif_data does not follow the gbif pattern!

I then ran the suggested command to see more details, and this appeared:

rlang::last_error()

<error/vctrs_error_incompatible_type>
Can't combine gbif$coordinatePrecision and speciesLink$coordinatePrecision .
Backtrace:

  1. plantR::formatDwc(splink_data = occs_splink, gbif_data = occs_gbif)
  2. dplyr::bind_rows(res_list, .id = "data_source")
  3. vctrs::vec_rbind(!!!dots, .names_to = .id)
  4. vctrs::vec_default_ptype2(...)
  5. vctrs::stop_incompatible_type(...)
  6. vctrs:::stop_incompatible(...)
  7. vctrs:::stop_vctrs(...)

LimaRAF commented 3 years ago

Dear @herisonmedeiros, thanks again for this issue.

The error you mentioned was caused by binding data frames whose columns differ in class (e.g. numeric vs. character). We had a list of variables to be converted before the bind, but the list was not exhaustive. I automated the detection of such columns to make sure they are all converted. The problem should be solved by the latest commit to the development branch of the package.
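For readers hitting the same error, the idea of harmonising column classes before the bind can be sketched roughly as follows. This is not the actual plantR code: `harmonize_classes()` and the toy data frames are hypothetical, the class of each column in the real downloads is an assumption, and base `rbind()` stands in for the `dplyr::bind_rows()` call seen in the backtrace.

```r
# Toy stand-ins for the two downloads (values are made up for illustration)
splink <- data.frame(catalogNumber = c("A1", "A2"),
                     coordinatePrecision = c("0.01", NA),
                     stringsAsFactors = FALSE)
gbif <- data.frame(catalogNumber = c("B1", "B2"),
                   coordinatePrecision = c(0.001, 0.5),
                   stringsAsFactors = FALSE)

# Hypothetical helper: find shared columns whose classes differ between
# the data frames and convert them to character in every data frame
harmonize_classes <- function(df_list) {
  shared <- Reduce(intersect, lapply(df_list, names))
  differs <- vapply(shared, function(col) {
    classes <- vapply(df_list, function(df) class(df[[col]])[1], character(1))
    length(unique(classes)) > 1
  }, logical(1))
  conflicting <- shared[differs]
  if (length(conflicting) == 0) return(df_list)
  lapply(df_list, function(df) {
    df[conflicting] <- lapply(df[conflicting], as.character)
    df
  })
}

res_list <- harmonize_classes(list(speciesLink = splink, gbif = gbif))

# All columns now agree in class, so the row-bind no longer fails
occs_toy <- do.call(rbind, res_list)
str(occs_toy)
```

Converting to character is the lossless choice here; columns that are genuinely numeric can be converted back after the merge if needed.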

I used the code below, which worked as expected on my machine:

# Downloading the data
occs_splink <- rspeciesLink(species = "Cardiospermum",
                            basisOfRecord = 'PreservedSpecimen')
occs_gbif <- rgbif2(species = "Cardiospermum", force = TRUE)
# Combining the data
occs <- formatDwc(splink_data = occs_splink, gbif_data = occs_gbif,
                  drop = TRUE)
occs <- formatDwc(splink_data = occs_splink, gbif_data = occs_gbif)

Note that for some reason GBIF did not return one important column: "typeStatus". I have never seen this happen before, and it may be related to this specific taxon. In the merged dataset (occs), the column is there, but for all GBIF records the information is NA.
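A quick way to confirm this kind of gap is to look at the share of missing values per source. The sketch below uses a made-up table, not real output; it assumes the merged data carries a "data_source" column (the `.id` used in the bind shown in the backtrace) with the labels "gbif" and "speciesLink".

```r
# Toy merged table; in practice this would be the output of formatDwc()
occs_toy <- data.frame(
  data_source = c("gbif", "gbif", "speciesLink"),
  typeStatus = c(NA, NA, "Holotype"),
  stringsAsFactors = FALSE
)

# Share of missing typeStatus values per source (1 means all missing)
na_share <- tapply(is.na(occs_toy$typeStatus), occs_toy$data_source, mean)
na_share
```

Before merging, `"typeStatus" %in% names(occs_gbif)` would show whether the column came back from GBIF at all.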

Please let us know if this solves the error on your machine as well.