ropensci / rgbif

Interface to the Global Biodiversity Information Facility API
https://docs.ropensci.org/rgbif

occ_download_get() downloads empty file even though occ_download_meta() shows non null number of records #742

Open damianobaldan opened 4 months ago

damianobaldan commented 4 months ago

Command used to query:

occ_download(
  pred_within(polygon_med),
  pred_not(pred_in("taxonKey", c(212, 6))), # remove birds and plants
  pred_gte("year", 2000),
  pred("hasGeospatialIssue", FALSE),
  pred("hasCoordinate", TRUE),
  pred("occurrenceStatus","PRESENT")
)

Querying the download metadata shows a non-zero number of records:

occ_download_meta('0020258-240626123714530')

<<gbif download metadata>>
  Status: SUCCEEDED
  DOI: 10.15468/dl.w3t8rf
  Format: DWCA
  Download key: 0020258-240626123714530
  Created: 2024-07-11T06:55:46.601+00:00
  Modified: 2024-07-11T07:18:52.487+00:00
  Download link: https://api.gbif.org/v1/occurrence/download/request/0020258-240626123714530.zip
  Total records: 9535624

However, downloading and importing the dataset returns an empty data.frame:

d <- occ_download_get('0020258-240626123714530') %>%
  occ_download_import()

nrow(d)
[1] 0
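
Before importing, it may help to check whether the archive itself is empty or whether the problem arises at import time. The sketch below is only a rough diagnostic: `inspect_download_zip()` is a hypothetical helper (not part of rgbif), and it assumes the archive was saved in the working directory under its download key.

```r
# Hypothetical helper (not part of rgbif): list the files inside a downloaded
# GBIF archive, largest first, without importing anything.
inspect_download_zip <- function(zip_path) {
  stopifnot(file.exists(zip_path))
  # utils::unzip(list = TRUE) returns a data frame with Name, Length, Date
  listing <- utils::unzip(zip_path, list = TRUE)
  listing[order(-listing$Length), c("Name", "Length")]
}

# Assumption: the zip sits in the working directory and is named after the key.
zip_path <- "0020258-240626123714530.zip"
if (file.exists(zip_path)) {
  print(file.size(zip_path))              # size of the archive on disk, in bytes
  print(inspect_download_zip(zip_path))   # uncompressed size of each member
}
```

If the occurrence file inside the archive contains little more than a header line, the empty result comes from the download itself rather than from `occ_download_import()`.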

My session info:

R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=Italian_Italy.utf8 LC_CTYPE=Italian_Italy.utf8 LC_MONETARY=Italian_Italy.utf8 LC_NUMERIC=C LC_TIME=Italian_Italy.utf8

time zone: Europe/Rome
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] worrms_0.4.3 sf_1.0-14 data.table_1.14.8 finch_0.4.0 lubridate_1.9.2 forcats_1.0.0 stringr_1.5.0 dplyr_1.1.2 purrr_1.0.1 readr_2.1.4 tidyr_1.3.0
[12] tibble_3.2.1 ggplot2_3.5.0 tidyverse_2.0.0 obistools_0.1.0 rgbif_3.7.7 robis_2.11.3

loaded via a namespace (and not attached):
[1] DBI_1.2.2 remotes_2.4.2.1 rlang_1.1.1 magrittr_2.0.3 e1071_1.7-13 compiler_4.3.1 callr_3.7.3 vctrs_0.6.3 profvis_0.3.8 httpcode_0.3.0 pkgconfig_2.0.3
[12] crayon_1.5.2 fastmap_1.1.1 mapedit_0.6.0 ellipsis_0.3.2 utf8_1.2.3 promises_1.2.0.1 rmarkdown_2.23 sessioninfo_1.2.2 tzdb_0.4.0 ps_1.7.5 xfun_0.39
[23] cachem_1.0.8 jsonlite_1.8.7 later_1.3.1 uuid_1.1-0 data.tree_1.1.0 prettyunits_1.1.1 R6_2.5.1 stringi_1.7.12 hoardr_0.5.4 pkgload_1.3.2.1 Rcpp_1.0.10
[34] knitr_1.43 usethis_2.2.2 triebeard_0.4.1 httpuv_1.6.11 timechange_0.2.0 tidyselect_1.2.1 rstudioapi_0.15.0 yaml_2.3.7 miniUI_0.1.1.1 curl_5.2.0 processx_3.8.2
[45] pkgbuild_1.4.2 plyr_1.8.8 shiny_1.7.4.1 withr_3.0.0 evaluate_0.21 urlchecker_1.0.1 units_0.8-2 proxy_0.4-27 xml2_1.3.5 pillar_1.9.0 whisker_0.4.1
[56] KernSmooth_2.23-21 generics_0.1.3 hms_1.1.3 munsell_0.5.0 scales_1.3.0 xtable_1.8-4 class_7.3-22 glue_1.6.2 jsonld_2.2 lazyeval_0.2.2 tools_4.3.1
[67] fs_1.6.3 grid_4.3.1 crosstalk_1.2.0 urltools_1.7.3 jqr_1.3.3 devtools_2.4.5 colorspace_2.1-0 cli_3.6.1 emld_0.5.1 rappdirs_0.3.3 fansi_1.0.4
[78] V8_4.4.2 gtable_0.3.4 EML_2.0.6.1 oai_0.4.0 digest_0.6.33 classInt_0.4-9 crul_1.4.0 htmlwidgets_1.6.2 memoise_2.0.1 htmltools_0.5.5 lifecycle_1.0.4
[89] leaflet_2.1.2 httr_1.4.7 mime_0.12

jhnwllr commented 4 months ago

@damianobaldan There might be a pre-existing bug in downloads that use pred_not. If it is possible to do the download without pred_not, I would do that for the time being. https://github.com/gbif/portal-feedback/issues/5347
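
One possible way to follow this suggestion is to drop the pred_not(...) line from the request and remove the unwanted taxa locally after import. The sketch below is only an illustration. `drop_birds_and_plants()` is a hypothetical helper, not an rgbif function. It assumes the imported table carries `kingdomKey` and `classKey` columns, and that key 212 is the class Aves and key 6 the kingdom Plantae, as the comment in the original query suggests. The toy data frame stands in for a real import. Without the exclusion, the download will contain more records than the 9535624 reported above.

```r
# Hypothetical helper: drop birds (classKey 212) and plants (kingdomKey 6)
# from an imported occurrence table instead of using pred_not() in the request.
drop_birds_and_plants <- function(occ) {
  # Treat missing keys as "not a bird" / "not a plant" so those rows are kept
  is_bird  <- !is.na(occ$classKey)   & occ$classKey   == 212
  is_plant <- !is.na(occ$kingdomKey) & occ$kingdomKey == 6
  occ[!(is_bird | is_plant), , drop = FALSE]
}

# Toy stand-in for the result of occ_download_import()
toy <- data.frame(
  gbifID     = 1:4,
  kingdomKey = c(1, 6, 1, 1),
  classKey   = c(212, 220, 359, NA)
)

kept <- drop_birds_and_plants(toy)
print(kept$gbifID)
```

With a real download, the same call would be applied to the imported table, for example `d <- drop_birds_and_plants(d)`.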