ropensci / taxadb

:package: Taxonomic Database
https://docs.ropensci.org/taxadb

Cannot create database #106

Open joelnitta opened 2 years ago

joelnitta commented 2 years ago

td_create() seems to be broken at the moment:

library(taxadb)
td_create("col", overwrite = TRUE)
#> Error: rapi_startup: Failed to open database: Serialization Error: Attempting to read a required field, but field is missing

Created on 2022-07-11 by the reprex package (v2.0.1)

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.0 (2022-04-22)
#>  os       macOS Big Sur/Monterey 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Asia/Tokyo
#>  date     2022-07-11
#>  pandoc   2.17.1.1 @ /usr/local/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  ! package     * version date (UTC) lib source
#>  P arkdb         0.0.15  2022-02-15 [?] CRAN (R 4.2.0)
#>    askpass       1.1     2019-01-13 [2] CRAN (R 4.2.0)
#>    assertthat    0.2.1   2019-03-21 [2] CRAN (R 4.2.0)
#>    bit           4.0.4   2020-08-04 [2] CRAN (R 4.2.0)
#>    bit64         4.0.5   2020-08-30 [2] CRAN (R 4.2.0)
#>    blob          1.2.3   2022-04-10 [2] CRAN (R 4.2.0)
#>    cachem        1.0.6   2021-08-19 [2] CRAN (R 4.2.0)
#>    cli           3.3.0   2022-04-25 [2] CRAN (R 4.2.0)
#>  P contentid     0.0.15  2021-11-29 [?] CRAN (R 4.2.0)
#>    crayon        1.5.1   2022-03-26 [2] CRAN (R 4.2.0)
#>    curl          4.3.2   2021-06-23 [2] CRAN (R 4.2.0)
#>    DBI           1.1.3   2022-06-18 [2] CRAN (R 4.2.0)
#>    dbplyr        2.1.1   2021-04-06 [2] CRAN (R 4.2.0)
#>  P digest        0.6.29  2021-12-01 [?] CRAN (R 4.2.0)
#>    dplyr         1.0.9   2022-04-28 [2] CRAN (R 4.2.0)
#>    duckdb        0.4.0   2022-07-06 [2] local
#>    ellipsis      0.3.2   2021-04-29 [2] CRAN (R 4.2.0)
#>  P evaluate      0.15    2022-02-18 [?] CRAN (R 4.2.0)
#>    fansi         1.0.3   2022-03-24 [2] CRAN (R 4.2.0)
#>  P fastmap       1.1.0   2021-01-25 [?] CRAN (R 4.2.0)
#>  P fs            1.5.2   2021-12-08 [?] CRAN (R 4.2.0)
#>    generics      0.1.2   2022-01-31 [2] CRAN (R 4.2.0)
#>  P glue          1.6.2   2022-02-24 [?] CRAN (R 4.2.0)
#>  P highr         0.9     2021-04-16 [?] CRAN (R 4.2.0)
#>    hms           1.1.1   2021-09-26 [2] CRAN (R 4.2.0)
#>  P htmltools     0.5.2   2021-08-25 [?] CRAN (R 4.2.0)
#>    httr          1.4.3   2022-05-04 [2] CRAN (R 4.2.0)
#>  P jsonlite      1.8.0   2022-02-22 [?] CRAN (R 4.2.0)
#>  P knitr         1.39    2022-04-26 [?] CRAN (R 4.2.0)
#>    lifecycle     1.0.1   2021-09-24 [2] CRAN (R 4.2.0)
#>  P magrittr      2.0.3   2022-03-30 [?] CRAN (R 4.2.0)
#>    memoise       2.0.1   2021-11-26 [2] CRAN (R 4.2.0)
#>    openssl       2.0.2   2022-05-24 [2] CRAN (R 4.2.0)
#>    pillar        1.7.0   2022-02-01 [2] CRAN (R 4.2.0)
#>    pkgconfig     2.0.3   2019-09-22 [2] CRAN (R 4.2.0)
#>    prettyunits   1.1.1   2020-01-24 [2] CRAN (R 4.2.0)
#>    progress      1.2.2   2019-05-16 [2] CRAN (R 4.2.0)
#>    purrr         0.3.4   2020-04-17 [2] CRAN (R 4.2.0)
#>    R.cache       0.15.0  2021-04-30 [2] CRAN (R 4.2.0)
#>    R.methodsS3   1.8.1   2020-08-26 [2] CRAN (R 4.2.0)
#>    R.oo          1.24.0  2020-08-26 [2] CRAN (R 4.2.0)
#>    R.utils       2.11.0  2021-09-26 [2] CRAN (R 4.2.0)
#>  P R6            2.5.1   2021-08-19 [?] CRAN (R 4.2.0)
#>  P rappdirs      0.3.3   2021-01-31 [?] CRAN (R 4.2.0)
#>    Rcpp          1.0.8.3 2022-03-17 [2] CRAN (R 4.2.0)
#>    readr         2.1.2   2022-01-30 [2] CRAN (R 4.2.0)
#>    reprex        2.0.1   2021-08-05 [2] CRAN (R 4.2.0)
#>  P rlang         1.0.3   2022-06-27 [?] CRAN (R 4.2.0)
#>  P rmarkdown     2.14    2022-04-25 [?] CRAN (R 4.2.0)
#>    RSQLite       2.2.14  2022-05-07 [2] CRAN (R 4.2.0)
#>    sessioninfo   1.2.2   2021-12-06 [2] CRAN (R 4.2.0)
#>  P stringi       1.7.6   2021-11-29 [?] CRAN (R 4.2.0)
#>  P stringr       1.4.0   2019-02-10 [?] CRAN (R 4.2.0)
#>    styler        1.7.0   2022-03-13 [2] CRAN (R 4.2.0)
#>  P taxadb      * 0.1.5   2022-07-11 [?] Github (ropensci/taxadb@c26a400)
#>    tibble        3.1.7   2022-05-03 [2] CRAN (R 4.2.0)
#>    tidyselect    1.1.2   2022-02-21 [2] CRAN (R 4.2.0)
#>    tzdb          0.3.0   2022-03-28 [2] CRAN (R 4.2.0)
#>    utf8          1.2.2   2021-07-24 [2] CRAN (R 4.2.0)
#>    vctrs         0.4.1   2022-04-13 [2] CRAN (R 4.2.0)
#>    withr         2.5.0   2022-03-03 [2] CRAN (R 4.2.0)
#>  P xfun          0.31    2022-05-10 [?] CRAN (R 4.2.0)
#>  P yaml          2.3.5   2022-02-21 [?] CRAN (R 4.2.0)
#>
#>  [1] /Users/joelnitta/repos/presentations/botany_2022_taxastand/renv/library/R-4.2/x86_64-apple-darwin17.0
#>  [2] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#>
#>  P ── Loaded and on-disk path mismatch.
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

cboettig commented 2 years ago

Thanks for reporting. To be specific, this problem appears to be related to a change in how overwrite = TRUE interacts with a database that was created by an older version of duckdb and is then opened after duckdb has been upgraded (I think).

If you try this with an empty taxadb dir, or just nuke your taxadb dir first, I think it should work?

taxadb::taxadb_dir() |> fs::dir_delete()
td_create("col", overwrite = TRUE)
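
A less destructive variant of the workaround above is to move the directory aside rather than delete it. The following is only a sketch: `set_aside_dir()` is a hypothetical helper written for this illustration, not part of taxadb, and it is demonstrated on a throwaway temporary directory instead of the real taxadb directory.

```r
# Hypothetical helper (NOT part of taxadb): rename a directory to a
# timestamped backup so the old database files can be restored if needed.
set_aside_dir <- function(path) {
  if (!dir.exists(path)) {
    return(invisible(NULL))  # nothing to move
  }
  backup <- paste0(path, "_backup_", format(Sys.time(), "%Y%m%d%H%M%S"))
  if (!file.rename(path, backup)) {
    stop("Could not rename ", path)
  }
  invisible(backup)
}

# Demonstration on a throwaway directory with a stand-in database file.
demo_dir <- tempfile("taxadb_demo_")
dir.create(demo_dir)
file.create(file.path(demo_dir, "old.duckdb"))
backup <- set_aside_dir(demo_dir)
```

With taxadb, this would presumably be called as `set_aside_dir(taxadb::taxadb_dir())` before re-running `td_create("col", overwrite = TRUE)`; the backup can be deleted once the new database opens cleanly.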

Will have to think of a better workflow for this. In general, upgrades and backward compatibility across duckdb versions are an issue. (duckdb usually warns about this, but I think the latest duckdb release was intended to be backward compatible, though maybe not entirely?)

Longer term, I'd like to switch to a parquet-backed store, which would let us skip the whole import process and stop worrying about backward compatibility between duckdb versions. Ideally we'll be able to do this using direct remote support, so users could even opt out of the download step (though for most use cases that would probably not be the most performant choice). I haven't gotten around to it yet, though.
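
To illustrate what a parquet-backed store could look like, here is a minimal sketch using DBI and duckdb. It is not how taxadb is implemented: the table name `taxa`, the two toy rows, and the temporary file path are all made up for the example. It only shows that duckdb can query a parquet file in place, with no persistent .duckdb database file involved.

```r
library(DBI)

# Illustrative toy data; these rows are placeholders, not real taxadb records.
taxa <- data.frame(
  taxonID = c("EX:1", "EX:2"),
  scientificName = c("Example species a", "Example species b"),
  stringsAsFactors = FALSE
)

# Forward slashes keep the path safe inside a SQL string on Windows too.
parquet_path <- normalizePath(
  tempfile(fileext = ".parquet"), winslash = "/", mustWork = FALSE
)

# Write the table out as parquet from an in-memory duckdb session.
con <- dbConnect(duckdb::duckdb())
dbWriteTable(con, "taxa", taxa)
dbExecute(con, sprintf("COPY taxa TO '%s' (FORMAT PARQUET)", parquet_path))
dbDisconnect(con, shutdown = TRUE)

# A fresh in-memory session queries the parquet file directly, no import step.
con <- dbConnect(duckdb::duckdb())
result <- dbGetQuery(
  con,
  sprintf("SELECT * FROM read_parquet('%s') ORDER BY taxonID", parquet_path)
)
dbDisconnect(con, shutdown = TRUE)
```

Because the parquet file is the only persistent artifact, an upgraded duckdb reads the same file rather than reopening a database written in an older storage format.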