ropensci / rdflib

:package: High level wrapper around the redland package for common rdf applications
https://docs.ropensci.org/rdflib

Cannot open on-disk database #18

Closed noamross closed 6 years ago

noamross commented 6 years ago

This may relate to #15, but I can't seem to initialize an on-disk database:

library(rdflib)
my_rdf <- rdf(storage = "BDB", path = "test.rdfdb", new_db = TRUE)
librdf error - BDB V4.1+ open of 'test.rdfdb/rdflib-sp2o.db' failed - No such file or directory
Session info

``` r
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value
#>  version  R version 3.4.3 (2017-11-30)
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  tz       UTC
#>  date     2018-03-09
#> Packages -----------------------------------------------------------------
#>  package    * version    date       source
#>  backports    1.1.2      2017-12-13 CRAN (R 3.4.3)
#>  base       * 3.4.3      2018-02-08 local
#>  commonmark   1.4        2017-09-01 CRAN (R 3.4.3)
#>  compiler     3.4.3      2018-02-08 local
#>  curl         3.1        2017-12-12 CRAN (R 3.4.3)
#>  datasets   * 3.4.3      2018-02-08 local
#>  devtools     1.13.5     2018-02-18 CRAN (R 3.4.3)
#>  digest       0.6.15     2018-01-28 CRAN (R 3.4.3)
#>  evaluate     0.10.1     2017-06-24 CRAN (R 3.4.3)
#>  graphics   * 3.4.3      2018-02-08 local
#>  grDevices  * 3.4.3      2018-02-08 local
#>  hms          0.4.1      2018-01-24 CRAN (R 3.4.3)
#>  htmltools    0.3.6      2017-04-28 CRAN (R 3.4.3)
#>  jsonld       1.2        2017-04-11 cran (@1.2)
#>  jsonlite     1.5        2017-06-01 CRAN (R 3.4.3)
#>  knitr        1.19       2018-01-29 CRAN (R 3.4.3)
#>  magrittr     1.5        2014-11-22 cran (@1.5)
#>  memoise      1.1.0      2017-04-21 CRAN (R 3.4.3)
#>  methods    * 3.4.3      2018-02-08 local
#>  pillar       1.1.0      2018-01-14 CRAN (R 3.4.3)
#>  pkgconfig    2.0.1      2017-03-21 CRAN (R 3.4.3)
#>  R6           2.2.2      2017-06-17 CRAN (R 3.4.3)
#>  Rcpp         0.12.15    2018-01-20 CRAN (R 3.4.3)
#>  rdflib     * 0.1.0      2018-03-09 Github (ropensci/rdflib@99febba)
#>  readr        1.1.1      2017-05-16 CRAN (R 3.4.3)
#>  redland      1.0.17-9   2016-12-15 cran (@1.0.17-)
#>  rlang        0.1.6      2017-12-21 CRAN (R 3.4.3)
#>  rmarkdown    1.8        2017-11-17 CRAN (R 3.4.3)
#>  roxygen2     6.0.1      2017-02-06 CRAN (R 3.4.3)
#>  rprojroot    1.3-2      2018-01-03 CRAN (R 3.4.3)
#>  stats      * 3.4.3      2018-02-08 local
#>  stringi      1.1.6      2017-11-17 CRAN (R 3.4.3)
#>  stringr      1.2.0      2017-02-18 CRAN (R 3.4.3)
#>  tibble       1.4.2      2018-01-22 CRAN (R 3.4.3)
#>  tools        3.4.3      2018-02-08 local
#>  utils      * 3.4.3      2018-02-08 local
#>  V8           1.5        2017-04-25 CRAN (R 3.4.3)
#>  withr        2.1.1.9000 2018-03-01 Github (r-lib/withr@5d05571)
#>  xml2         1.2.0      2018-01-24 CRAN (R 3.4.3)
#>  yaml         2.1.16     2017-12-12 CRAN (R 3.4.3)
```

I have the System Requirements installed:

sudo apt-get install librdf0 librdf0-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
librdf0 is already the newest version (1.0.17-1.1).
librdf0-dev is already the newest version (1.0.17-1.1).
0 upgraded, 0 newly installed, 0 to remove and 38 not upgraded.

I note I get a similar error on OSX, where I have redland installed via brew:

library(rdflib) #devtools::install_github('ropensci/rdflib')
my_rdf <- rdf(storage = "BDB", path = "ttest", new_db = TRUE)
#>Warning message:
#>    In rdf(storage = "BDB", path = "ttest", new_db = TRUE) :
#>    BDB driver not found. Falling back on in-memory storage

cboettig commented 6 years ago

I think you need to do:

brew install berkeley-db

then rebuild redland from source, and then it should work.

On Ubuntu/Debian it's libdb-dev.
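Before rebuilding redland, it may help to confirm that the Berkeley DB development files are actually present. The following is only a rough sketch: the function name `check_bdb` is made up for illustration, and it assumes either dpkg (Debian/Ubuntu) or Homebrew (macOS) is the package manager in use. Other systems are not covered.

```shell
# Hypothetical helper: report whether the Berkeley DB package appears installed.
# Assumes dpkg (Debian/Ubuntu) or Homebrew (macOS); anything else is "unknown".
check_bdb() {
  if command -v dpkg >/dev/null 2>&1; then
    if dpkg -s libdb-dev >/dev/null 2>&1; then
      echo "installed"
    else
      echo "missing: sudo apt-get install libdb-dev"
    fi
  elif command -v brew >/dev/null 2>&1; then
    if brew list berkeley-db >/dev/null 2>&1; then
      echo "installed"
    else
      echo "missing: brew install berkeley-db"
    fi
  else
    echo "unknown package manager"
  fi
}

check_bdb
```

Even when this reports "installed", redland still needs to be rebuilt from source afterwards so that it picks up the BDB driver.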

See details in ?rdf and let me know if they are any help. I want to add a vignette or other more helpful docs on the different storage options but haven't got to it yet. Quick word of warning: performance isn't great once you reach the order of a few million triples, even on disk; the vignette examples using nycflights13 make a good benchmark case. I'm not sure if that's to be expected; I believe it would be better with the Virtuoso backend.

Not sure why your Linux example is throwing an error instead of falling back on the in-memory storage...
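One thing that might be worth ruling out, going only by the error text: the message names 'test.rdfdb/rdflib-sp2o.db', which suggests that `path` is treated as a directory and the database files are created inside it. If that reading is right (it is an assumption, not something confirmed here), the directory has to exist before the `rdf()` call. A minimal sketch:

```shell
# Assumption: rdf(path = ...) expects an existing, writable directory.
db_dir="test.rdfdb"

# Create the directory (no error if it already exists)
mkdir -p "$db_dir"

# Confirm it is there and writable before retrying the rdf() call in R
if [ -d "$db_dir" ] && [ -w "$db_dir" ]; then
  echo "ready: $db_dir"
fi
```

After that, retrying `rdf(storage = "BDB", path = "test.rdfdb", new_db = TRUE)` from the same working directory would show whether the missing directory was the cause.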

cboettig commented 6 years ago

Vignette on storage backends coming soon. It will now cover SQLite, Postgres, and Virtuoso, which might be preferable to BDB anyway (or at least more recognizable).