juanmayorgahenao closed this issue 1 year ago
Thanks @cboettig. I'm struggling to make downloaded tables available offline.
For example, after running `rfishbase::fb_import(tables = "estimate")`, I get this message:

```
<duckdb_connection 15820 driver=<duckdb_driver c9030 dbdir=':memory:' read_only=FALSE>>
```

The `dbdir=':memory:'` part makes me suspect the data is only stored in memory. I could be wrong, of course.
If I then disconnect from the internet and run `rfishbase::estimate("Sphyrna mokarran")`, I get the error message:
```
Error in rbind(deparse.level, ...) :
  numbers of columns of arguments do not match
In addition: Warning messages:
1: Error in curl::curl_fetch_memory(file, handle): Could not resolve host: hash-archive.org
2: Error in curl::curl_fetch_memory(file, handle): Could not resolve host: hash-archive.carlboettiger.info
3: Error in curl::curl_fetch_memory(file, handle): Could not resolve host: hash-archive.org
4: Error in curl::curl_fetch_memory(file, handle): Could not resolve host: hash-archive.carlboettiger.info
5: In curl::curl_fetch_memory(url, handle = handle) : Could not resolve host: archive.softwareheritage.org
6: In curl::curl_fetch_memory(url, handle = handle) : Could not resolve host: cn.dataone.org
```
I also inspected the output of `rfishbase::db_dir()`, and it seems to point to the right place: `Library/Application Support/org.R-project.R/R/rfishbase`.

Any thoughts on where the issue might be?
Thank you
Thanks for the report.
The first part,

```
<duckdb_connection 15820 driver=<duckdb_driver c9030 dbdir=':memory:' read_only=FALSE>>
```

is expected. `rfishbase` uses `duckdb` only to read the static parquet files that `fb_import` has downloaded locally to your computer. `duckdb` should have no need for additional on-disk database storage, because it can use the parquet files themselves as the on-disk database.
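To illustrate the pattern described above, here is a minimal sketch of how duckdb can query a parquet file directly from an in-memory connection (the file name `estimate.parquet` is just an illustrative placeholder, not the actual file name `fb_import` uses):

```r
library(DBI)
library(duckdb)

# ':memory:' is the default dbdir -- only catalog metadata lives in memory;
# the table data itself is read straight from the parquet file on disk.
con <- dbConnect(duckdb())
head_rows <- dbGetQuery(con, "SELECT * FROM 'estimate.parquet' LIMIT 5")
dbDisconnect(con, shutdown = TRUE)
```

This is why seeing `dbdir=':memory:'` in the connection object does not mean the downloaded data is memory-only.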
However, after you disconnect from the internet, it seems that for some reason it fails to find the local copy, and so goes looking for the copy on the internet. We'll have to debug why that is failing for you. Can you first make sure you have the most recent versions of `contentid` and `rfishbase`?
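For example, something like the following would update both packages from CRAN and confirm what is installed (use `remotes::install_github()` instead if you want the development versions):

```r
# Update both packages, then report the installed versions
install.packages(c("contentid", "rfishbase"))
packageVersion("contentid")
packageVersion("rfishbase")
```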
`fb_import` uses the provenance log to determine the identifier for a particular table; e.g., here is the entry for the most recent version of the estimate table: https://github.com/ropensci/rfishbase/blob/6ebd80e92a93366ce6b159eadd94c5e47f06d31e/inst/prov/fb.prov#L653-L660
See if you can resolve that id directly offline as well as online:
```r
path <- contentid::resolve("hash://sha256/7f258428dadc8031f5e8111ab088d4f3b00130b1985b318153a98d2f7cdf2b66",
                           store = TRUE, dir = rfishbase::db_dir())
path
```
If this succeeds, you should get back a path that points into `rfishbase::db_dir()`, with the file named by its sha hash (and no file extension). If it fails offline, then `path` will be `NA`. Can you try that and let me know?
Thanks for the help and sorry for the trouble.
Thanks @cboettig. This succeeds online with output:

```
"/Users/marinedatascience/Library/Application Support/org.R-project.R/R/rfishbase/sha256/7f/25/7f258428dadc8031f5e8111ab088d4f3b00130b1985b318153a98d2f7cdf2b66"
```
but fails offline with this error message:

```
Warning: Error in curl::curl_fetch_memory(file, handle): LibreSSL SSL_read: error:02FFF03C:system library:func(4095):Operation timed out, errno 60
Warning: Error in curl::curl_fetch_memory(file, handle): LibreSSL SSL_read: error:02FFF03C:system library:func(4095):Operation timed out, errno 60
Warning: Error in curl::curl_fetch_memory(file, handle): Could not resolve host: hash-archive.org
Warning: Error in curl::curl_fetch_memory(file, handle): Could not resolve host: hash-archive.carlboettiger.info
Warning in curl::curl_fetch_memory(url, handle = handle) : Could not resolve host: archive.softwareheritage.org
Warning in curl::curl_fetch_memory(url, handle = handle) : Could not resolve host: cn.dataone.org
Error in rbind(deparse.level, ...) : numbers of columns of arguments do not match
```
Thank you for your help!
I'm working with a fresh GitHub install of both packages.
Thanks, weird, I still can't reproduce, but at least we have this down to a more minimal example. Can you give me the full output of `sessionInfo()` after producing this error?
(The warnings are kind of expected, in that you'd expect a curl timeout for remote sources, but `contentid` should not even try the remote sources, since it should find a local copy first. This suggests it is not finding the local copy, even though the file is showing up in your local directory...)
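As one more sanity check on whether the local copy is really where `contentid` expects it, you could list the contents of the local store with base R (just a diagnostic sketch; the `sha256/7f/25/...` layout shown earlier in this thread is what you'd expect to see):

```r
# Show every file under the rfishbase content store, relative to its root
list.files(rfishbase::db_dir(), recursive = TRUE)
```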
This should be a bit more stable with the latest release, which simplifies some of the logic. See the updated notes in the README. Apologies for the trouble earlier!
Session Info

```r
```

Hi - I have a brief clarifying question: after using `fb_import`, are downloaded tables available for use in fresh R sessions without an internet connection? Thank you