Open amoeba opened 6 years ago
That seems cool! Would that depend on the reliability/persistence of 'contentUrl' or 'contentUrl' + 'fileName'?
Would there maybe be a way to generate a .bib as well, to suggest a citation?
👏
I think it might be more robust to have a function that just extracts the metadata and returns an R object which contains the download URLs, e.g. something like:
x <- import_spice()
read_csv(x$files[[1]])
(Some examples of schema.org Dataset contentUrls do not contain direct links to download a data file, but rather point to a web page that has links).
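A minimal sketch of the metadata-first idea above, assuming the dataspice.json has already been parsed into a nested list (for example with jsonlite::fromJSON(path, simplifyVector = FALSE)) and that files are listed under a distribution field whose entries carry fileName and contentUrl. The helper name spice_files() and the example URLs are hypothetical, not part of dataspice.

```r
# Hypothetical helper: collect download URLs from parsed Dataset metadata
# without fetching any data. `meta` is assumed to be a nested list.
spice_files <- function(meta) {
  dist <- meta[["distribution"]]
  if (is.null(dist)) dist <- list()

  # Return a field as a string, or NA when the entry does not have it
  field <- function(d, key) {
    if (is.null(d[[key]])) NA_character_ else as.character(d[[key]])
  }

  urls <- vapply(dist, field, character(1), key = "contentUrl")
  names(urls) <- vapply(dist, field, character(1), key = "fileName")

  # Keep only entries that actually declare a contentUrl
  list(name = meta[["name"]], files = as.list(urls[!is.na(urls)]))
}

# Made-up metadata standing in for a parsed dataspice.json
meta <- list(
  name = "Example dataset",
  distribution = list(
    list(fileName = "a.csv", contentUrl = "https://example.org/a.csv"),
    list(fileName = "b.csv")  # no contentUrl, so it is skipped
  )
)

x <- spice_files(meta)
x$files[[1]]
```

Nothing is downloaded here; the user can then decide which of x$files to read. A contentUrl that points to a landing page rather than a file would still need separate handling.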
Could potentially make this behavior part of a read_spice() function; i.e. read_spice could work locally on a dataspice.json object or could extract dataspice.json from HTML content on the web.
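One way the HTML half of read_spice() might look, as a rough sketch: it assumes the page embeds its metadata in a script tag of type application/ld+json, and it only extracts the raw JSON string. The helper name extract_jsonld() and the sample HTML are hypothetical.

```r
# Hypothetical helper: pull the body of the first JSON-LD <script> tag
# out of HTML given as a character vector of lines (or a single string).
extract_jsonld <- function(html) {
  html <- paste(html, collapse = "\n")
  # (?is): case-insensitive, and "." also matches newlines
  pattern <- "(?is)<script[^>]*type=[\"']application/ld\\+json[\"'][^>]*>(.*?)</script>"
  m <- regmatches(html, regexec(pattern, html, perl = TRUE))[[1]]
  if (length(m) < 2) return(NA_character_)  # no JSON-LD block found
  trimws(m[2])
}

# Made-up page standing in for a published dataspice site
html <- c(
  "<html><head>",
  "<script type=\"application/ld+json\">",
  "{\"@type\": \"Dataset\", \"name\": \"Example\"}",
  "</script></head></html>"
)

json <- extract_jsonld(html)
json
```

The resulting string could then be handed to a JSON parser such as jsonlite::fromJSON(json, simplifyVector = FALSE), so the local-file and web cases end up with the same kind of object. A regular expression is a shortcut here; a real implementation would probably use an HTML parser.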
An R object could also contain the citation (perhaps as an R bibentry object, which R can already turn into either BibTeX or a text-based citation), i.e. simply x$citation; or we could have a methods-y interface like citation(x).
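A small sketch of the citation idea using base R's utils::bibentry(), which can be rendered as BibTeX or as plain text. The object layout (a list with a citation element) and all field values are made up for illustration; a real object would fill them from dataspice.json.

```r
# Hypothetical metadata object carrying a citation
x <- list(
  citation = utils::bibentry(
    bibtype = "Misc",
    title   = "Example dataset",
    author  = utils::person("Jane", "Doe"),
    year    = "2018",
    url     = "https://example.org/dataset"
  )
)

bib <- utils::toBibtex(x$citation)         # BibTeX lines, suitable for a .bib file
txt <- format(x$citation, style = "text")  # plain-text citation

# Uncomment to write the suggested citation to disk:
# writeLines(bib, "dataset.bib")
print(bib)
cat(txt, "\n")
```

Note that utils::citation() takes a package name and is not an S3 generic, so a citation(x) interface would need its own generic or a differently named accessor.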
@khondula wrote:
That seems cool! Would that depend on the reliability/persistence of 'contentUrl' or 'contentUrl' + 'fileName'?
Yes, I see it as a huge need to resolve this stuff soon. @cboettig's idea below helps alleviate that: fetch only the metadata at first, then give the user a way to fetch some or all of the data.
returns an R object
Ooh nice! More robust yes.
Could potentially make this behavior part of a read_spice() function; i.e. read_spice could work locally on a dataspice.json object or could extract dataspice.json from HTML content on the web.
👍 and 👍 on all those ideas @cboettig
This comes from a good question in my dataspice demo today: If user X authors a dataspice page for their dataset, and another scientist, Y, wants to use it, it'd be cool if they just ran:
And their computer downloaded something like some-dataset.zip, which had the dataspice.json and the files described in access.csv attached to it somehow.