cboettig / contentid

:package: R package for working with Content Identifiers
http://cboettig.github.io/contentid

Separate local registries for URLs and local paths #17

Closed cboettig closed 4 years ago

cboettig commented 4 years ago

@jhpoelen I think this at least moves in the right direction by breaking things out a bit and clarifying a lot of subsequent behavior.

The key change here is that store("https://example.com") still writes the URL location to the local registry.tsv.gz, but it no longer writes the local store path there. Instead, it writes that path to a BagIt manifest-sha256.txt, per RFC 8493.
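To make the split concrete, here is a rough conceptual sketch in Python. It is not the package's R implementation: the function name, the `data/<hash>` layout, and the two-column registry are all assumptions chosen for illustration. Only the manifest line format (checksum, whitespace, payload path under `data/`) follows RFC 8493, and a complete bag would also need a `bagit.txt`, which is omitted here.

```python
import gzip
import hashlib
from pathlib import Path


def store(data: bytes, url: str, root: Path) -> str:
    """Hypothetical stand-in for store(): `data` plays the role of the
    bytes that would have been downloaded from `url`."""
    digest = hashlib.sha256(data).hexdigest()

    # 1. Content goes into the store. The data/<hash> layout is an
    #    assumption; BagIt only requires payload files to live under data/.
    rel_path = Path("data") / digest
    (root / "data").mkdir(parents=True, exist_ok=True)
    (root / rel_path).write_bytes(data)

    # 2. The local path is recorded in the BagIt manifest, one
    #    "<checksum> <filepath>" entry per line.
    with open(root / "manifest-sha256.txt", "a", encoding="utf-8") as manifest:
        manifest.write(f"{digest} {rel_path.as_posix()}\n")

    # 3. The URL, and only the URL, is recorded in the registry.
    #    The column layout here is an assumption.
    with gzip.open(root / "registry.tsv.gz", "at", encoding="utf-8") as registry:
        registry.write(f"hash://sha256/{digest}\t{url}\n")

    return f"hash://sha256/{digest}"
```

The point of the sketch is that nothing in step 3 mentions a local path, and nothing in step 2 mentions a URL.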

This has downstream consequences for other functions. For one, perhaps it clarifies what retrieve() means? I have an unexported function, store_retrieve(), that retrieves stored content given a content hash. (Perhaps you want it to take a URL, but I'm not sure what that would mean, given that we can add 'local files' to the store that don't have a URL.)

Also, I haven't changed query() to look in the local store; it still only accesses the hash-archive.org registry and the local registry.tsv.gz. This means it can only find URLs associated with content, not local copies in the store. I'm not sure if this is desirable, but it does draw a cleaner line between the store and the registry.
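A toy illustration of that line between the two lookups, again as a Python sketch rather than the package's R code. The in-memory dictionaries stand in for the registry and the manifest, the identifiers are made-up placeholders, and the function bodies are assumptions about the behavior described above, not the actual implementation.

```python
from typing import Dict, List, Optional

# Placeholder identifiers, not real hashes.
FROM_URL = "hash://sha256/aaaa"
LOCAL_ONLY = "hash://sha256/bbbb"

# Registry: content hash -> URLs where that content has been seen.
REGISTRY: Dict[str, List[str]] = {
    FROM_URL: ["https://example.com"],
}

# Manifest: content hash -> path inside the local store.
MANIFEST: Dict[str, str] = {
    FROM_URL: "data/aaaa",
    LOCAL_ONLY: "data/bbbb",  # added from a local file, so it has no URL
}


def query(content_id: str) -> List[str]:
    """Consults registries only, so it can only ever return URLs."""
    return list(REGISTRY.get(content_id, []))


def store_retrieve(content_id: str) -> Optional[str]:
    """Consults the store's manifest only, so it returns a local path or None."""
    return MANIFEST.get(content_id)
```

Under these assumptions, content added from a local file is invisible to `query(LOCAL_ONLY)` but still reachable through `store_retrieve(LOCAL_ONLY)`.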

Anyway, I know you're busy but very much looking forward to your comments on this.