@jhpoelen I think this at least moves in the right direction by breaking things out a bit and clarifying a lot of subsequent behavior.
The key change here is that store("https://example.com") still writes the URL to the local registry.tsv.gz, but it does not write the local store path there -- instead, it writes that path to a BagIt manifest-sha256.txt, per RFC 8493.
This has downstream consequences for other functions. For one, perhaps it clarifies what retrieve() means? I have an unexported function, store_retrieve(), which retrieves stored content given a content hash. (Perhaps you want it to take a URL, but I'm not sure what that would mean for adding 'local files' to the store that don't have a URL.)
Also, I haven't changed query() to look in the local store; it only accesses the hash-archive.org registry and the local registry.tsv.gz. This means it can only find URLs associated with content. Not sure if this is desirable, but it does draw a cleaner line between the store and the registry.
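To make the store/registry split concrete, here is a rough sketch of the idea in Python. It is not the package's code: the function names mirror the verbs above, but the `root` argument, the `data/` directory layout, the `hash://sha256/` identifier form, and the two-column registry format are all assumptions made for illustration. Only the BagIt manifest line format (checksum, then a path relative to the bag root) comes from RFC 8493.

```python
import gzip
import hashlib
import os
from typing import List, Optional


def store(content: bytes, root: str, url: Optional[str] = None) -> str:
    """Save content under its SHA-256 hash and return a hash identifier."""
    digest = hashlib.sha256(content).hexdigest()
    identifier = f"hash://sha256/{digest}"  # assumed identifier form

    # Assumed layout: data/<first 2 hex chars>/<next 2>/<full hash>.
    rel_path = f"data/{digest[:2]}/{digest[2:4]}/{digest}"
    abs_path = os.path.join(root, *rel_path.split("/"))
    os.makedirs(os.path.dirname(abs_path), exist_ok=True)
    with open(abs_path, "wb") as f:
        f.write(content)

    # The local path goes in the BagIt manifest, never in the registry.
    with open(os.path.join(root, "manifest-sha256.txt"), "a") as f:
        f.write(f"{digest} {rel_path}\n")

    # The registry only records where the content was seen on the web.
    if url is not None:
        with gzip.open(os.path.join(root, "registry.tsv.gz"), "at") as f:
            f.write(f"{identifier}\t{url}\n")
    return identifier


def store_retrieve(identifier: str, root: str) -> Optional[str]:
    """Return the local path for a content hash, using only the manifest."""
    digest = identifier.rsplit("/", 1)[-1]
    manifest = os.path.join(root, "manifest-sha256.txt")
    if not os.path.exists(manifest):
        return None
    with open(manifest) as f:
        for line in f:
            checksum, rel_path = line.rstrip("\n").split(" ", 1)
            if checksum == digest:
                return os.path.join(root, *rel_path.split("/"))
    return None


def query(identifier: str, root: str) -> List[str]:
    """Return the URLs registered for a content hash, using only the registry."""
    registry = os.path.join(root, "registry.tsv.gz")
    if not os.path.exists(registry):
        return []
    with gzip.open(registry, "rt") as f:
        rows = [line.rstrip("\n").split("\t") for line in f]
    return [row[1] for row in rows if row[0] == identifier]
```

In this sketch, content stored without a URL is still reachable through store_retrieve() by its hash, while query() returns nothing for it, which is the line between store and registry described above.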
Anyway, I know you're busy but very much looking forward to your comments on this.
(Unrelated news: I think this should also fix the broken test suite, since I uploaded that classic dataset to Zenodo and now use that URL instead.)
This also includes the changes from #11, which was really just a very minor PR relative to master: it drops the use of MIME type guessing, as you suggested, and briefly updates the README to use the current suite of verbs.