cboettig / contentid

:package: R package for working with Content Identifiers
http://cboettig.github.io/contentid

Do not guess type, don't track type #11

Closed cboettig closed 4 years ago

cboettig commented 4 years ago

@jhpoelen just a small PR here to hit some of the easiest fixes mentioned in #9.

As noted in #9, it's probably best to drop type for now. Guessing by MIME type alone is really not desirable, and we're not using this information anywhere in the core implementation. The type can probably be tracked in richer form in an accompanying metadata/provenance record for the content.

This also lets us drop a dependency (mime).

Also added a few minor/easy fixes on naming things. (The query argument should be consistently called uri across query, query_local, and query_remote; the dublincore format is renamed to indicate that it is just a formatter for hash-archive.org's return type.)

More to come when I get a chance.

cboettig commented 4 years ago

Haha, all checks are now failing because the URL I use in most of the tests, "http://cdiac.ornl.gov/ftp/trends/co2/vostok.icecore.co2", now returns a 503 error.

jhpoelen commented 4 years ago

@cboettig awesome! Luckily, you still have a copy, right? PS: I'm planning to review your pull request today, but am a bit hurried because of https://parasitetracker.org workshop prep for next week.

cboettig commented 4 years ago

@jhpoelen yup! And since I registered the original URL in hash-archive.org, two clicks give you the SHA that once existed at that URL: https://hash-archive.org/history/http://cdiac.ornl.gov/ftp/trends/co2/vostok.icecore.co2 and, by clicking that identifier, the URL of one of my backup copies: https://hash-archive.org/sources/hash://sha256/9412325831dab22aeebdd674b6eb53ba6b7bdd04bb99a4dbb21ddff646287e37

(Or using our package):

query("https://hash-archive.org/history/http://cdiac.ornl.gov/ftp/trends/co2/vostok.icecore.co2")

query("hash://sha256/9412325831dab22aeebdd674b6eb53ba6b7bdd04bb99a4dbb21ddff646287e37")
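
For anyone unfamiliar with the identifier format used above, here is a minimal base-R sketch of how a hash:// URI breaks down into an algorithm name and a hex digest. Note that parse_hash_uri is a hypothetical helper written only for illustration; it is not part of contentid, and the pattern it accepts is an assumption about the identifier shape shown in this thread.

```r
# Hypothetical helper (NOT part of contentid): split an identifier of the
# form hash://<algorithm>/<hex digest> into its two components.
parse_hash_uri <- function(uri) {
  # Assumed shape: alphanumeric algorithm name, then a hexadecimal digest.
  pattern <- "^hash://([A-Za-z0-9]+)/([0-9a-fA-F]+)$"
  if (!grepl(pattern, uri)) {
    stop("not a hash URI: ", uri)
  }
  list(
    algorithm = sub(pattern, "\\1", uri),
    digest    = tolower(sub(pattern, "\\2", uri))
  )
}

id <- parse_hash_uri(
  "hash://sha256/9412325831dab22aeebdd674b6eb53ba6b7bdd04bb99a4dbb21ddff646287e37"
)
id$algorithm      # "sha256"
nchar(id$digest)  # 64 hex characters, as expected for a SHA-256 digest
```

Because the identifier depends only on the bytes of the content, it stays valid even when the original URL starts returning errors, which is what makes the lookup above possible.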