bio-guoda / preston

a biodiversity dataset tracker
MIT License

preston fails to resolve known embedded local content #258

Open jhpoelen opened 1 year ago

jhpoelen commented 1 year ago

While working on https://github.com/bio-guoda/preston-query/tree/main/examples/page2023 , I wrote:

<https://zenodo.org/record/7181181/files/triples.nt.zip> <http://purl.org/pav/hasVersion> <hash://md5/09f2c4f961f2aca469d5f3a36009938d> .
<zip:https://zenodo.org/record/7181181/files/triples.nt.zip!/triples.nt> <http://www.w3.org/ns/prov#type> "application/n-triples" .
<zip:hash://md5/09f2c4f961f2aca469d5f3a36009938d!/triples.nt> <http://www.w3.org/ns/prov#type> "application/n-triples" .

<zip:hash://md5/09f2c4f961f2aca469d5f3a36009938d!/triples.nt> <http://purl.org/pav/hasVersion> <hash://md5/99e0fff7811d049be63c1fb7c51f6041> .

I piped this into a provenance graph:

cat [content] | preston track --algo md5

Then I ran

preston verify --remote https://zenodo.org --algo md5

on this content graph.

Preston was able to resolve content associated with hash://md5/09f2c4f961f2aca469d5f3a36009938d (the zip file on zenodo).

However, Preston did not attempt to retrieve the embedded content stated in

<zip:hash://md5/09f2c4f961f2aca469d5f3a36009938d!/triples.nt> <http://purl.org/pav/hasVersion> <hash://md5/99e0fff7811d049be63c1fb7c51f6041> .

even though

preston cat 'zip:hash://md5/09f2c4f961f2aca469d5f3a36009938d!/triples.nt'\
 | md5sum

does, in fact, yield

99e0fff7811d049be63c1fb7c51f6041

Possibly related to #229.
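
To illustrate the expected behaviour, here is a minimal Python sketch, written outside of Preston and not reflecting its actual implementation. It assumes the `zip:<outer>!/<entry>` form used in the statements above, and that the bytes of the outer zip have already been resolved (for example from Zenodo). The function names `parse_zip_iri` and `embedded_md5_iri` are hypothetical.

```python
import hashlib
import io
import zipfile


def parse_zip_iri(iri: str) -> tuple[str, str]:
    """Split 'zip:<outer>!/<entry>' into (outer, entry).

    Assumes the last '!/' separates the archive reference from the
    path of the entry inside that archive.
    """
    if not iri.startswith("zip:"):
        raise ValueError(f"not a zip IRI: {iri}")
    outer, sep, entry = iri[len("zip:"):].rpartition("!/")
    if not sep or not outer or not entry:
        raise ValueError(f"missing '!/' separator: {iri}")
    return outer, entry


def embedded_md5_iri(zip_bytes: bytes, entry: str) -> str:
    """Return the hash://md5/... IRI of one entry inside a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        digest = hashlib.md5(archive.read(entry)).hexdigest()
    return f"hash://md5/{digest}"


if __name__ == "__main__":
    outer, entry = parse_zip_iri(
        "zip:hash://md5/09f2c4f961f2aca469d5f3a36009938d!/triples.nt"
    )
    # outer is the content id of the zip that Preston already resolves;
    # entry is the embedded file whose hash should be checked next.
    print(outer, entry)
```

With the zip bytes for `hash://md5/09f2c4f961f2aca469d5f3a36009938d` in hand, `embedded_md5_iri(zip_bytes, "triples.nt")` would be compared against the `hasVersion` object stated for the embedded content.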