bio-guoda / preston

a biodiversity dataset tracker
MIT License

preston clone https://jhpoelen.nl/bees/data does not copy related resources #221

Closed jhpoelen closed 1 year ago

jhpoelen commented 1 year ago

Following the clone instructions on https://jhpoelen.nl/bees, running

preston clone https://jhpoelen.nl/bees/data

does not clone the related assets. Instead, only the provenance logs appear to have been copied:

$ find .
.
./data
./data/85
./data/85/13
./data/85/13/85138e506a29fb73099fb050372d8a379794ab57fe4bfdf141743db0de2b985c
./data/2a
./data/2a/5d
./data/2a/5d/2a5de79372318317a382ea9a2cef069780b852b01210ef59e06b640a3539cb5a

jhpoelen commented 1 year ago

The expected behavior is that images such as the one below are included in the local data directory.

hash://sha256/8d49bd24f6ba300b4de44fd218b53294f4cc0106cd9631018ef819b38345c75d
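
Judging from the `find` output above, the local store appears to place each object at `data/<first two hex characters>/<next two>/<full hash>`. A minimal sketch, under that assumption, for working out where the expected image should land; `hash_to_path` is a hypothetical helper written for illustration, not a preston command:

```shell
# Hypothetical helper (not part of preston): map a hash://sha256/ URI to the
# path it would occupy under ./data, assuming the aa/bb/<hash> layout seen
# in the `find` output.
hash_to_path() {
  hex="${1#hash://sha256/}"                 # strip the URI scheme prefix
  a="$(printf '%s' "$hex" | cut -c1-2)"     # first two hex characters
  b="$(printf '%s' "$hex" | cut -c3-4)"     # next two hex characters
  printf 'data/%s/%s/%s\n' "$a" "$b" "$hex"
}

hash_to_path hash://sha256/8d49bd24f6ba300b4de44fd218b53294f4cc0106cd9631018ef819b38345c75d
```

If the clone had copied the image, `test -f "$(hash_to_path ...)"` with the same URI would be expected to succeed.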

jhpoelen commented 1 year ago

After applying fixes resulting from a switch of the RDF parsing library, running:

preston clone https://jhpoelen.nl/bees/data

resulted in a data directory with 136 files:

$ find -type f | wc -l
136

with the first 10 files being:

$ find -type f | head 
./data/a4/04/a404059930ec2d03a9e4eb2d2e3d59a10eb19511b4361962d12ff44f29ef28e4
./data/0f/c4/0fc4a09b99d793336f0172cae44634f6fe57376998efe5da1de626f6764a7536
./data/74/c8/74c8924e140957c2bdada368d356ae12222b4a8de48126e12282fd7ba00dd2ef
./data/59/6f/596f73d5e1974511d4efec6256e11ef4ea811365bdd866fd4243200757790f27
./data/5c/8c/5c8cfdedd669d47262b71cb5e6e64ae97afe0126e0b6c4c45ce1ba730f8d73db
./data/5c/90/5c90e72d37612609c182c5f4d95dc1078c77dcd96f6d746673394dcd4fd473d3
./data/5c/4e/5c4edf1b8517b56e81e87cafeef92d09a736e7a09a2689654a146e00e2f03f7d
./data/85/13/85138e506a29fb73099fb050372d8a379794ab57fe4bfdf141743db0de2b985c
./data/85/64/8564af06c9ab5b6b153f3f55a43d351b8813eae935516d807a391240e3efcb72
./data/85/a8/85a8e4dc10bef12373fc6179f23da8fe72890582ebde714cc882995d1d2200ed
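
Because the file names look like SHA-256 digests, one way to sanity-check a cloned directory is to recompute each digest and compare it with the file name. The sketch below assumes that each file name equals the SHA-256 of the file's content and that GNU `sha256sum` is available; `verify_store` is a hypothetical helper, not a preston command:

```shell
# Hypothetical check (not part of preston): report every file under a data
# directory whose SHA-256 digest differs from its file name.
verify_store() {
  dir="${1:-data}"
  # Collect mismatches; the loop runs in a subshell, so results are
  # passed back through its standard output.
  bad="$(find "$dir" -type f | while IFS= read -r f; do
    want="$(basename "$f")"
    got="$(sha256sum "$f" | cut -d' ' -f1)"
    [ "$want" = "$got" ] || printf 'mismatch: %s\n' "$f"
  done)"
  if [ -n "$bad" ]; then
    printf '%s\n' "$bad"
    return 1
  fi
  return 0
}

# Only run the check when a data directory is present.
if [ -d data ]; then verify_store data; fi
```

The function returns a non-zero status when any file fails the comparison, so it can be chained after a clone in a script.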

jhpoelen commented 1 year ago

and, as expected, the following image was retrieved with:

preston cat hash://sha256/8d49bd24f6ba300b4de44fd218b53294f4cc0106cd9631018ef819b38345c75d\
 > bee.jpg

[image: bee]

{
  "xmpRights:WebStatement": "http://creativecommons.org/licences/by-nc-sa/3.0/",
  "dc:type": "StillImage",
  "Iptc4xmpExt:WorldRegion": "North America",
  "photoshop:Credit": "Museum of Comparative Zoology, Harvard University",
  "ac:metadataLanguage": "en",
  "Iptc4xmpExt:CountryName": "United States",
  "ac:providerLiteral": "Museum of Comparative Zoology, Harvard University",
  "dc:format": "image/jpeg",
  "ac:associatedSpecimenReference": "http://mczbase.mcz.harvard.edu/guid/MCZ:Ent:17219",
  "dc:rights": "http://creativecommons.org/licences/by-nc-sa/3.0/legalcode",
  "xmpRights:UsageTerms": "Available under Creative Commons Attribution Share Alike Non Commerical (CC-BY-NC-SA 3.0) license",
  "ac:serviceExpectation": "online",
  "coreid": "MCZ:Ent:17219",
  "dcterms:identifier": "http://mczbase.mcz.harvard.edu/media/1493650",
  "dcterms:description": "habitus lateral view",
  "ac:metadataProviderLiteral": "Museum of Comparative Zoology, Harvard University",
  "ac:variant": "ac:BestQuality",
  "dcterms:type": "http://purl.org/dc/dcmitype/StillImage",
  "dcterms:title": "MCZ:Ent:17219 Nomadopsis puellae habitus lateral view",
  "xmpRights:Owner": "Museum of Comparative Zoology, Harvard University",
  "ac:variantLiteral": "Best Quality",
  "dcterms:rights": "http://creativecommons.org/licences/by-nc-sa/3.0/legalcode",
  "Iptc4xmpExt:ProvinceState": "California",
  "ac:accessURI": "http://mczbase.mcz.harvard.edu/specimen_images/entomology/large/MCZ-ENT00017219_Spinoliella_puellae_hal.jpg",
  "dcterms:modified": "2020-03-13",
  "dwc:scientificName": "Nomadopsis puellae"
}