ropensci / datapack

An R package to handle data packages
https://docs.ropensci.org/datapack

datapack can create resource maps with an incorrect dcterms:identifier value #131

Open amoeba opened 3 years ago

amoeba commented 3 years ago

Jasmine ran into a situation where datapack created a resource map with content like this:

  <rdf:Description rdf:about="https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Acb71b688-944f-4872-a6e4-0326b599b1ae">
    <dcterms:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#string">https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Acb71b688-944f-4872-a6e4-0326b599b1ae</dcterms:identifier>
  </rdf:Description>

Look at the object literal, "https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Acb71b688-944f-4872-a6e4-0326b599b1ae", which should actually be "urn:uuid:cb71b688-944f-4872-a6e4-0326b599b1ae". We discovered this because the resource map always fails to index: there's no object with the identifier "https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Acb71b688-944f-4872-a6e4-0326b599b1ae".
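To make the difference between the two literals concrete, here is a minimal base R sketch that recovers the expected PID from the resolve URL. The helper name `pid_from_resolve_url` and the `resolve_prefix` constant are made up for illustration and are not part of datapack; the sketch only assumes that the bad literal is the CN resolve prefix followed by the percent-encoded PID, as in the snippet above.

```r
# Hypothetical helper, not part of datapack: turn a DataONE CN resolve URL
# back into the bare PID it wraps.
resolve_prefix <- "https://cn.dataone.org/cn/v2/resolve/"

pid_from_resolve_url <- function(url, prefix = resolve_prefix) {
  # Assumption: anything that does not start with the prefix is already a PID.
  if (!startsWith(url, prefix)) {
    return(url)
  }
  # Drop the prefix, then undo the percent-encoding (%3A becomes ":").
  encoded <- substring(url, nchar(prefix) + 1)
  utils::URLdecode(encoded)
}

bad <- "https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Acb71b688-944f-4872-a6e4-0326b599b1ae"
pid <- pid_from_resolve_url(bad)
print(pid)
```

This is the value the dcterms:identifier literal should have held; the rdf:about attribute is the only place the resolve URL belongs.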

This happened when Jasmine was trying to add a data object to a package. The code being run was roughly:

  dp <- getDataPackage(...)
  do <- getDataObject(...)
  addMember(dp, do, mo)

That workflow looks totally reasonable to me.

All that said: I was not able to create a minimal reproducible example using similar code. I've attached an RData file with the DataPackage object that reproduces the behavior, but I can't reproduce it any other way. It seems to me the bug could be coming from some sort of string manipulation edge case, but I can't really tell at this point.
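Until the cause is found, one way to catch affected resource maps before upload is to scan the serialized RDF/XML for dcterms:identifier literals that look like resolve URLs. The following is a rough base R sketch, not a datapack feature: the function name `find_suspect_identifiers` is hypothetical, it assumes the `dcterms` prefix is used as in the snippet above, and it treats any literal containing "/resolve/" as suspect, which is a heuristic rather than a rule. It uses a regular expression instead of a real RDF parser, so it is only meant for quick checks.

```r
# Hypothetical check, not part of datapack: list dcterms:identifier literals
# in a serialized resource map that look like resolve URLs instead of PIDs.
find_suspect_identifiers <- function(rdf_xml) {
  # rdf_xml: character vector holding the lines of the RDF/XML document.
  text <- paste(rdf_xml, collapse = "\n")
  pattern <- "<dcterms:identifier[^>]*>([^<]*)</dcterms:identifier>"
  # Pull out every dcterms:identifier element, then keep only its text.
  hits <- regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
  literals <- sub(pattern, "\\1", hits, perl = TRUE)
  # Heuristic (assumption): a literal containing "/resolve/" is a URL, not a PID.
  literals[grepl("/resolve/", literals, fixed = TRUE)]
}

example <- c(
  '<rdf:Description rdf:about="https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Aaaa">',
  '  <dcterms:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#string">https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Aaaa</dcterms:identifier>',
  '</rdf:Description>',
  '<rdf:Description rdf:about="https://cn.dataone.org/cn/v2/resolve/urn%3Auuid%3Abbb">',
  '  <dcterms:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#string">urn:uuid:bbb</dcterms:identifier>',
  '</rdf:Description>'
)
suspects <- find_suspect_identifiers(example)
print(suspects)
```

In this made-up example only the first description is flagged; the second carries a bare PID and passes.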

datapackage.RData.zip

I thought I'd make an issue just for posterity and to see if @gothub has any thoughts or ideas.