DSpace-Labs / SAFBuilder

Builds a Simple Archive Format package from files and a spreadsheet
https://wiki.duraspace.org/display/DSPACE/Simple+Archive+Format+Packager

Validate collections are real collections #5

Open peterdietz opened 9 years ago


A new feature of SAFBuilder is the ability to batch import to multiple collections. (This also requires modifying DSpace ItemImport. That was done for the Longsight code and is perhaps contributable upstream in DSpace 6: https://github.com/LongsightGroup/DSpace/commit/c14db3702b317f1d8dd72d7735a7cccf480f2f99)

collection 123/2233||123/2334||123/3434
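Before any lookup, the cell has to be split into individual handles. A minimal sketch, assuming the `||` separator shown above and a loose "prefix/suffix" shape for handles; `split_collection_handles` and `HANDLE_PATTERN` are hypothetical names, not part of SAFBuilder:

```python
import re

# Assumption: a handle looks like "prefix/suffix" with no whitespace.
# This is only a loose sanity check, not the full Handle syntax.
HANDLE_PATTERN = re.compile(r"^[^\s/]+/\S+$")


def split_collection_handles(cell: str, separator: str = "||") -> list[str]:
    """Split a spreadsheet 'collection' cell into individual handles."""
    handles = [part.strip() for part in cell.split(separator) if part.strip()]
    malformed = [h for h in handles if not HANDLE_PATTERN.match(h)]
    if malformed:
        raise ValueError(f"Malformed handle(s): {malformed}")
    return handles


handles = split_collection_handles("123/2233||123/2334||123/3434")
```

Rejecting malformed values here means the later network checks only ever see strings that at least look like handles.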

However, if any of these handles isn't actually a collection, this will cause a problem for the importer. So it would be nice if SAFBuilder could look up each handle to ensure it is a collection, perhaps as a post-generation check.

I'm thinking it could query the handle REST API: http://hdl.handle.net/api/handles/1811/686

{
  "responseCode":1,
  "handle":"1811/686",
  "values":[{
    "index":100,
    "type":"URL",
    "data":{
      "format":"string",
      "value":"http://kb.osu.edu/dspace/handle/1811/686"
    },
    "permissions":"1010",
    "ttl":100,
    "timestamp":"1970-01-01T00:01:40Z"
  }]
}
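Pulling the repository URL out of a response like this one could look roughly as follows. This is a sketch only: `fetch_handle_record` and `extract_url` are hypothetical helpers, and it assumes that `responseCode` 1 means success and that the first value of type `URL` is the one wanted, as in the sample.

```python
import json
from typing import Optional
from urllib.request import urlopen

HANDLE_API = "http://hdl.handle.net/api/handles/"


def fetch_handle_record(handle: str, timeout: float = 10.0) -> dict:
    """Fetch the JSON record for a handle (raises on HTTP or network errors)."""
    with urlopen(HANDLE_API + handle, timeout=timeout) as response:
        return json.load(response)


def extract_url(record: dict) -> Optional[str]:
    """Return the first URL value in a handle record, or None if there is none."""
    # Assumption: responseCode 1 signals success, as in the sample response.
    if record.get("responseCode") != 1:
        return None
    for value in record.get("values", []):
        if value.get("type") == "URL":
            return value.get("data", {}).get("value")
    return None
```

A `None` result would already be worth a warning from SAFBuilder, since the handle either does not resolve or does not point at a URL.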

Then, take the URL from that Handle response (http://kb.osu.edu/dspace/handle/1811/686), try to guess where the DSpace REST API might be, and query that to see whether this handle is of type collection:

http://kb.osu.edu/rest/handle/1811/686

{
  "id":34,
  "name":"Ohio Journal of Science (Ohio Academy of Science)",
  "handle":"1811/686",
  "type":"community",
  ...
}

In this case, we learn that 1811/686 is a community, not a collection, and SAFBuilder could complain.
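The guess-and-check step might be sketched like this. It assumes the REST API lives at `/rest` on the same host as the resolved URL, which matches the kb.osu.edu example but is only a guess for other repositories; `guess_rest_url` and `non_collection_problem` are hypothetical names.

```python
from typing import Optional
from urllib.parse import urlsplit


def guess_rest_url(resolved_url: str, handle: str) -> str:
    """Guess the DSpace REST endpoint for a handle from its resolved URL."""
    parts = urlsplit(resolved_url)
    # Assumption: the REST API is served from /rest at the site root.
    return f"{parts.scheme}://{parts.netloc}/rest/handle/{handle}"


def non_collection_problem(handle: str, rest_record: dict) -> Optional[str]:
    """Return a complaint if the REST record is not a collection, else None."""
    found = rest_record.get("type", "unknown")
    if found == "collection":
        return None
    return f"{handle} is a {found}, not a collection"


rest_url = guess_rest_url("http://kb.osu.edu/dspace/handle/1811/686", "1811/686")
problem = non_collection_problem("1811/686", {"handle": "1811/686", "type": "community"})
```

Because the endpoint is guessed, a failed request probably should produce a warning rather than a hard error. Only a successful response with a type other than `collection` is clear evidence of a bad handle.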