ropenscilabs / deposits

R Client for access to multiple data repository services
https://docs.ropensci.org/deposits/

Auto-populate DCMI hierarchical relationship fields #71

Open mpadge opened 1 year ago

mpadge commented 1 year ago

The aim is to enable a workflow that automatically relates multiple deposits, something like:

cli <- depositsClient$new()
cli$deposit_new()
id1 <- cli$id

Now suppose that deposit, with DOI 10.1234/foo, is one output of some multi-part project. Create a new deposit for the main project, of which id1 is a part:

metadata <- list (
  # ... other metadata plus ...
  hasPart = list (list (relation = "hasPart", identifier = "10.1234/foo"))
)
cli <- depositsClient$new(metadata = metadata)
cli$deposit_new()
id2 <- cli$id

Creation of that new deposit at a higher hierarchical level should then automatically populate the "isPartOf" field of the initial deposit "id1" with the DOI (or equivalent) of "id2", and update the "datapackage.json" file.
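A minimal sketch of how the reciprocal entries might be derived, using plain R lists only. The names `inverse_relation` and `reciprocal_relations`, and the DOI "10.1234/bar" standing in for "id2", are hypothetical and not part of the deposits package; the sketch calls no repository API and only works out which deposits would need updating.

```r
# Hypothetical lookup of inverse relation types (an assumption for
# illustration; not taken from the deposits package).
inverse_relation <- c (
    hasPart = "isPartOf",
    isPartOf = "hasPart",
    hasVersion = "isVersionOf",
    isVersionOf = "hasVersion"
)

# Hypothetical helper: given the metadata of a new deposit and its
# identifier, list the reciprocal entries to add to each related deposit.
reciprocal_relations <- function (metadata, new_identifier) {
    out <- list ()
    # Only consider metadata fields that have a known inverse
    for (field in intersect (names (metadata), names (inverse_relation))) {
        inverse <- inverse_relation [[field]]
        for (entry in metadata [[field]]) {
            out [[length (out) + 1L]] <- list (
                target = entry$identifier, # deposit to be updated
                field = inverse, # field to populate on that deposit
                value = list (relation = inverse, identifier = new_identifier)
            )
        }
    }
    out
}

metadata <- list (
    title = "Main project",
    hasPart = list (list (relation = "hasPart", identifier = "10.1234/foo"))
)
# "10.1234/bar" is a made-up DOI for the new, higher-level deposit
updates <- reciprocal_relations (metadata, new_identifier = "10.1234/bar")
```

Each element of `updates` names the deposit to modify (`target`), the field to populate, and the value to write; applying those updates and regenerating "datapackage.json" would be left to the client.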

The converse should happen when "isPartOf" is specified: the deposit that "isPartOf" points to should be updated so that its "hasPart" field points back to the new deposit. (And of course that form of updating should only be triggered if the two deposits have the same "owner" (Zenodo) or "account_id" (Figshare).)
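The ownership condition could be checked along these lines. This is a sketch under assumptions: `same_account` is a hypothetical helper, and `dep_a` and `dep_b` stand in for whatever metadata records the service returns for the two deposits, assumed here to carry the "owner" or "account_id" field at the top level.

```r
# Hypothetical helper: TRUE only if both records carry the same value in
# the service-specific ownership field named in the issue.
same_account <- function (dep_a, dep_b, service = c ("zenodo", "figshare")) {
    service <- match.arg (service)
    field <- switch (service, zenodo = "owner", figshare = "account_id")
    a <- dep_a [[field]]
    b <- dep_b [[field]]
    # A missing field on either side is treated as "not the same account"
    !is.null (a) && !is.null (b) && identical (a, b)
}

ok_zenodo <- same_account (list (owner = 123L), list (owner = 123L), "zenodo")
ok_figshare <- same_account (
    list (account_id = 1L),
    list (account_id = 2L),
    "figshare"
)
```

Treating a missing field as a mismatch errs on the side of never updating a deposit whose ownership cannot be confirmed.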

mpadge commented 1 year ago

@noamross I'd appreciate any thoughts you might have here before proceeding. Thanks!

noamross commented 1 year ago

A couple of thoughts:

1. Should this relation have a special workflow different from setting other relations (such as `isIdenticalTo`, `isSupplementOf`)? Or should those be part of this workflow?
2. I suspect there are a lot of edge cases in terms of actual relationships, authorship, ownership, etc. I think perhaps this should trigger a message in the publishing workflow, like "this deposit has a relation to another data set .... do you want to attempt to automatically add a reverse relation to that set?"
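The confirmation step suggested in the second point might look something like the following sketch. `confirm_reverse_relation` and its `ask` argument are hypothetical; the only real function used is `utils::askYesNo()`, and the choice to decline silently in non-interactive sessions is an assumption rather than anything decided in this thread.

```r
# Hypothetical helper: ask before adding a reverse relation to another
# deposit. `ask` can be replaced by any function taking a message and
# returning TRUE/FALSE/NA, which keeps the helper usable in scripts.
confirm_reverse_relation <- function (relation, identifier, ask = NULL) {
    msg <- sprintf (
        paste (
            "This deposit has a '%s' relation to '%s'.",
            "Attempt to automatically add a reverse relation to that deposit?"
        ),
        relation, identifier
    )
    if (is.null (ask)) {
        if (!interactive ()) {
            # Assumption: never modify another deposit without confirmation
            return (FALSE)
        }
        ask <- utils::askYesNo
    }
    # askYesNo returns NA on "cancel"; only an explicit TRUE counts as consent
    isTRUE (ask (msg))
}

# Non-interactive usage with a stand-in answer function
confirmed <- confirm_reverse_relation (
    "hasPart", "10.1234/foo",
    ask = function (msg) TRUE
)
```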

mpadge commented 1 year ago
> 1. Should this relation have a special workflow different from setting other relations (such as `isIdenticalTo`, `isSupplementOf`)? Or should those be part of this workflow?

No, the idea is that they can all be part of the same general workflow (importantly also including "isVersionOf").

> 2. I suspect there are a lot of edge cases in terms of actual relationships, authorship, ownership, etc. I think perhaps this should trigger a message in the publishing workflow, like "this deposit has a relation to another data set .... do you want to attempt to automatically add a reverse relation to that set?"

Good idea! Do you then think it's worthwhile going ahead and implementing this?

noamross commented 1 year ago

I would table this as a nice-to-have feature, to implement after higher-priority items.