Open andrewpatto opened 3 years ago
Hi @andrewpatto - thanks for the great use cases! Lots to unpack so let me try in order:
Need for those complex-er cases: The scope of DUO is also to encourage users to choose simpler use conditions when possible. Is there any chance the data owners would consider simpler limitations?
Geo restriction: We have a ticket open about the lack of use cases for this, and about possibly deprecating the term. It does sound like you have a need for it though, so linking here for further discussion.
Multiple sets of DUO terms per dataset: We haven't had that use case yet, which means there is nothing for or against it in the current model. Which makes it really interesting :) At the moment you would be able to do this, but there is no structure to support the creation of 2 distinct sets of permissions, which touches more on the implementation than the specification. Which brings me to
Structure for DUO terms: We have started work with @mbaudis and the schemablock group to define a structure for sets of DUO terms and their modifiers. https://schemablocks.org/schemas/ga4gh/DataUseLimitation.html https://schemablocks.org/schemas/ga4gh/DataUseModifier.html It is on my todo list to get those finalised (with apologies to @mbaudis for the delay...). This could be the perfect use case to test them, and possibly to add a "DataUseSet" to capture the sets you describe above.
It does sound like there is a bit of modeling work to be done around
Thanks Melanie. I don't think there is much appetite for simpler use conditions - if anything I am thinking our statements will tend towards even more complexity (nested levels of use condition - alliance > institute > patient - where each level is a potentially complex set of conditions as indicated via consent form or policy). We are operating in a mixed environment where samples are moving from clinical care into research dynamically over time (so not a defined research cohort with a single DUO statement).
Definitely agree with your last two modelling points - it would be good to make sure sets of permissions are possible in any concrete implementation model (unless we agree that this level of complexity is not a good use case - which I would understand). And if we do agree about sets, we should also agree about how an algorithm interprets them (I would think they have to be ORed together - but it would be great to write that down formally).
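To make the ORed-together reading concrete, here is a minimal sketch, not anything defined by DUO or DUOS. The class names, fields and matching rules (`PermissionSet`, `AccessRequest`, GRU covering any research purpose, CC meaning "clinical care only") are all assumptions made for illustration. A request is allowed if at least one set permits it on its own; conditions inside a single set are ANDed.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class PermissionSet:
    """One set of DUO terms: a base permission plus modifiers (hypothetical shape)."""
    base: str                                  # e.g. "GRU" or "DS"
    moratorium_months: int = 0                 # MOR modifier; 0 means none
    regions: Optional[FrozenSet[str]] = None   # GS modifier; None means unrestricted
    clinical_care_only: bool = False           # CC modifier, read here as a restriction


@dataclass(frozen=True)
class AccessRequest:
    """What a requester is asking for (hypothetical shape)."""
    purpose: str                # "GRU" or "DS"
    region: str
    months_since_release: int
    clinical_care: bool = False


def set_permits(ps: PermissionSet, req: AccessRequest) -> bool:
    """All conditions within one set must hold (AND)."""
    # Assumption: a GRU dataset covers any research purpose.
    if ps.base != "GRU" and ps.base != req.purpose:
        return False
    if req.months_since_release < ps.moratorium_months:
        return False
    if ps.regions is not None and req.region not in ps.regions:
        return False
    if ps.clinical_care_only and not req.clinical_care:
        return False
    return True


def dataset_permits(sets: List[PermissionSet], req: AccessRequest) -> bool:
    """Sets are alternatives: access is allowed if any single set permits it (OR)."""
    return any(set_permits(ps, req) for ps in sets)


# Dataset X from this thread: GRU + MOR(12months), GS(AUS) or DS + CC
DATASET_X = [
    PermissionSet("GRU", moratorium_months=12, regions=frozenset({"AUS"})),
    PermissionSet("DS", clinical_care_only=True),
]
```

Under this reading, a general research request from Australia three months after release is refused by both sets, while a disease-specific clinical care request at the same time gets through via the second set.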
pinging @solideoglori who is just back from leave. Jonathan leads implementation efforts on the DUOS side and it would be good to get input as to whether they have (1) considered multiple sets (2) thought about the disjunction interpretation.
also pinging @mbaudis who is leading on schema blocks. @mbaudis: @andrewpatto is looking for a structure to encode different sets of DUO permissions on a dataset
Thinking about some real world rules we might have over our data - we were wondering if we could express multiple permissions for a dataset, varying by modifier. Is it accepted that this could be a way to tag datasets with DUO, or is this explicitly forbidden (and if so, how can it be achieved otherwise)?
e.g.
We will have datasets that are available for GRU in Australia after a 12 month moratorium, but which would be available immediately for disease-specific clinical care release.
Would it be valid to have:
Dataset X: GRU + MOR(12months), GS(AUS) or DS + CC
Then, getting further into the weeds - would it be valid to have the 'same' base permissions but with disjoint modifiers?
Dataset Y: GRU + MOR(12months), GS(not AUS), NCU or GRU + MOR(1month), GS(AUS)
(these are the kind of real world restrictions that our labs want to put on the datasets..)
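For discussion, here is a rough sketch of how the informal notation used for Dataset X and Dataset Y could be split into distinct sets of terms. The grammar is an assumption read off the two examples only (sets separated by `or`, terms within a set separated by `+` or `,`, an optional argument in parentheses). It is not part of DUO, and `parse_sets` is a hypothetical helper.

```python
import re
from typing import List, Optional, Tuple

# A term is a DUO code plus an optional argument, e.g. ("MOR", "12months").
Term = Tuple[str, Optional[str]]

# Assumed term shape: upper-case code, optionally followed by "(argument)".
_TERM = re.compile(r"^([A-Z]+)(?:\((.*)\))?$")


def parse_sets(expr: str) -> List[List[Term]]:
    """Split an expression into alternative sets, each a list of (code, argument)."""
    sets: List[List[Term]] = []
    # Assumption: a lower-case "or" surrounded by whitespace separates the sets.
    for chunk in re.split(r"\s+or\s+", expr.strip()):
        terms: List[Term] = []
        # Assumption: "+" and "," both just join terms within one set.
        for raw in re.split(r"\s*[+,]\s*", chunk):
            match = _TERM.match(raw.strip())
            if match is None:
                raise ValueError(f"unrecognised term: {raw!r}")
            terms.append((match.group(1), match.group(2)))
        sets.append(terms)
    return sets


dataset_x = parse_sets("GRU + MOR(12months), GS(AUS) or DS + CC")
dataset_y = parse_sets(
    "GRU + MOR(12months), GS(not AUS), NCU or GRU + MOR(1month), GS(AUS)"
)
```

Both examples come out as two sets each. Note that the `GS(not AUS)` argument is kept as the raw string `not AUS`, since how a negated geographic restriction should be modelled is exactly the kind of thing that would need agreeing.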