I am inclined to duplicate datasets that have multiple owners. However, 4.4 GB of datasets have 2 owners, so this method would result in at least that much space being occupied by duplicate data.
Don't duplicate: each dataset goes to its "first owner".
"First owner" can mean many things. If you can identify which owner uploaded the dataset, use that owner. If not, just take the first owner from the list.
Store the additional owners (in an "additional_owners" property?).
Keep a list of these datasets for our reference, and we'll decide how to handle them in the future.
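
The rules above could be sketched roughly as follows. This is only an illustration under assumptions: the record fields "id", "owners" and "uploaded_by", and the function names, are hypothetical and not taken from the real schema. It also assumes every dataset has at least one owner.

```python
from typing import Any


def assign_primary_owner(dataset: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a dataset record that has exactly one owner.

    Assumed (hypothetical) fields: "owners" is an ordered, non-empty list,
    and "uploaded_by" is present only when the uploader is known.
    """
    owners = list(dataset["owners"])
    uploader = dataset.get("uploaded_by")

    # Prefer the uploader when it is known and is one of the owners;
    # otherwise fall back to the first owner in the list.
    primary = uploader if uploader in owners else owners[0]

    migrated = {k: v for k, v in dataset.items() if k != "owners"}
    migrated["owner"] = primary
    # Keep the remaining owners instead of duplicating the dataset.
    migrated["additional_owners"] = [o for o in owners if o != primary]
    return migrated


def migrate(datasets: list[dict[str, Any]]) -> tuple[list[dict[str, Any]], list[Any]]:
    """Migrate all records and collect the ids of multi-owner datasets."""
    migrated = [assign_primary_owner(d) for d in datasets]
    # Reference list of datasets that had more than one owner.
    multi_owner_ids = [m["id"] for m in migrated if m["additional_owners"]]
    return migrated, multi_owner_ids


if __name__ == "__main__":
    sample = [
        {"id": "a", "owners": ["alice", "bob"], "uploaded_by": "bob"},
        {"id": "b", "owners": ["carol", "dave"]},
        {"id": "c", "owners": ["erin"]},
    ]
    records, to_review = migrate(sample)
    print(records)
    print(to_review)
```

Nothing is copied in this sketch, so no extra space is used, and the list returned by `migrate` can serve as the reference list of multi-owner datasets to revisit later.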