IEA-Task-43 / digital_wra_data_standard

IEA Task 43: pre-construction energy estimate data standard repository
BSD 3-Clause "New" or "Revised" License

Including licensing information / terms of use in the metadata? #225

Closed AndyClifton closed 6 months ago

AndyClifton commented 1 year ago

When I transfer (meta)data across an organisational boundary, I would like to be able to add some metadata that tells the recipient about the conditions under which they can use it. This could be a license (either the URL for a specific license, a license name, or some text), or a terms of use, or similar.

Is there a good way to include this kind of thing in the metadata defined by this model? I saw a "notes" field for the measurement locations and sensors, but I am not seeing anything at the top level that would let me tell the recipient what the data license / terms of use are.

Do you see a need to include this kind of information in the schema? Is there a solution using the current schema?

Thanks!

stephenholleran commented 1 year ago

Hi @AndyClifton,

That is a good idea. We should probably do something like that especially for sharing open data.

I shouldn't be saying this, but there is nothing stopping you from adding your own properties to this section. We deliberately allowed this, but I can see it causing problems: someone builds a system to ingest the data model, and then someone else adds a property that the system ignores. So you can add a property and it won't break the data model, but it may cause problems downstream. See example below:

image

Is this what you had in mind or would you want something else?
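To make the trade-off above concrete, here is a minimal sketch, assuming the metadata is loaded as a plain Python dict. The top-level field names and the `KNOWN_TOP_LEVEL` set are illustrative assumptions rather than the actual schema, and `split_known_and_extra` is a hypothetical helper. It shows an extra `license` property being accepted in the file but not recognised by a strict ingester.

```python
import json

# Hypothetical subset of top-level properties that an ingesting system knows about.
KNOWN_TOP_LEVEL = {"author", "organisation", "date", "version", "measurement_location"}


def split_known_and_extra(metadata: dict) -> tuple[dict, dict]:
    """Separate the properties an ingester recognises from any extras."""
    known = {k: v for k, v in metadata.items() if k in KNOWN_TOP_LEVEL}
    extra = {k: v for k, v in metadata.items() if k not in KNOWN_TOP_LEVEL}
    return known, extra


# Illustrative metadata with a user-added "license" property at the top level.
metadata = json.loads("""
{
  "author": "Example Author",
  "organisation": "Example Org",
  "date": "2024-01-01",
  "version": "1.0.0",
  "license": "BSD-3-Clause",
  "measurement_location": []
}
""")

known, extra = split_known_and_extra(metadata)
print(sorted(extra))  # the custom property a strict ingester would drop
```

An ingester that keeps `extra` alongside `known` would at least preserve the license text when passing the file on, even if it does not interpret it.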

AndyClifton commented 1 year ago

That's a nice intermediate solution! If this license term could be added to the official model, that would be great.

Incidentally, we're starting to use the words, "data transfer" instead of "data sharing" to communicate the idea that data is being transferred to someone specific, rather than the more open implications of "sharing". Either way, adding a license term would be helpful for data movements across org boundaries.

stephenholleran commented 1 year ago

@AndyClifton Thinking about this a bit more, what other examples are there where the license is actually included within the file/object? For other files/objects I've come across the license is available from the point of download but not included.

Not saying that adding a license attribute into the WRA Data Model is not a good thing to do. Just trying to understand.

stephenholleran commented 1 year ago

P.S.

> Incidentally, we're starting to use the words, "data transfer" instead of "data sharing" to communicate the idea that data is being transferred to someone specific, rather than the more open implications of "sharing".

Who is "we"?

What would you call it if the data was on a public open platform where users can download freely?

AndyClifton commented 1 year ago

> @AndyClifton Thinking about this a bit more, what other examples are there where the license is actually included within the file/object? For other files/objects I've come across the license is available from the point of download but not included.
>
> Not saying that adding a license attribute into the WRA Data Model is not a good thing to do. Just trying to understand.

The Dublin Core metadata schema is pretty standard for data sharing in academic circles. It includes a "rights management" element, so it should be in there if they've implemented it.

I think it's included in the data the Zenodo API returns, for example. It's definitely in their deposit schema: https://zenodo.org/schemas/deposits/records/legacyrecord.json

AndyClifton commented 1 year ago

> P.S.
>
> > Incidentally, we're starting to use the words, "data transfer" instead of "data sharing" to communicate the idea that data is being transferred to someone specific, rather than the more open implications of "sharing".
>
> Who is "we"?
>
> What would you call it if the data was on a public open platform where users can download freely?

We = enviConnect and there was a discussion in Task 52 about it, too.

On a public platform it's definitely sharing, because of the lack of control over the end users/uses.

I'll share whatever I learn :)

stephenholleran commented 7 months ago

Hi @AndyClifton,

I have implemented this in a new feature branch to be pulled into the dev branch. I would appreciate it if you could take a look.

Ironically, your Zenodo link doesn't work anymore, so I couldn't see what you were talking about. I have gone to the Dublin Core to base this on. They have a property 'license' which I think is exactly what you want.

The DCMI specification: https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/license

Example use: https://www.dublincore.org/resources/userguide/creating_metadata/#License

Unfortunately, license should be a sub-property of rights; however, we don't have a rights property and that would be too much to go into now. Maybe some IEA Task could develop a JSON Schema implementation of DCMI that we could then just use as a reference, and then you would be able to use whatever you wanted from it. For now, I think license on its own should be fine.

My suggested implementation:

"license": {
  "type": "string",
  "title": "License",
  "description": "Expected to be used as the Dublin Core Metadata Initiative intended. Link: http://purl.org/dc/terms/license Definition: A legal document giving official permission to do something with the resource. Comment: Recommended practice is to identify the license document with a URI. If this is not possible or feasible, a literal value that identifies the license may be provided.",
  "examples": [
    "https://opensource.org/license/bsd-3-clause/",
    "BSD-3-Clause"
  ]
}

I have gone with the US spelling, "license", as that is what https://opensource.org/licenses/ uses.

I have made it not required so it is backward compatible.
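To illustrate what "not required" means for existing files, here is a minimal sketch of the check implied by the suggested property. It is a hand-rolled test of the `type: string` constraint, not a full JSON Schema validator, and the names `license_errors`, `REQUIRED_TOP_LEVEL`, and the sample dicts are hypothetical. It also assumes `license` is absent from the schema's list of required properties, as stated above.

```python
# Assumption: "license" is not listed among the required top-level properties.
REQUIRED_TOP_LEVEL: list[str] = []


def license_errors(metadata: dict) -> list[str]:
    """Return a list of problems with the optional top-level 'license' value."""
    errors: list[str] = []
    if "license" not in metadata:
        # A missing value is only an error if the property were required.
        if "license" in REQUIRED_TOP_LEVEL:
            errors.append("license is required but missing")
        return errors
    if not isinstance(metadata["license"], str):
        errors.append("license must be a string")
    return errors


old_file = {"author": "Example Author"}  # written before the property existed
new_file = {"author": "Example Author", "license": "BSD-3-Clause"}
bad_file = {"author": "Example Author", "license": 42}

print(license_errors(old_file))  # no errors: older files remain valid
print(license_errors(new_file))  # no errors: a string value is accepted
print(license_errors(bad_file))  # one error: non-string value
```

Files written before the change pass unchanged, which is the backward compatibility described above. Only a present but non-string value is rejected.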

AndyClifton commented 7 months ago

Thanks! Looks like it should work.

I guess it could also work if I wanted to write "Andy shared this data with Stephen so he could test something", even though there's not an associated URL.

stephenholleran commented 7 months ago

> I guess it could also work if I wanted to write "Andy shared this data with Stephen so he could test something", even though there's not an associated URL.

It could. It is a free-text field, so you can write whatever you want. Even Dublin Core only says "Recommended practice is to identify the license document with a URI", with the emphasis on "Recommended". You could even write the text of the legal document in there.
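Since the field accepts both URIs and free text, a consumer may want to tell the two apart. The following is a small sketch of one way to do that, assuming only `http`/`https` URLs count as the "URI" case. `license_kind` is a hypothetical helper, not part of the data model or any tooling mentioned here.

```python
from urllib.parse import urlparse


def license_kind(value: str) -> str:
    """Classify a license string as 'uri' or 'literal' (hypothetical helper)."""
    parsed = urlparse(value.strip())
    # Treat the value as a URI only if it has a web scheme and a host.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "uri"
    return "literal"


for value in (
    "https://opensource.org/license/bsd-3-clause/",
    "BSD-3-Clause",
    "Andy shared this data with Stephen so he could test something",
):
    print(license_kind(value), "-", value)
```

Both kinds remain valid values for the field. The distinction only helps a recipient decide whether to follow a link or display the text as written.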

So I take it this satisfies your use case?

stephenholleran commented 6 months ago

Now merged.

EDIT: Into the dev branch that is.