tdwg / camtrap-dp

Camera Trap Data Package (Camtrap DP)
https://camtrap-dp.tdwg.org
MIT License

Should platform metadata go in sources? #27

Closed niconoe closed 2 years ago

niconoe commented 3 years ago

In GitLab by @peterdesmet on Aug 21, 2020, 3:37

How should one use internal properties such as _id or _platform_name, etc.? Just include them as such (column name or property in package.json)?

niconoe commented 3 years ago

In GitLab by @kbubnicki on Aug 28, 2020, 15:52

These properties should be automatically generated by each platform. But they are not obligatory so if somebody wants to compile her own camtrap data package without using any platform it is still possible.

niconoe commented 3 years ago

In GitLab by @peterdesmet on Aug 28, 2020, 15:54

Is there any reason for the _underscore?

niconoe commented 3 years ago

In GitLab by @kbubnicki on Aug 28, 2020, 17:14

By programming convention, this is often a way to name so-called private attributes. It is also recommended by the Frictionless guys:

https://specs.frictionlessdata.io/patterns/#private-properties

niconoe commented 3 years ago

In GitLab by @peterdesmet on Aug 29, 2020, 13:59

Great, for the _id terms in the csv files it makes sense to me (although I don't have a use case for it). For "platform" properties I am wondering 1) why not make that information public and 2) can we load this in sources: https://specs.frictionlessdata.io/data-package/#sources?

I wonder if it is allowed to extend sources with more terms, such as version.

niconoe commented 3 years ago

In GitLab by @peterdesmet on Sep 2, 2020, 15:18

changed title from {-Guidance on use of internal properties-} to {+Should platform metadata go in sources?+}

niconoe commented 3 years ago

In GitLab by @peterdesmet on Sep 2, 2020, 15:18

Updated title to reflect new scope of issue.

niconoe commented 3 years ago

In GitLab by @kbubnicki on Oct 13, 2020, 21:47

I would say no (answering the title). In sources you can have multiple items, and one can be e.g. an organization name, etc. Then, to get all of the (source) platform's metadata, you would have to "guess" (by iterating over the items) which item contains the relevant information. Not optimal. These metadata can partly be duplicated, e.g. you can have the platform URL in both sources -> path and _platform_url, as having this information public can be useful for the data package end user. On the other hand, e.g. _platform_version is probably mainly useful for data package catalogues/repositories and not very interesting for end users.

That was my idea -> to have private properties with source platform basic metadata to support packages catalogues/repositories.
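To illustrate the difference between the two lookups, here is a minimal Python sketch. The descriptor contents, the example URLs, and the helper names `platform_from_private` and `platform_from_sources` are all made up for illustration; only the property names `sources`, `path`, and `_platform_*` come from the discussion above.

```python
# Hypothetical descriptor: property names follow the thread, values are invented.
package = {
    "name": "example-camtrap-dp",
    "sources": [
        {"title": "Example Organization", "path": "https://example.org"},
        {"title": "Example Platform", "path": "https://platform.example.org"},
    ],
    "_platform_name": "Example Platform",
    "_platform_url": "https://platform.example.org",
    "_platform_version": "1.0",
}


def platform_from_private(descriptor):
    """Collect platform metadata directly from the private _platform_* properties."""
    prefix = "_platform_"
    return {
        key[len(prefix):]: value
        for key, value in descriptor.items()
        if key.startswith(prefix)
    }


def platform_from_sources(descriptor, known_platform_urls):
    """Guess the platform by iterating over sources.

    This only works if the caller already knows which URLs belong to platforms,
    which is the "guessing" problem described above.
    """
    for source in descriptor.get("sources", []):
        if source.get("path") in known_platform_urls:
            return source
    return None


private_meta = platform_from_private(package)
guessed = platform_from_sources(package, {"https://platform.example.org"})
```

With private properties, the lookup needs no outside knowledge; with sources alone, the reader has to bring a list of known platform URLs or some other heuristic.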

niconoe commented 3 years ago

In GitLab by @kbubnicki on Nov 27, 2020, 12:02

Answering the question from the title: platform metadata CAN go to sources too but there (potentially) you can also find other sources not related to a platform that PRODUCED a particular data package. To distinguish between these two types of "sources" I would keep all _platform* (private) properties as they are now; so I would say nothing to do with this anymore?

niconoe commented 3 years ago

In GitLab by @peterdesmet on Nov 27, 2020, 12:10

And you would rather keep these as internal properties for now?

niconoe commented 3 years ago

In GitLab by @kbubnicki on Nov 27, 2020, 12:20

Yes, but with the suggestion for sources to include a platform url as well i.e. together with other data sources. As said above these internal properties are related to a platform that produced a data package.

peterdesmet commented 2 years ago

I would like to open this up for discussion. I'd rather reuse sources than have the custom property platform. We could add the restriction that a Camtrap DP can only have one source + add custom properties.
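A rough Python sketch of what that restriction could look like for a consumer of the package. This assumes a custom `version` property on the source (not part of the Frictionless `sources` definition) and a hypothetical helper `get_platform_source`; none of this is an agreed Camtrap DP rule.

```python
# Hypothetical descriptor: "version" is a custom property on the source,
# not part of the Frictionless sources definition.
package = {
    "name": "example-camtrap-dp",
    "sources": [
        {
            "title": "Example Platform",
            "path": "https://platform.example.org",
            "version": "1.0",
        }
    ],
}


def get_platform_source(descriptor):
    """Return the single source, enforcing the proposed one-source restriction."""
    sources = descriptor.get("sources", [])
    if len(sources) != 1:
        raise ValueError(f"expected exactly one source, found {len(sources)}")
    return sources[0]


platform = get_platform_source(package)
```

With exactly one source allowed, there is no need to guess which item describes the platform, at the cost of not being able to list other data sources.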

PietrH commented 2 years ago

Since rightsHolder is not a valid role of contributor, I would be tempted to map any license holders under sources, in which case I might need more than one.

I'm driven to this by the documentation:

use of the “author” property does not imply that that person was the original creator of the data in the data package - merely that they created and/or maintain the data package. It is common for data packages to “package” up data from elsewhere. The original origin of the data can be indicated with the sources property

Ideally I'd like to place license holders under contributor, perhaps with role contributor, but that does seem a bit vague.

Currently, I don't see a clear solution if you are packaging from more than one rightsHolder (e.g. both institutional and privately owned machine observations during a bioblitz).

Maybe the solution is to split these sets up into separate data packages?

peterdesmet commented 2 years ago

@PietrH does your comment relate to issue #202?

PietrH commented 2 years ago

It does. I'm looking for a place to encode the rightsHolder, either in sources or in contributors.

peterdesmet commented 2 years ago

Currently it's a separate term rightsHolder at package level: https://tdwg.github.io/camtrap-dp/metadata/#rightsHolder

peterdesmet commented 2 years ago

Note also the camtraptor R package where some functionality around Camtrap DP is already available, including the option to convert to Darwin Core and EML: https://inbo.github.io/camtraptor/reference/index.html#publish-data

ben-norton commented 2 years ago

Also, a rightsHolder may be an organization or government entity. I suggest not mixing the term with those associated with a person.

PietrH commented 2 years ago

That's a great point, Ben. With that in mind, rightsHolders might not belong under contributor at all.

peterdesmet commented 2 years ago

We're all commenting on the wrong issue 😄 but never mind.

What I want: list the rights holder (organization) as a contributor with the role rights holder. This is similar to how authors are listed in R packages (example)

@ben-norton: "I suggest not mixing the term with those associated with a person."

Note that contributors allows listing both organizations and persons (i.e. "name/title of the contributor (name for person, name/title of organization)"), but currently doesn't allow a role rights holder (requested in https://github.com/frictionlessdata/specs/issues/805).

Current situation: provide rights holder in custom Camtrap DP term rightsHolder.
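As a sketch of the two shapes side by side: the Python snippet below reads the rights holder either from the package-level `rightsHolder` term (current situation) or from a contributor with a rights holder role (what I want). The role value `"rightsHolder"`, the example names, and the helper `find_rights_holders` are assumptions for illustration; that role is not currently accepted by the Frictionless spec.

```python
def find_rights_holders(descriptor, role="rightsHolder"):
    """Collect rights holders from either representation.

    The role value is hypothetical: Frictionless does not define it (yet).
    """
    holders = []
    # Current situation: custom package-level term.
    if "rightsHolder" in descriptor:
        holders.append(descriptor["rightsHolder"])
    # Desired situation: contributor (person or organization) with a role.
    for contributor in descriptor.get("contributors", []):
        if contributor.get("role") == role:
            holders.append(contributor["title"])
    return holders


# Invented example descriptors.
current = {"rightsHolder": "Example Institute"}
wanted = {
    "contributors": [
        {"title": "Example Institute", "role": "rightsHolder"},
        {"title": "Jane Doe", "role": "author"},
    ]
}
```

The contributor form would also allow more than one rights holder, which the single package-level term does not.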

peterdesmet commented 2 years ago

Discussed with @kbubnicki: