RDA DMP Common Standard for machine-actionable Data Management Plans
About this document
This is a metadata application profile that provides basic interoperability between systems producing or consuming machine-actionable data management plans (maDMPs). Further fields can be added in specific deployments, but they do not guarantee interoperability. DMP tools can use any other fields in their internal data models.
This application profile is intended to cover a wide range of use cases and does not set any business (e.g. funder specific) requirements. It represents information over the whole DMP lifecycle.
For more information see examples, FAQ and useful links to consultations, documents, tools, prototypes, etc. developed by the working group.
DMP
Provides high level information about the DMP, e.g. its title, modification date, etc. It is the root of this application profile. The majority of its fields are mandatory.
Project
Describes the project associated with the DMP, if applicable. It can be used to describe any type of project: that is, not only funded projects, but also internal projects, PhD theses, etc.
Funding
For specifying details on funded projects, e.g. NSF or EC funded projects.
Contact
Specifies the party which can provide any information on the DMP. This is not necessarily the DMP creator, and can be a person or an organisation.
Contributor
For listing all parties involved in the process of the data management described by this DMP, and those parties involved in the creation and management of the DMP itself.
Cost
Provides a list of costs related to data management.
Dataset
This follows the definition of Dataset in the W3C DCAT specification. Dataset can be understood as a logical entity depicting data, e.g. raw data. It provides high level information about the data. The granularity of a dataset depends on the specific setting. In edge cases it can be a single file, but it can also be a collection of files in different formats. See FAQ for more details.
Distribution
The term "distribution" used here is as defined by the very widely used W3C DCAT metadata application profile. It is used to mean a particular instance of a dataset that has been, or is intended to be, made available in some fashion. It is important to separate the logical notion of a "dataset" from its distributions, of which there may be several, especially in order to attach more specific metadata properties such as "size" and "license". The lifecycle of the DMP has no particular bearing on this, and a "distribution" may be defined even if the DMP is never actually realised.
License
Used to indicate the license under which data (each specific Distribution) will be made available. It also allows for modelling embargoes. See FAQ for more details.
Host
Provides information on the system where data is stored. It can be used to provide details on a repository where data is deposited, e.g. a Core Trust Seal certified repository located in Europe that uses DOIs. It can also provide details on systems where data is stored and processed during research, e.g. a high performance computer that uses fast storage with two daily backups.
Security and Privacy
Used to indicate any specific requirements related to security and privacy of a specific dataset, e.g. to indicate that data is not anonymized.
Technical Resource
For specifying equipment needed/used to create or process the data, e.g. a microscope, etc.
Metadata
Provides a pointer to a metadata standard used to describe the data. It does not contain any actual metadata relating to the dataset.
Structure
Properties in 'contact'
Name | Description | Data Type | Cardinality | Example Value |
---|
contact_id | Identifier for a contact person | Nested Data Structure | 1 | |
mbox | E-mail address | String | 1 | cc@example.com |
name | Name of the contact person | String | 1 | Charlie Chaplin |
Properties in 'contact_id'
Name | Description | Data Type | Cardinality | Example Value |
---|
identifier | | String | 1 | |
type | Identifier type Allowed Values: | Term from Controlled Vocabulary | 1 | orcid |
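As a minimal sketch of how the 'contact' and 'contact_id' tables could map onto a record, the Python below builds one contact and checks that every field with cardinality 1 is present. The field names come from the tables; the helper `missing_contact_fields` and the placeholder ORCID value (borrowed from the 'contributor_id' example) are assumptions made for illustration, not part of the standard.

```python
import json

# Field names follow the 'contact' and 'contact_id' tables.
# The identifier is a placeholder ORCID, not a real person.
contact = {
    "name": "Charlie Chaplin",
    "mbox": "cc@example.com",
    "contact_id": {
        "identifier": "http://orcid.org/0000-0000-0000-0000",
        "type": "orcid",
    },
}


def missing_contact_fields(record):
    """Return the names of cardinality-1 fields absent from a contact record."""
    missing = [key for key in ("contact_id", "mbox", "name") if key not in record]
    contact_id = record.get("contact_id", {})
    missing += [
        f"contact_id.{key}" for key in ("identifier", "type") if key not in contact_id
    ]
    return missing


print(json.dumps(contact, indent=2))
print(missing_contact_fields(contact))
```

An empty list from the helper means all mandatory contact fields are filled in; it does not check that the values themselves are valid.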
Properties in 'contributor'
Name | Description | Data Type | Cardinality | Example Value |
---|
contributor_id | | Nested Data Structure | 1 | |
mbox | Mail address | String | 0..1 | john@smith.com |
name | Name | String | 1 | John Smith |
role | Type of contributor | String | 1..n | Data Steward |
Properties in 'contributor_id'
Name | Description | Data Type | Cardinality | Example Value |
---|
identifier | Identifier for a contributor | String | 1 | http://orcid.org/0000-0000-0000-0000 |
type | Identifier type Allowed Values: | Term from Controlled Vocabulary | 1 | orcid |
Properties in 'cost'
Name | Description | Data Type | Cardinality | Example Value |
---|
currency_code | Allowed values defined by ISO 4217. | Term from Controlled Vocabulary | 0..1 | EUR |
description | Description | String | 0..1 | Costs for maintaining.... |
title | Title | String | 1 | Storage and backup |
value | Value | Number | 0..1 | 1000 |
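Because 'value' and 'currency_code' are optional and a DMP may carry several 'cost' entries, a consumer will often want one total per currency. The following is a rough sketch under that assumption; the function name `total_by_currency`, the "unspecified" bucket and the second and third cost items are hypothetical.

```python
from collections import defaultdict

costs = [
    {"title": "Storage and backup", "value": 1000, "currency_code": "EUR"},
    {"title": "Data curation", "value": 250.5, "currency_code": "EUR"},  # hypothetical item
    {"title": "Staff training"},  # 'value' and 'currency_code' are optional (0..1)
]


def total_by_currency(cost_items):
    """Sum cost values per currency, skipping entries that carry no value."""
    totals = defaultdict(float)
    for item in cost_items:
        if "value" not in item:
            continue
        # Entries with a value but no currency go into a separate bucket.
        totals[item.get("currency_code", "unspecified")] += item["value"]
    return dict(totals)


print(total_by_currency(costs))
```

Amounts in different currencies are kept apart rather than converted, since the profile does not define exchange rates.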
Properties in 'dataset'
Name | Description | Data Type | Cardinality | Example Value |
---|
data_quality_assurance | Data Quality Assurance | String | 0..n | We use file naming convention... |
dataset_id | Dataset ID | Nested Data Structure | 1 | |
description | Description is a property in both Dataset and Distribution, in compliance with W3C DCAT. In some cases these might be identical, but in most cases the Dataset represents a more abstract concept, while the distribution can point to a specific file. | String | 0..1 | Field observation |
distribution | To provide technical information on a specific instance of data. | Nested Data Structure | 0..n | |
issued | Issued. Encoded using the relevant ISO 8601 Date and Time compliant string | Date | 0..1 | 2019-06-30 |
keyword | Keyword | String | 0..n | keyword 1, keyword 2 |
language | Language of the dataset expressed using ISO 639-3 | Term from Controlled Vocabulary | 0..1 | eng |
metadata | To describe metadata standards used. | Nested Data Structure | 0..n | |
personal_data | Allowed Values: | Term from Controlled Vocabulary | 1 | unknown |
preservation_statement | Preservation Statement | String | 0..1 | Must be preserved to enable... |
security_and_privacy | To list all issues and requirements related to security and privacy | Nested Data Structure | 0..n | |
sensitive_data | Allowed Values: | Term from Controlled Vocabulary | 1 | unknown |
technical_resource | To list all technical resources needed to implement a DMP | Nested Data Structure | 0..n | |
title | Title is a property in both Dataset and Distribution, in compliance with W3C DCAT. In some cases these might be identical, but in most cases the Dataset represents a more abstract concept, while the distribution can point to a specific file. | String | 1 | Fast car images |
type | If appropriate, type according to the DataCite and/or COAR dictionary (https://schema.datacite.org/meta/kernel-4.1/doc/DataCite-MetadataKernel_v4.1.pdf, http://vocabularies.coar-repositories.org/pubby/resource_type.html). Otherwise use the common name for the type, e.g. raw data, software, survey, etc. | String | 0..1 | image |
Properties in 'dataset_id'
Name | Description | Data Type | Cardinality | Example Value |
---|
identifier | Identifier for a dataset | String | 1 | https://hdl.handle.net/11353/10.923628 |
type | Identifier type Allowed Values: | Term from Controlled Vocabulary | 1 | handle |
Properties in 'distribution'
Name | Description | Data Type | Cardinality | Example Value |
---|
access_url | A URL of the resource that gives access to a distribution of the dataset. e.g. landing page. | URI | 0..1 | http://some.repo... |
available_until | Indicates how long this distribution will be/ should be available. Encoded using the relevant ISO 8601 Date and Time compliant string | Date | 0..1 | 2030-06-30 |
byte_size | Byte Size | Number | 0..1 | 690000 |
data_access | Indicates access mode for data. Allowed Values: | Term from Controlled Vocabulary | 1 | open |
description | Description is a property in both Dataset and Distribution, in compliance with W3C DCAT. In some cases these might be identical, but in most cases the Dataset represents a more abstract concept, while the distribution can point to a specific file. | String | 0..1 | Best quality data before resizing |
download_url | The URL of the downloadable file in a given format. E.g. CSV file or RDF file. | URI | 0..1 | http://some.repo.../download/... |
format | Format according to: https://www.iana.org/assignments/media-types/media-types.xhtml if appropriate, otherwise use the common name for this format | String | 0..n | image/tiff |
host | To provide information on quality of service provided by infrastructure (e.g. repository) where data is stored | Nested Data Structure | 0..1 | |
license | To list all licenses applied to a specific distribution of data. | Nested Data Structure | 0..n | |
title | Title is a property in both Dataset and Distribution, in compliance with W3C DCAT. In some cases these might be identical, but in most cases the Dataset represents a more abstract concept, while the distribution can point to a specific file. | String | 1 | Full resolution images |
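To illustrate the split between a logical dataset and its distributions, here is a small sketch of one dataset with two distributions. Field names and most values are taken from the 'dataset', 'dataset_id' and 'distribution' tables; the "Thumbnails" distribution and the two helper functions are assumptions added for the example.

```python
dataset = {
    "title": "Fast car images",
    "type": "image",
    "personal_data": "unknown",
    "sensitive_data": "unknown",
    "dataset_id": {
        "identifier": "https://hdl.handle.net/11353/10.923628",
        "type": "handle",
    },
    "distribution": [
        {
            "title": "Full resolution images",
            "description": "Best quality data before resizing",
            "format": ["image/tiff"],
            "byte_size": 690000,
            "data_access": "open",
        },
        {
            "title": "Thumbnails",  # hypothetical second distribution
            "format": ["image/png"],
            "byte_size": 12000,
            "data_access": "open",
        },
    ],
}


def formats_in(ds):
    """Collect the distinct formats across all distributions of a dataset."""
    return sorted(
        {fmt for dist in ds.get("distribution", []) for fmt in dist.get("format", [])}
    )


def total_byte_size(ds):
    """Add up byte_size over all distributions, treating a missing size as 0."""
    return sum(dist.get("byte_size", 0) for dist in ds.get("distribution", []))


print(formats_in(dataset))
print(total_byte_size(dataset))
```

Properties such as size and format live on each distribution, while the dataset carries the descriptive information that all of them share.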
Properties in 'dmp'
Name | Description | Data Type | Cardinality | Example Value |
---|
contact | Contact person for a DMP | Nested Data Structure | 1 | |
contributor | To list people who play a role in data management related to this DMP, e.g. those responsible for performing actions described in this DMP. | Nested Data Structure | 0..n | |
cost | To list costs related to data management. Providing multiple instances of 'Cost' allows costs to be broken down in detail. Providing a single 'Cost' instance allows one aggregated sum to be given. | Nested Data Structure | 0..n | |
created | Date and time of the first version of a DMP. Must not be changed in subsequent DMPs. Encoded using the relevant ISO 8601 Date and Time compliant string | DateTime | 1 | 2019-03-13T13:13:00 |
dataset | To describe data on a non-technical level. | Nested Data Structure | 1..n | |
description | To provide any free-form text information on a DMP | String | 0..1 | This DMP is for our new project |
dmp_id | Identifier for the DMP itself | Nested Data Structure | 1 | |
ethical_issues_description | To describe ethical issues directly in a DMP | String | 0..1 | There are ethical issues, because... |
ethical_issues_exist | To indicate whether there are ethical issues related to data that this DMP describes. Allowed Values: | Term from Controlled Vocabulary | 1 | yes |
ethical_issues_report | To indicate where a protocol from a meeting with an ethical committee can be found | URI | 0..1 | http://report.location |
language | Language of the DMP expressed using ISO 639-3 | Term from Controlled Vocabulary | 1 | eng |
modified | Must be set each time DMP is modified. Indicates DMP version. Encoded using the relevant ISO 8601 Date and Time compliant string | DateTime | 1 | 2020-03-14T10:53:49 |
project | Project related to a DMP | Nested Data Structure | 0..n | |
title | Title of a DMP | String | 1 | DMP for our new project |
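The cardinalities in the 'dmp' table can be turned into a simple completeness check. The sketch below lists the fields with cardinality 1 or 1..n and reports any that are missing. The function `check_dmp` is hypothetical, and the rule that 'modified' should not precede 'created' is an assumption inferred from the field descriptions rather than something the table states.

```python
from datetime import datetime

# Fields of 'dmp' with cardinality 1 or 1..n, as listed in the table.
REQUIRED_DMP_FIELDS = (
    "contact", "created", "dataset", "dmp_id",
    "ethical_issues_exist", "language", "modified", "title",
)


def check_dmp(dmp):
    """Return a list of human-readable problems found in a DMP record."""
    problems = [f"missing field: {name}" for name in REQUIRED_DMP_FIELDS if name not in dmp]
    if "dataset" in dmp and len(dmp["dataset"]) == 0:
        problems.append("dataset must contain at least one entry")
    if "created" in dmp and "modified" in dmp:
        created = datetime.fromisoformat(dmp["created"])
        modified = datetime.fromisoformat(dmp["modified"])
        if modified < created:  # assumed rule, not stated in the table
            problems.append("modified is earlier than created")
    return problems


dmp = {
    "title": "DMP for our new project",
    "language": "eng",
    "created": "2019-03-13T13:13:00",
    "modified": "2020-03-14T10:53:49",
    "ethical_issues_exist": "yes",
    "dmp_id": {"identifier": "https://doi.org/10.1371/journal.pcbi.1006750", "type": "doi"},
    "contact": {"name": "Charlie Chaplin", "mbox": "cc@example.com"},
    "dataset": [{"title": "Fast car images"}],
}

print(check_dmp(dmp))
```

This only checks the top level of the record; nested structures such as 'contact' and 'dataset' would need their own checks against their respective tables.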
Properties in 'dmp_id'
Name | Description | Data Type | Cardinality | Example Value |
---|
identifier | Identifier for a DMP | String | 1 | https://doi.org/10.1371/journal.pcbi.1006750 |
type | Identifier type Allowed Values: | Term from Controlled Vocabulary | 1 | doi |
Properties in 'funder_id'
Name | Description | Data Type | Cardinality | Example Value |
---|
identifier | Funder ID, recommended to use CrossRef Funder Registry. See: https://www.crossref.org/services/funder-registry/ | String | 1 | 501100002428 |
type | Identifier type Allowed Values: | Term from Controlled Vocabulary | 1 | fundref |
Properties in 'funding'
Name | Description | Data Type | Cardinality | Example Value |
---|
funder_id | Funder ID of the associated project | Nested Data Structure | 1 | |
funding_status | To express different phases of project lifecycle. Allowed Values: planned, applied, granted, rejected | Term from Controlled Vocabulary | 0..1 | granted |
grant_id | Grant ID of the associated project | Nested Data Structure | 0..1 | 1234567 |
Properties in 'grant_id'
Name | Description | Data Type | Cardinality | Example Value |
---|
identifier | Grant ID | String | 1 | 776242 |
type | Identifier type Allowed Values: | Term from Controlled Vocabulary | 1 | other |
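A funding entry combines a mandatory 'funder_id' with an optional status and grant identifier. As an illustrative sketch, the code below builds one entry from the example values in the 'funder_id', 'funding' and 'grant_id' tables and checks 'funding_status' against the four allowed values. The function `funding_problems` is an assumption for this example.

```python
# Allowed values for funding_status, as listed in the 'funding' table.
FUNDING_STATUS = {"planned", "applied", "granted", "rejected"}

funding = {
    "funder_id": {"identifier": "501100002428", "type": "fundref"},
    "funding_status": "granted",
    "grant_id": {"identifier": "776242", "type": "other"},
}


def funding_problems(entry):
    """Report a missing funder_id or an unknown funding_status."""
    problems = []
    if "funder_id" not in entry:
        problems.append("funder_id is required")
    status = entry.get("funding_status")
    # funding_status is optional, so only a present value is checked.
    if status is not None and status not in FUNDING_STATUS:
        problems.append(f"unknown funding_status: {status}")
    return problems


print(funding_problems(funding))
```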
Properties in 'host'
Name | Description | Data Type | Cardinality | Example Value |
---|
availability | Availability | String | 0..1 | 99,5 |
backup_frequency | Backup Frequency | String | 0..1 | weekly |
backup_type | Backup Type | String | 0..1 | tapes |
certified_with | Repository certified to a recognised standard. Allowed Values: din31644, dini-zertifikat, dsa, iso16363, iso16919, trac, wds, coretrustseal | Term from Controlled Vocabulary | 0..1 | coretrustseal |
description | Description | String | 0..1 | Repository hosted by... |
geo_location | Physical location of the data expressed using ISO 3166-1 country code. | Term from Controlled Vocabulary | 0..1 | AT |
pid_system | PID System. Allowed Values: ark, arxiv, bibcode, doi, ean13, eissn, handle, igsn, isbn, issn, istc, lissn, lsid, pmid, purl, upc, url, urn, other | Term from Controlled Vocabulary | 0..n | doi |
storage_type | The type of storage required | String | 0..1 | |
support_versioning | Allowed Values: | Term from Controlled Vocabulary | 0..1 | yes |
title | Title | String | 1 | Super Repository |
url | The URL of the system hosting a distribution of a dataset | URI | 1 | https://zenodo.org |
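Two of the 'host' fields draw on closed vocabularies, so a consumer can reject unknown terms early. The sketch below copies the allowed values for 'certified_with' and 'pid_system' from the table and checks a host record against them. The function `host_vocabulary_errors` is hypothetical.

```python
# Controlled vocabularies as listed in the 'host' table.
CERTIFIED_WITH = {
    "din31644", "dini-zertifikat", "dsa", "iso16363",
    "iso16919", "trac", "wds", "coretrustseal",
}
PID_SYSTEM = {
    "ark", "arxiv", "bibcode", "doi", "ean13", "eissn", "handle", "igsn",
    "isbn", "issn", "istc", "lissn", "lsid", "pmid", "purl", "upc", "url",
    "urn", "other",
}

host = {
    "title": "Super Repository",
    "url": "https://zenodo.org",
    "certified_with": "coretrustseal",
    "geo_location": "AT",
    "pid_system": ["doi"],  # cardinality 0..n, so modelled as a list
    "support_versioning": "yes",
}


def host_vocabulary_errors(record):
    """Return one message per term that is not in its controlled vocabulary."""
    errors = []
    certificate = record.get("certified_with")
    if certificate is not None and certificate not in CERTIFIED_WITH:
        errors.append(f"certified_with: {certificate}")
    for pid in record.get("pid_system", []):
        if pid not in PID_SYSTEM:
            errors.append(f"pid_system: {pid}")
    return errors


print(host_vocabulary_errors(host))
```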
Properties in 'license'
Name | Description | Data Type | Cardinality | Example Value |
---|
license_ref | Link to license document. | URI | 1 | https://creativecommons.org/licenses/by/4.0/ |
start_date | If date is set in the future, it indicates embargo period. Encoded using the relevant ISO 8601 Date and Time compliant string | Date | 1 | 2019-06-30 |
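Since a 'start_date' in the future signals an embargo, a consumer can derive the embargo state by comparing that date with the current day. The following is a minimal sketch of that comparison. The function `is_embargoed` and the choice to pass `today` explicitly (to keep the result reproducible) are assumptions for illustration.

```python
from datetime import date

license_entry = {
    "license_ref": "https://creativecommons.org/licenses/by/4.0/",
    "start_date": "2019-06-30",
}


def is_embargoed(entry, today):
    """True while the license start date still lies in the future."""
    return date.fromisoformat(entry["start_date"]) > today


# Before the start date the distribution is under embargo; on or after it, it is not.
print(is_embargoed(license_entry, date(2019, 1, 1)))
print(is_embargoed(license_entry, date(2019, 6, 30)))
```

In a real tool `today` would typically be `date.today()`; whether the start date itself counts as embargoed is a design choice, and this sketch treats it as not embargoed.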
Properties in 'metadata'
Name | Description | Data Type | Cardinality | Example Value |
---|
description | Description | String | 0..1 | provides taxonomy for... |
language | Language of the metadata expressed using ISO 639-3 | Term from Controlled Vocabulary | 1 | eng |
metadata_standard_id | Metadata Standard ID | Nested Data Structure | 1 | |
Properties in 'metadata_standard_id'
Name | Description | Data Type | Cardinality | Example Value |
---|
identifier | Identifier for the metadata standard used. | String | 1 | http://www.dublincore.org/specifications/dublin-core/dcmi-terms/ |
type | Identifier type Allowed Values: | Term from Controlled Vocabulary | 1 | url |
Properties in 'project'
Name | Description | Data Type | Cardinality | Example Value |
---|
description | Project description | String | 0..1 | Project develops novel... |
end | Project end date. Encoded using the relevant ISO 8601 Date and Time compliant string | Date | 0..1 | 2020-03-31 |
funding | Funding related to a project | Nested Data Structure | 0..n | |
start | Project start date. Encoded using the relevant ISO 8601 Date and Time compliant string | Date | 0..1 | 2019-04-01 |
title | Project title | String | 1 | Our New Project |
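Project dates are plain ISO 8601 date strings, so they can be parsed with the standard library. The sketch below parses 'start' and 'end' from the example values and computes the project length in days. The helper `project_duration_days` and the expectation that 'end' should not precede 'start' are assumptions, since the table does not state such a rule.

```python
from datetime import date

project = {
    "title": "Our New Project",
    "start": "2019-04-01",
    "end": "2020-03-31",
}


def project_duration_days(record):
    """Return the number of days between start and end, or None if either is absent."""
    if "start" not in record or "end" not in record:
        return None  # both dates are optional (0..1)
    start = date.fromisoformat(record["start"])
    end = date.fromisoformat(record["end"])
    if end < start:  # assumed sanity rule, not part of the table
        raise ValueError("project end precedes project start")
    return (end - start).days


print(project_duration_days(project))
```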
Properties in 'security_and_privacy'
Name | Description | Data Type | Cardinality | Example Value |
---|
description | Description | String | 0..1 | Server with data must be kept in a locked room |
title | Title | String | 1 | Physical access control |
Properties in 'technical_resource'
Name | Description | Data Type | Cardinality | Example Value |
---|
description | Description of the technical resource | String | 0..1 | Device needed to collect field data... |
name | Name of the technical resource | String | 1 | 123/45/43/AT |
Cite as
Tomasz Miksa, Paul Walk, Peter Neish. RDA DMP Common Standard for Machine-actionable Data Management Plans. http://doi.org/10.15497/rda00039