Open roll opened 2 months ago
@roll does DataCite even use Data Packages or Table Schema to begin with? I have skimmed their documentation and all of their examples are in XML. Also, their language support seems to describe the language of the resource, just like the current Table Schema pattern you linked to above, not the language of the metadata.
What I miss is a way to describe the metadata (resource title and description, column names and descriptions) in multiple languages, while the data itself remains in a single language.
Metadata is provided in multiple languages.
In animals.datapackage.en.yaml:

resources:
  - name: animals
    path: animals.csv
    title: Animals
    schema:
      fields:
        - name: id
          type: integer
        - name: animal
          title: Animal species name
          type: string
In animals.datapackage.ru.yaml:

resources:
  - name: animals
    path: animals.csv
    title: Животные
    schema:
      fields:
        - name: id
          type: integer
        - name: animal
          title: Название вида животного (на английском языке)
          type: string
The CSV file (the data itself) has only one version, in English:
id,animal
1,cat
2,dog
3,giraffe
4,bat
5,leopard
6,lion
7,tiger
8,elephant
9,panda
10,rabbit
11,chicken
12,cow
13,horse
14,sheep
This undocumented pattern already works. We already use it.
The problem is that the typing information (integer, string) and other language-independent metadata (e.g. null values, validation rules) has to be repeated in each data package metadata file. That's bad for maintenance: types and validation rules may evolve, and you have to keep them in sync by hand across every language version of the metadata file. It would be great if I could define that technical metadata only once, in one place.
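To make the sync problem concrete, here is a minimal sketch of a check that the language versions still agree on everything except the translated text. It assumes the descriptors have already been loaded into plain Python dicts and that only `title` and `description` are translatable. The names `TRANSLATABLE_KEYS`, `strip_translatable` and `in_sync` are hypothetical helpers for illustration, not part of any Data Package library or of the specs.

```python
# Assumption: only these keys carry translated text; everything else
# (names, paths, types, constraints, ...) is treated as technical metadata.
TRANSLATABLE_KEYS = {"title", "description"}


def strip_translatable(node):
    """Return a copy of a descriptor with translatable keys removed at every level."""
    if isinstance(node, dict):
        return {
            key: strip_translatable(value)
            for key, value in node.items()
            if key not in TRANSLATABLE_KEYS
        }
    if isinstance(node, list):
        return [strip_translatable(item) for item in node]
    return node


def in_sync(descriptors):
    """Return True when all descriptors share identical non-translatable metadata."""
    skeletons = [strip_translatable(d) for d in descriptors]
    return all(skeleton == skeletons[0] for skeleton in skeletons[1:])


# The two descriptors from the example above, written as plain dicts.
english = {
    "resources": [{
        "name": "animals",
        "path": "animals.csv",
        "title": "Animals",
        "schema": {"fields": [
            {"name": "id", "type": "integer"},
            {"name": "animal", "title": "Animal species name", "type": "string"},
        ]},
    }]
}
russian = {
    "resources": [{
        "name": "animals",
        "path": "animals.csv",
        "title": "Животные",
        "schema": {"fields": [
            {"name": "id", "type": "integer"},
            {"name": "animal",
             "title": "Название вида животного (на английском языке)",
             "type": "string"},
        ]},
    }]
}

print(in_sync([english, russian]))
```

A check like this only detects drift after the fact. It does not remove the duplication, which is why a single place for the technical metadata would still be preferable.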
@augusto-herrmann Thanks a lot for writing it down! Just trying to gather all the information now
Overview
As we already have a Languages recipe, and there is a de facto standard way to support languages in DataCite, we might go forward and finally bring it into the specs.
cc @augusto-herrmann