Discuss future of `dataset` support with stakeholders

sdruskat commented 3 years ago

Alejandra Gonzalez-Beltran has got in touch with me to discuss the experimental `dataset` support and how it relates to other (standard) formats and initiatives (W3C DCAT vocabulary, DataCite, RO-Crate).

We should initiate this discussion and perhaps bring Arfon into it as well.

jspaaks commented 11 months ago

Do we have / can we get an overview of which proportion of `CITATION.cff` files use `type: dataset`?

sdruskat commented 11 months ago

> Do we have / can we get an overview of which proportion of `CITATION.cff` files use `type: dataset`?

Yes, I can run an analysis over the corpus I'm harvesting. Will take a bit of time.

sdruskat commented 11 months ago

Didn't take that long:

In the data for 15977 different repositories, I found

- 327 instances where the `CITATION.cff` file included a line that starts with the string `type:` and has the string `dataset` in the same line, for any version of the file I have on record.
- 321 instances where the `CITATION.cff` file included a line that starts with the string `type: dataset`, for any version of the file I have on record.

Other files have no type information, or type information different than dataset, or are not UTF-8 decodable.

So by this simple metric, the ratio of repositories that have had a dataset-type CFF file in any version on record is around 0.02 (321 of 15977, i.e. roughly 2%).
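For anyone who wants to reproduce a count like this on their own corpus, here is a minimal sketch of the two line-based checks described above. It is not the script that produced the numbers in this comment: the function names and the corpus layout (a dict mapping a repository name to the raw bytes of each recorded file version) are assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple


def match_dataset_type(raw: bytes) -> Tuple[bool, bool]:
    """Return (loose, strict) matches for one version of a CITATION.cff file.

    loose:  some line starts with "type:" and also contains "dataset"
    strict: some line starts with "type: dataset"
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Files that are not UTF-8 decodable count as non-matching.
        return (False, False)
    lines = text.splitlines()
    loose = any(l.startswith("type:") and "dataset" in l for l in lines)
    strict = any(l.startswith("type: dataset") for l in lines)
    return (loose, strict)


def count_repositories(corpus: Dict[str, Iterable[bytes]]) -> Tuple[int, int, int]:
    """Count repositories where any recorded version matches.

    `corpus` is a hypothetical layout: repository name -> file versions.
    Returns (total repositories, loose matches, strict matches).
    """
    loose_count = 0
    strict_count = 0
    for versions in corpus.values():
        results = [match_dataset_type(v) for v in versions]
        loose_count += any(r[0] for r in results)
        strict_count += any(r[1] for r in results)
    return (len(corpus), loose_count, strict_count)


def ratio(matches: int, total: int) -> float:
    """Share of repositories with a match, e.g. ratio(321, 15977)."""
    return matches / total
```

Because both checks use `startswith` on the unstripped line, an indented `type: dataset` (for example inside a nested entry) is not counted by this sketch. Parsing the YAML would give a more precise answer, at the cost of having to handle files that do not parse.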

citation-file-format / governance

Discuss future of `dataset` support with stakeholders #4