CatalogueOfLife / checklistbank

UI for checklistbank.org
https://www.checklistbank.org/

citation of checklist data sets #1094

Open myrmoteras opened 2 years ago

myrmoteras commented 2 years ago

Might it be possible to cite checklists in the following way?

Instead of

  1. Citation: The citation of the dataset sounds confusing to me. It says for example: Citation: Balke, M., Panjaitan, R., Surbakti, S., Shaverdo, H., Hendrich, L., Van Dam, M. H., & Lam, A. (2022). NextRAD phylogenomics, sanger sequencing and morphological data to establish three new species of New Guinea stream beetles (Version 1661532893770). Plazi.org taxonomic treatments database. https://doi.org/10.15468/zyh228

this way:

Citation: Balke, M., Panjaitan, R., Surbakti, S., Shaverdo, H., Hendrich, L., Van Dam, M. H., & Lam, A. (2022). Dataset from: NextRAD phylogenomics, sanger sequencing and morphological data to establish three new species of New Guinea stream beetles. Alpine Entomology 6: 51-64, DOI: http://dx.doi.org/10.3897/alpento.6.86665. (Version 1661532893770). Dataset mediated by Plazi.org taxonomic treatments database. Dataset DOI: https://doi.org/10.15468/zyh228

mdoering commented 2 years ago

The citation is generated from the supplied metadata using CSL styles. We have picked APA as the default style, but you can generate any other citation style you want from our metadata.

What we cannot do is alter the citation in a custom way, because we don't treat it as a string. The "mediated by Plazi" phrase, for example, is not part of any normal citation style, so I can't see a way to include it. Instead you are listed as the publisher:

CSL JSON: https://api.checklistbank.org/dataset/129880.csljs

Derived from the full metadata https://api.checklistbank.org/dataset/129880.json

Other CSL styles you could pick (we did consider allowing users to choose their preferred style): https://www.zotero.org/styles?q=apa
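To illustrate why the citation is not treated as a string, here is a minimal sketch that renders an APA-like citation from a CSL-JSON style record. The `item` dictionary is an assumption: its field names follow the CSL-JSON schema and its values are copied from the citation quoted at the top of this issue, but the real record behind `/dataset/129880.csljs` may differ. The `apa_like` function is a hypothetical hand-written formatter, not a real CSL processor and not ChecklistBank's implementation.

```python
# Assumed CSL-JSON record; values taken from the citation quoted in this thread.
item = {
    "type": "dataset",
    "author": [
        {"family": "Balke", "given": "M."},
        {"family": "Panjaitan", "given": "R."},
        {"family": "Surbakti", "given": "S."},
        {"family": "Shaverdo", "given": "H."},
        {"family": "Hendrich", "given": "L."},
        {"family": "Van Dam", "given": "M. H."},
        {"family": "Lam", "given": "A."},
    ],
    "issued": {"date-parts": [[2022]]},
    "title": (
        "NextRAD phylogenomics, sanger sequencing and morphological data "
        "to establish three new species of New Guinea stream beetles"
    ),
    "version": "1661532893770",
    "publisher": "Plazi.org taxonomic treatments database",
    "DOI": "10.15468/zyh228",
}


def apa_like(record):
    """Render a rough APA-style string from a CSL-JSON-like dict (illustrative only)."""
    names = [f"{a['family']}, {a['given']}" for a in record["author"]]
    if len(names) > 1:
        authors = ", ".join(names[:-1]) + ", & " + names[-1]
    else:
        authors = names[0]
    year = record["issued"]["date-parts"][0][0]
    return (
        f"{authors} ({year}). {record['title']} "
        f"(Version {record['version']}). {record['publisher']}. "
        f"https://doi.org/{record['DOI']}"
    )


print(apa_like(item))
```

Because every element lives in its own field, a different style only needs a different rendering function. A phrase such as "mediated by" has no field to live in, which is the limitation described above.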

myrmoteras commented 2 years ago

Lyubo might help here: what is needed is a form of the citation that, when picked up by Crossref, allows them to extract the embedded DOIs. For that reason both DOIs are needed. A publisher would like to see his publication cited, and at the same time the dataset needs to be cited. The rest seems to be more "cosmetic" and can be adapted to fit a particular user.
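As a rough illustration of what "extracting the embedded DOIs" from a citation string could look like, the sketch below pulls DOIs out of the proposed citation with a regular expression. The pattern and the `extract_dois` helper are assumptions made for this example; nothing here describes how Crossref actually processes references.

```python
import re

# Simple DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix
# up to the next whitespace, quote, or angle bracket. This is a heuristic.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+')


def extract_dois(text):
    """Return the distinct DOIs found in text, in order of appearance."""
    found = []
    for match in DOI_PATTERN.findall(text):
        doi = match.rstrip(".,;)")  # drop sentence punctuation glued to the DOI
        if doi not in found:
            found.append(doi)
    return found


# Tail of the citation proposed in this issue.
citation = (
    "Alpine Entomology 6: 51-64, DOI: http://dx.doi.org/10.3897/alpento.6.86665. "
    "(Version 1661532893770). Dataset mediated by Plazi.org taxonomic treatments "
    "database. Dataset DOI: https://doi.org/10.15468/zyh228"
)

print(extract_dois(citation))
```

With the proposed citation both the article DOI and the dataset DOI are recovered. With the current APA citation only the dataset DOI would be found.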

mdoering commented 2 years ago

Not sure I understand "picked up by Crossref". When we publish DOIs we include structured DataCite metadata, so no parsing is needed and no citation string is ever passed around. That citation string is really only for human eyes.

myrmoteras commented 2 years ago

Sorry for my confusing writing. The point Lyubo makes is that the citation should include both the dataset DOI and the DOI of the source article, and of course both should also be in the JSON.

If I understand the citation properly, this should be included in a publication that uses this checklist dataset. Maybe we should try to write down once how this works.

Lyubo's concern is that he, as a publisher, can see where his works are being cited.

mdoering commented 2 years ago

I can't see how you can cite both DOIs in a classic citation. Where would you put that in BibTeX/RIS/EndNote/CSL-JSON? Isn't that the regular bibliography section at the end where you cite sources in a publication? We do track the source article(s) as a proper citation with DOI as part of the metadata (see source here) and would include that in the DataCite metadata as a relation for any DOI that CLB issues. That would be the case if such a dataset gets included in the COL checklist. For example, https://doi.org/10.48580/dfp3-388 is the DOI for SF Orthoptera (SFO) as part of the COL checklist. That's selected and edited content, which is different from the original source. Here is the DataCite metadata for the Orthoptera DOI: https://api.datacite.org/dois/10.48580/dfp3-388

The partOf relation lists the COL Checklist releases that this version is included in. If SFO had a dataset DOI, it would be included with the IsDerivedFrom relation. I suppose we could also cite the article DOI here as the derivedFrom source? That's something we can surely do, so the article ends up in the citation graph.
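A minimal sketch of how such relations might look in DataCite-style metadata follows. The key names `relatedIdentifiers`, `relatedIdentifier`, `relatedIdentifierType` and `relationType`, and the relation types `IsPartOf` and `IsDerivedFrom`, come from the DataCite metadata schema. Everything else is an assumption for illustration: the two `10.48580/HYPOTHETICAL-...` DOIs are placeholders, the helper functions are invented for this example, and the record is not what CLB actually registers.

```python
def related_identifier(doi, relation_type):
    """Build one DataCite-style related identifier entry for a DOI."""
    return {
        "relatedIdentifier": doi,
        "relatedIdentifierType": "DOI",
        "relationType": relation_type,
    }


# Hypothetical metadata for a CLB-issued dataset DOI (placeholder identifiers).
metadata = {
    "doi": "10.48580/HYPOTHETICAL-DATASET",
    "relatedIdentifiers": [
        # Placeholder for a COL Checklist release that includes this dataset.
        related_identifier("10.48580/HYPOTHETICAL-RELEASE", "IsPartOf"),
        # Source article DOI from this thread, so it enters the citation graph.
        related_identifier("10.3897/alpento.6.86665", "IsDerivedFrom"),
    ],
}


def dois_with_relation(record, relation_type):
    """List the related DOIs in record that carry the given relation type."""
    return [
        rel["relatedIdentifier"]
        for rel in record["relatedIdentifiers"]
        if rel["relationType"] == relation_type
    ]


print(dois_with_relation(metadata, "IsDerivedFrom"))
```

With the article DOI recorded as a structured relation, a publisher can find citing datasets by querying relations rather than parsing citation strings.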

The dataset DOI given in the example above is from GBIF, which may not be the best DOI to (re)use. Is there a Plazi one we should be using for the dataset? We can easily get into a proliferation of DOIs for datasets. Would it not be better to assign a DOI where the dataset was minted?