ESIPFed / science-on-schema.org

science-on-schema.org - providing guidance for publishing schema.org as JSON-LD for the sciences
Apache License 2.0

Use of "citation" #42

Open mathiasbockwoldt opened 4 years ago

mathiasbockwoldt commented 4 years ago

The "citation" field in the examples is not necessarily used the way schema.org intends. In the examples it holds a string describing how to cite the given dataset, whereas schema.org defines it as a reference to other CreativeWorks. One option is a new field, e.g. "citeAs", to hold that information. The explanation for this field should suggest using an identifier such as a DOI. A string (as used in the examples here) is not very useful, since different journals cite datasets in different ways (is it e.g. "J. Smith" or "Smith, J." or "Joe Smith" or "Smith, Joe"?). I/we (Polar Data Forum III) suggest providing a DOI if available; otherwise referring to an object with author, title, etc. (maybe already given in another part of the metadata); or, as a last resort, giving a citation string.
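
The fallback order suggested above (DOI first, then a structured object, then a plain string) could be sketched roughly as follows. This is only an illustration: "citeAs" is the hypothetical field name floated in this comment, not a schema.org property, and the helper name `cite_as`, the `doi` and `citationText` keys, and the sample values are assumptions made for the sketch.

```python
from typing import Any, Optional


def cite_as(record: dict[str, Any]) -> Optional[Any]:
    """Pick a value for the hypothetical "citeAs" field.

    Order of preference, following the suggestion above:
    1. a DOI, 2. a structured object (creator, name), 3. a plain string.
    """
    doi = record.get("doi")  # assumed key name for this sketch
    if doi:
        # Normalise a bare DOI such as "10.1234/abcd" to a resolvable URL.
        return doi if doi.startswith("http") else f"https://doi.org/{doi}"

    creator, name = record.get("creator"), record.get("name")
    if creator and name:
        # Reuse metadata that is already present elsewhere in the record.
        return {"@type": "Dataset", "creator": creator, "name": name}

    # Last resort: a free-text citation string, or None if there is none.
    return record.get("citationText")
```

A consumer can then format the structured form in whatever style a journal requires, which is the point of preferring it over a fixed string.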

ashepherd commented 4 years ago

Leyla Garcia from bioschemas.org is reaching out to contacts at schema-org about the definition change

ashepherd commented 4 years ago

see: https://github.com/schemaorg/schemaorg/issues/2325

ashepherd commented 4 years ago

https://developers.google.com/search/docs/data-types/dataset used to define this field as "Preferred citation for this dataset", but it has been updated to say, "Identifies academic articles that are recommended by the data provider be cited in addition to the dataset itself. Provide the citation for the dataset itself with other properties, such as name, identifier, creator, and publisher properties."

I think it's safe to begin a pull request to update the guidance document on how to properly use this field.
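
To make the updated Google wording concrete, here is a minimal sketch of a record that follows it: the dataset's own citation is expressed through name, identifier, creator, and publisher, and "citation" only lists an article to cite in addition. All names and DOIs below are placeholders, and this is not the guidance document's official example.

```python
import json

# Hypothetical dataset record; every value below is a placeholder.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    # The dataset's own citation is carried by these properties ...
    "name": "Example sea-ice thickness measurements",
    "identifier": "https://doi.org/10.1234/example-dataset",
    "creator": {"@type": "Person", "name": "Joe Smith"},
    "publisher": {"@type": "Organization", "name": "Example Data Centre"},
    # ... while "citation" points at articles to cite alongside the dataset.
    "citation": [
        {
            "@type": "ScholarlyArticle",
            "name": "Example article describing the dataset",
            "identifier": "https://doi.org/10.1234/example-article",
        }
    ],
}

print(json.dumps(dataset, indent=2))
```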

ashepherd commented 4 years ago

Following https://github.com/schemaorg/schemaorg/issues/1031

smrgeoinfo commented 4 years ago

https://github.com/schemaorg/schemaorg/issues/1031 doesn't seem to resolve the question of what schema:citation is supposed to mean. The so:citation scope note "A citation or reference to another creative work, such as another publication, web page, scholarly article, etc." is not very useful; it essentially says 'a citation is a citation'.
The comment by ljgarcia presents two options:

And this discussion notes a third potential interpretation:

SOSO should pick one and recommend it. I suggest using "so:citation: a reference to a resource made because the CreativeWork makes a claim based on data contained in that resource" (modified from @ljgarcia's second option). Given that the 2016 issue about this in the schema.org issue tracker has gone nowhere, I don't think we should count on any updates to schema.org.

mbjones commented 3 years ago

@smrgeoinfo thanks for the clarifications, I agree they are important. For some more context, in EML, we have fields for all three of those concepts. They are:

I think it would be good to differentiate at least these three roles of citation references in the so:citation clarifications.

ashepherd commented 3 years ago

Difficulty: Easy

positives

negatives

+1 to include in v1.3

ashepherd commented 3 years ago

As a reference, it looks like DataCite takes the more specific properties from its schema (References and Cites) and aggregates them into this so:citation property. Not sure if other rules are applied, so maybe @mfenner could describe their algorithm?
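
For illustration only, an aggregation of that kind might look like the sketch below. It is a guess, not DataCite's actual algorithm, which this thread does not describe. The input follows the shape of DataCite `relatedIdentifiers` entries (`relatedIdentifier`, `relatedIdentifierType`, `relationType`); the function name, the output shape, and the sample DOIs are assumptions.

```python
# Relation types folded into schema.org "citation" in this sketch.
CITATION_RELATIONS = {"References", "Cites"}


def to_schema_citations(related_identifiers: list[dict]) -> list[dict]:
    """Collect References/Cites entries as schema.org CreativeWork stubs."""
    citations = []
    for rel in related_identifiers:
        if rel.get("relationType") not in CITATION_RELATIONS:
            continue  # other relation types are left out of "citation"
        ident = rel["relatedIdentifier"]
        if rel.get("relatedIdentifierType") == "DOI" and not ident.startswith("http"):
            # Turn a bare DOI into a resolvable URL.
            ident = f"https://doi.org/{ident}"
        citations.append({"@type": "CreativeWork", "@id": ident})
    return citations


# Placeholder input: only the first two entries should be kept.
sample = [
    {"relatedIdentifier": "10.1234/a", "relatedIdentifierType": "DOI", "relationType": "References"},
    {"relatedIdentifier": "https://example.org/b", "relatedIdentifierType": "URL", "relationType": "Cites"},
    {"relatedIdentifier": "10.1234/c", "relatedIdentifierType": "DOI", "relationType": "IsSupplementTo"},
]
print(to_schema_citations(sample))
```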

ashepherd commented 3 years ago

@mbjones will reach out to Martin Fenner at DataCite about their algorithm

ashepherd commented 3 years ago

Discussed the possibility of the ESIP schema.org cluster managing a vocabulary of dataset relations (mirroring the DataCite Schema relation types).

ashepherd commented 3 years ago

Re: Garza's mention of how DataCite uses schema:citation

https://github.com/ESIPFed/science-on-schema.org/issues/128#issuecomment-888458367

smrgeoinfo commented 3 years ago

We had discussed possibly using LinkRole to specify the relationship of the object of the citation. Here's an example, using the DataCite relationship terms in the linkRelationship text value. The 'roleName' value is text or a URL, so if there is a URI for the relationship, it could go there.

{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "citation": [
    {
      "@type": "CreativeWork",
      "url": {
        "@type": "LinkRole",
        "url": "https://www.example.com/articlethatUsesDataset",
        "description": "link to publication that bases scientific conclusions on analysis using the dataset",
        "roleName": "https://eml.ecoinformatics.org/whats-new-in-eml-2-2-0.html#usage-citations",
        "linkRelationship": "IsCitedBy"
      }
    },
    {
      "@type": "CreativeWork",
      "url": {
        "@type": "LinkRole",
        "url": "https://www.example.com/articlethatCommentsOndataset",
        "description": "link to a publication that comments on/discusses the dataset",
        "roleName": "https://eml.ecoinformatics.org/whats-new-in-eml-2-2-0.html#referencePublication",
        "linkRelationship": "IsReferencedBy"
      }
    },
    {
      "@type": "CreativeWork",
      "url": {
        "@type": "LinkRole",
        "url": "https://www.example.com/articlethatProvidesSupplementalInformation",
        "description": "link to a publication that provides additional information useful to understand the dataset, e.g. analytical procedures, scientific context.",
        "roleName": "https://...",
        "linkRelationship": "Supplements"
      }
    }
  ]
}

This still doesn't solve how to assert a 'recommended citation' text string to use when citing the dataset; perhaps we could adopt a convention that if schema:citation has a text value (not a CreativeWork), that value is assumed to be the recommended citation string.
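
A consumer applying that convention might separate the two kinds of value roughly as below. This is a sketch under the assumptions stated in this comment (a bare string means "recommended citation", a CreativeWork whose "url" is a LinkRole means a typed link); the function name and the sample record are made up for illustration.

```python
from typing import Any


def split_citations(dataset: dict[str, Any]) -> tuple[list[str], list[tuple[str, str]]]:
    """Separate plain-text citations from LinkRole-typed links.

    Returns (recommended_citation_strings, [(url, linkRelationship), ...]).
    """
    values = dataset.get("citation", [])
    if not isinstance(values, list):
        values = [values]  # JSON-LD allows a single value instead of a list

    texts, links = [], []
    for value in values:
        if isinstance(value, str):
            # Proposed convention: a bare string is the recommended citation.
            texts.append(value)
            continue
        role = value.get("url")
        if isinstance(role, dict) and role.get("@type") == "LinkRole":
            links.append((role.get("url", ""), role.get("linkRelationship", "")))
    return texts, links


# Placeholder record mixing both kinds of value.
example = {
    "@type": "Dataset",
    "citation": [
        "Smith, J. (2021). Example dataset.",
        {
            "@type": "CreativeWork",
            "url": {
                "@type": "LinkRole",
                "url": "https://www.example.com/articlethatUsesDataset",
                "linkRelationship": "IsCitedBy",
            },
        },
    ],
}
print(split_citations(example))
```

The drawback is that the meaning of a value then depends on its type rather than on an explicit property, which is easy for harvesters to miss.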

elishawc commented 2 years ago

Should it be "IsReferencedBy" (with a "d" at the end of "Referenced") rather than "IsReferenceBy"? Source: Scholix (appendix 3.1 - https://zenodo.org/record/1120265)

ptsefton commented 1 year ago

Did this get resolved? I am trying to do the same thing in RO-Crate, that is, provide a textual citation for a dataset.

ptsefton commented 1 year ago

I have suggested so:creditText for textual citations over at the RO-Crate repo: https://github.com/ResearchObject/ro-crate/issues/265

smrgeoinfo commented 1 year ago

That's in their "new" area, and it looks like what we need. Good suggestion!
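
A minimal sketch of how that suggestion might look in practice, with the textual citation held in creditText and kept out of "citation". Whether creditText is the right home for a recommended citation is exactly what is being proposed here, not settled guidance, and all values are placeholders.

```python
import json

# Hypothetical dataset record; every value below is a placeholder.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example dataset",
    "identifier": "https://doi.org/10.1234/example-dataset",
    # Human-readable citation string, kept separate from "citation".
    "creditText": (
        "Smith, J. (2021). Example dataset. Example Data Centre. "
        "https://doi.org/10.1234/example-dataset"
    ),
}

print(json.dumps(dataset, indent=2))
```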