NCEAS / eml

Ecological Metadata Language (EML)
https://eml.ecoinformatics.org/
GNU General Public License v2.0

can <usageCitation> and <referencePublication> have <citation> children? #375

Open atn38 opened 3 years ago

atn38 commented 3 years ago

The documentation on EML 2.2.0 says this about the new <literatureCited> element:

EML 2.2.0 includes a new <literatureCited> element as a CitationListType that represents one or more citations. These citations can be a series of <citation> elements or a <bibtex> element with a list of citations.

My question is, can <usageCitation> and <referencePublication> contain <citation> children as well? My assumption was that they could and should, although their documentation doesn't explicitly say so. I produced an EML document with a <usageCitation> element containing two <citation> children, like this:

    <usageCitation>
      <citation>
        <bibtex>@article{rawlins_changing_2019,&#13;
    title = {Changing characteristics of runoff and freshwater export from watersheds draining northern {Alaska}},&#13;
    volume = {13},&#13;
    issn = {1994-0416},&#13;
    url = {https://www.the-cryosphere.net/13/3337/2019/tc-13-3337-2019.html},&#13;
    doi = {10.5194/tc-13-3337-2019},&#13;
    language = {English},&#13;
    number = {12},&#13;
    urldate = {2019-12-19},&#13;
    journal = {The Cryosphere},&#13;
    author = {Rawlins, Michael A. and Cai, Lei and Stuefer, Svetlana L. and Nicolsky, Dmitry},&#13;
    month = dec,&#13;
    year = {2019},&#13;
    note = {https://doi.org/10.6073/pasta/2c1cd0c1d3f0f257d9dac659314bc777},&#13;
    keywords = {LTER-BLE, LTER-Funded},&#13;
    pages = {3337--3352}&#13;
}</bibtex>
      </citation>
      <citation>
        <bibtex>@article{miller_seasonal_2021,&#13;
    title = {The seasonal phases of an {Arctic} lagoon reveal the discontinuities of {pH} variability and {CO}$_{\textrm{2}}$ flux at the air–sea interface},&#13;
    volume = {18},&#13;
    issn = {1726-4170},&#13;
    url = {https://bg.copernicus.org/articles/18/1203/2021/},&#13;
    doi = {10.5194/bg-18-1203-2021},&#13;
    language = {English},&#13;
    number = {3},&#13;
    urldate = {2021-02-22},&#13;
    journal = {Biogeosciences},&#13;
    author = {Miller, Cale A. and Bonsell, Christina and McTigue, Nathan D. and Kelley, Amanda L.},&#13;
    month = feb,&#13;
    year = {2021},&#13;
    note = {https://doi.org/10.6073/pasta/3475cdbb160a9f844aa5ede627c5f6fe&#13;
https://doi.org/10.6073/pasta/ced2cedd430d430d9149b9d7f1919729&#13;
https://doi.org/10.6073/pasta/e0e71c2d59bf7b08928061f546be6a9a&#13;
https://doi.org/10.6073/pasta/9305328d0f1ed28fbb2d7cf56c686786},&#13;
    keywords = {LTER-BLE, LTER-Funded},&#13;
    pages = {1203--1221}&#13;
}</bibtex>
      </citation>
    </usageCitation>

Excuse the malformed bibtexes! The point is that this document was deemed invalid:

"Element 'citation': This element is not expected. Expected is one of ( alternateIdentifier, shortName, title, bibtex, references )."

twhiteaker commented 3 years ago

According to the schema docs, usageCitation is a CitationType, so I think you should have:

    <usageCitation>
        <bibtex>
@article{rawlins_changing_2019,
    title = {Changing characteristics of runoff and freshwater export from watersheds draining northern {Alaska}},
    volume = {13},
    issn = {1994-0416},
    url = {https://www.the-cryosphere.net/13/3337/2019/tc-13-3337-2019.html},
    doi = {10.5194/tc-13-3337-2019},
    language = {English},
    number = {12},
    urldate = {2019-12-19},
    journal = {The Cryosphere},
    author = {Rawlins, Michael A. and Cai, Lei and Stuefer, Svetlana L. and Nicolsky, Dmitry},
    month = dec,
    year = {2019},
    note = {https://doi.org/10.6073/pasta/2c1cd0c1d3f0f257d9dac659314bc777},
    keywords = {LTER-BLE, LTER-Funded},
    pages = {3337--3352}
}

@article{miller_seasonal_2021,
    title = {The seasonal phases of an {Arctic} lagoon reveal the discontinuities of {pH} variability and {CO}$_{\textrm{2}}$ flux at the air–sea interface},
    volume = {18},
    issn = {1726-4170},
    url = {https://bg.copernicus.org/articles/18/1203/2021/},
    doi = {10.5194/bg-18-1203-2021},
    language = {English},
    number = {3},
    urldate = {2021-02-22},
    journal = {Biogeosciences},
    author = {Miller, Cale A. and Bonsell, Christina and McTigue, Nathan D. and Kelley, Amanda L.},
    month = feb,
    year = {2021},
    note = {https://doi.org/10.6073/pasta/3475cdbb160a9f844aa5ede627c5f6fe
https://doi.org/10.6073/pasta/ced2cedd430d430d9149b9d7f1919729
https://doi.org/10.6073/pasta/e0e71c2d59bf7b08928061f546be6a9a
https://doi.org/10.6073/pasta/9305328d0f1ed28fbb2d7cf56c686786},
    keywords = {LTER-BLE, LTER-Funded},
    pages = {1203--1221}
}
        </bibtex>
    </usageCitation>

I suppose that also means you can only have one article cited within usageCitation, unless you use BibTeX. Ditto for referencePublication, though that element is really intended just to hold a single citation. But, DatasetType can have many usageCitation elements, so you can include many that way if you don't want to use BibTeX.
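A minimal sketch of the "many usageCitation elements" option, using the two articles from this thread. The BibTeX fields are trimmed for brevity, and the fragment assumes these elements sit inside a <dataset> at the position the schema allows; it has not been validated here.

```xml
<!-- Sketch only: one work per usageCitation, repeated as siblings. -->
<usageCitation>
  <bibtex>@article{rawlins_changing_2019,
    title = {Changing characteristics of runoff and freshwater export from watersheds draining northern {Alaska}},
    journal = {The Cryosphere},
    year = {2019},
    doi = {10.5194/tc-13-3337-2019}
}</bibtex>
</usageCitation>
<usageCitation>
  <bibtex>@article{miller_seasonal_2021,
    title = {The seasonal phases of an {Arctic} lagoon reveal the discontinuities of {pH} variability and {CO}$_{\textrm{2}}$ flux at the air–sea interface},
    journal = {Biogeosciences},
    year = {2021},
    doi = {10.5194/bg-18-1203-2021}
}</bibtex>
</usageCitation>
```

Each <usageCitation> here carries exactly one work, so the element's own repeatability does the job of a list.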

atn38 commented 3 years ago

Looking more closely, you're right: <usageCitation> and <referencePublication> are CitationType while <literatureCited> is CitationListType, which would explain the difference. I'd think that <usageCitation> should be a list though.

amoeba commented 3 years ago

> I'd think that <usageCitation> should be a list though.

I can see where you're coming from. Does @twhiteaker 's above suggestion to repeat usageCitation elements work or would you prefer to put multiple citations into a single usageCitation element? The choice to make usageCitation and literatureCited different looks purposeful to me and makes sense due to the different semantics.

atn38 commented 3 years ago

I can definitely make multiple usageCitation elements happen, so that's not an issue. My initial impression was that usageCitation would be a list, and I think others might assume the same, but ultimately there's no need for a schema change IMO.

mbjones commented 3 years ago

Thanks, @atn38. Yeah, looking at it, it probably should have been the list type. But given that <usageCitation> is repeatable, changing it seems like unnecessary churn to me at this point, and should be sacrificed to stability. If folks want to see this changed, please do put in a comment and let's discuss. In the absence of that support, the status quo is probably workable.

mobb commented 3 years ago

@servilla and I talked about the options. Our conclusion: we only encourage the use of a single <bibtex> reference (https://www.overleaf.com/learn/latex/bibliography_management_with_bibtex#Reference_guide) in the EML <bibtex> element, but leave the choice of parent element (referencePublication, usageCitation, or literatureCited) to the user. Since the <bibtex> element is an alternative to <citation> (which holds only one entry), it would be consistent to promote that use of <bibtex> too.

Further, each of the parent elements has a particular purpose that must be considered by the EML author. <bibtex> parent elements that are designed to be lists hold multiple children, so that is the list functionality EML authors should be using.
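To illustrate that list functionality, here is a sketch of a <literatureCited> element holding one BibTeX entry per <bibtex> child. It assumes that CitationListType accepts repeated <bibtex> children, which is a reading of the schema and should be confirmed against the XSD. Entries are trimmed for brevity.

```xml
<!-- Sketch only: assumes CitationListType allows repeated <bibtex> children. -->
<literatureCited>
  <bibtex>@article{rawlins_changing_2019,
    title = {Changing characteristics of runoff and freshwater export from watersheds draining northern {Alaska}},
    journal = {The Cryosphere},
    year = {2019}
}</bibtex>
  <bibtex>@article{miller_seasonal_2021,
    title = {The seasonal phases of an {Arctic} lagoon reveal the discontinuities of {pH} variability and {CO}$_{\textrm{2}}$ flux at the air–sea interface},
    journal = {Biogeosciences},
    year = {2021}
}</bibtex>
</literatureCited>
```

With this layout the XML structure carries the list, and each <bibtex> string stays a single entry.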

twhiteaker commented 3 years ago

> we only encourage the use of a single <bibtex> reference in the EML <bibtex> element

I'm OK with that best practice, especially given

> <bibtex> parent elements that are designed to be lists hold multiple children, so that is the list functionality EML authors should be using

If that's what you want to promote, then I think the summary and description for the <bibtex> child of a CitationListType in the XSD should be updated immediately to reflect this. Currently it suggests that <bibtex> holds a list.

    <xs:element name="bibtex" type="xs:string">
        <xs:annotation>
            <xs:appinfo>
                <doc:tooltip>Bibtex Citation List</doc:tooltip>
                <doc:summary>List of citations in Bibtex format.</doc:summary>
                <doc:description>The bibtex field provides a parseable list of citations formatted according to the Bibtex formatting conventions. Each citation entry is assigned a unique key that must be unique across all bibtex fields in the EML document. The citation key can be used in markdown sections of the text to refer to this citation using the pandoc-style of inline citation keys.  See the markdown element for more details. The record is delimited using curly braces. Most reference software can both import and export citations in Bibtex format, so this is a simpler representation to produce and consume than native EML citation representations.</doc:description>
            </xs:appinfo>
        </xs:annotation>
    </xs:element>

mbjones commented 3 years ago

I think we should embrace that bibtex allows a list. Parsers and consumers of metadata specs should in my mind be able to handle any valid content, which in this case means bibtex containing lists. One of the intentions of adding the bibtex extension was to make it much easier to add long lists of citations from standard reference software packages without having to deal with the complexities of interleaving it in XML <citation> elements.

servilla commented 3 years ago

I understand the convenience factor of dropping a bibtex list into the element, but our concern is two-fold:

  1. It would no longer have parity with the CitationType cardinality (if I am reading the schema correctly) and
  2. Enforcing a single citation for <referencePublication> would no longer be possible by schema analysis.

It doesn't seem too onerous for a bibtex list to be constructed with multiple <bibtex> elements, but that is my opinion only. In general, it seems we are losing the semantic specificity of the CitationType structure in favor of convenience. This is also a concern I have with the <markdown> element.

gremau commented 5 months ago

Quick note on an old issue. It seems the answer to the question in this issue is "No" according to the schema. The documentation is incorrect for usageCitation though, saying that it may have <citation> or <bibtex> children (it can have only <bibtex>). See here: https://github.com/NCEAS/eml/blob/main/docs/eml-modules-resources.md#eml-literature-module---usage-citation

mbjones commented 5 months ago

Hi @gremau -- you're right that usageCitation can't contain citation per se, but rather it consists of a choice of the original elements of a citation or a bibtex field. Here's the model:

(alternateIdentifier* , shortName{0,1} , title+ , creator+ , metadataProvider* , associatedParty* , pubDate{0,1} , language{0,1} , series{0,1} , abstract{0,1} , keywordSet* , additionalInfo* , intellectualRights{0,1} , licensed* , distribution* , coverage{0,1} , annotation* , contact* , (article | book | chapter | editedBook | manuscript | report | thesis | conferenceProceedings | personalCommunication | map | generic | audioVisual | presentation)) | bibtex | (references)

As we discussed above, the usageCitation field itself has cardinality 0..many, so it is repeatable. As usageCitation itself is meant to represent a single cited work, the discussion in the issue seems to conclude that the bibtex should only include a single work, even though that is not expressible in XML Schema.
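For comparison, a sketch of the non-BibTeX branch of that model: the citation elements go directly inside <usageCitation>, with no <citation> wrapper. The children shown for <creator> and <article> are assumptions based on a reading of the EML 2.2.0 literature and party modules, not something stated in this thread, so check them against the schema. Only the first author is listed, for brevity.

```xml
<!-- Sketch only: child names under creator and article are assumed, not confirmed here. -->
<usageCitation>
  <title>Changing characteristics of runoff and freshwater export from watersheds draining northern Alaska</title>
  <creator>
    <individualName>
      <givenName>Michael A.</givenName>
      <surName>Rawlins</surName>
    </individualName>
  </creator>
  <pubDate>2019</pubDate>
  <article>
    <journal>The Cryosphere</journal>
    <volume>13</volume>
    <issue>12</issue>
    <pageRange>3337-3352</pageRange>
  </article>
</usageCitation>
```

The element order follows the sequence in the model above: title, creator, pubDate, then one of the citation-type elements such as article.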