elifesciences / schematron-wiki

This contains the markdown from gitbook for schematron.
MIT License

Create Data references page #71

Closed naushinthomson closed 3 years ago

naushinthomson commented 4 years ago

Definition of done

JGilbert-eLife commented 3 years ago

@Melissa37 @FAtherden-eLife @bcollins14 @naushinthomson

First page for you to review in 2021!

https://app.gitbook.com/@elifesciences/s/productionhowto/article-details/content/references/data-references

Please bear in mind that this is data references in the main text, so it might repeat stuff from the DAS. We will also eventually be moving a lot of the DAS stuff to this page when we move the statement to the main text.

I didn't have any examples to hand for data references with just a web reference as this isn't something we've encouraged. If you have any, please let me know!

Thanks!

Melissa37 commented 3 years ago

In the table there is a mix of capitalization for Accession

Melissa37 commented 3 years ago

In the table it mentions Website (in Accession or DOI entry) but in the line for website it says URL - would be better to use the same term in both instances to avoid any potential confusion.

Melissa37 commented 3 years ago

Non-mandatory fields are required if they exist.

Could this be unbolded? I did a double take, thinking the terms underneath in bold were linked to that statement


Melissa37 commented 3 years ago

It's a bit confusing - is a URL required if there is an accession number? It does not read clearly that a URL is NOT required or desired if there is a DOI but IS if there is an accession - which I think is the case?

Accession: A unique identifier for the dataset. Usually an alphanumeric string e.g. GSE48760, EMD-22286, MSV000086293 etc. Must be accompanied by a URL for the dataset, which may or may not contain the accession number as well.

Yet elsewhere it says "URL: Yes if neither DOI nor accession is present", and the schematron messages indicate you don't need both a URL and an accession number

and then there is this:

err-elem-cit-data-14-1

Error: If the pub-id is of any pub-id-type except doi, it must have an @xlink:href. Reference 'XXXXXX' has a <pub-id element with type 'XXXXXX' but no @xlink-href.

Action: This error indicates a URL has not been provided for a dataset with an accession number. Please locate the URL for the dataset using the database name and accession/identifier, or query the author for the missing information if this is not possible. Please provide the direct URL for this dataset.

I am a bit confused!!
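For readers trying to pin down what err-elem-cit-data-14-1 actually tests, here is a minimal Python sketch of the condition as the message above states it: a `pub-id` of any type other than `doi` needs an `xlink:href`. It is not the Schematron itself. The function name, the sample reference and the `xmlns:xlink` declaration on `<ref>` are assumptions made so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes namespaced attributes as "{namespace-uri}local-name".
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def pub_ids_missing_href(ref_xml):
    """Return the text of every non-DOI pub-id that has no xlink:href."""
    ref = ET.fromstring(ref_xml.strip())
    missing = []
    for pub_id in ref.iter("pub-id"):
        is_doi = pub_id.get("pub-id-type") == "doi"
        if not is_doi and not pub_id.get(XLINK_HREF):
            missing.append((pub_id.text or "").strip())
    return missing


# Hypothetical data reference: an accession number with no URL attached.
SAMPLE = """
<ref id="bib1" xmlns:xlink="http://www.w3.org/1999/xlink">
  <element-citation publication-type="data">
    <source>NCBI Gene Expression Omnibus</source>
    <pub-id pub-id-type="accession">GSE48760</pub-id>
  </element-citation>
</ref>
"""

print(pub_ids_missing_href(SAMPLE))  # ['GSE48760']
```

Under this reading, a DOI never triggers the check, while an accession without a URL always does.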

Melissa37 commented 3 years ago

The Mass Spectrometry Interactive Virtual Environment (MassIVE), however, only clearly shows the title for the article. Author details are limited to the citation associated with the data and the primary contact (effectively corresponding author) for the dataset. Since the authors for the dataset may differ from those on the associated publication and the contact is unlikely to be solely responsible for collecting the data, an author query may be required to seek the full author list for the reference.

What is our success rate in authors providing full author lists for datasets like this? I am curious because if they are re-using datasets do they know the authors and can get those details or are they as in the dark as us?

Melissa37 commented 3 years ago

Is it worth adding a line here about datasets not yet available? Or is this totally irrelevant until we change the process and stop putting data references in the DAS?

In which case we'd need to remember to update this page when we do that :-)

fred-atherden commented 3 years ago

Would it be worth specifying at what stages pre- and final- messages fire, or is that implied/common knowledge now?

Melissa37 commented 3 years ago

Action: This warning indicates that more than one <source> element (database name) is present in a conference reference. The extra elements should be removed; however, please check whether the contents should be moved to the dataset title or the database name fields first. If possible locate the dataset online to check for the correct details.

final-err-elem-cit-data-11-2

Error: Data reference 'XXXXXX' has XXXXXX source elements. It must contain one (and only one).

Action: This error indicates that more than one <source> element (database name) is present in a conference reference. The extra elements should be removed; however, please check whether the contents should be moved to the dataset title or the database name fields first. If possible locate the dataset online to check for the correct details.

I think this is a copy paste issue and conference should be changed to data
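As a rough illustration of the check behind final-err-elem-cit-data-11-2, the Python sketch below counts the `source` children of a data reference and reproduces the quoted error text when the count is not exactly one. It is an assumption-laden stand-in for the real Schematron rule: the function name and the sample reference are hypothetical.

```python
import xml.etree.ElementTree as ET


def source_count_problem(ref_xml):
    """Return an error string if a data reference lacks exactly one <source>."""
    ref = ET.fromstring(ref_xml.strip())
    citation = ref.find("element-citation[@publication-type='data']")
    if citation is None:
        return None  # not a data reference, so this check does not apply
    count = len(citation.findall("source"))  # direct children only
    if count == 1:
        return None
    return (
        f"Data reference '{ref.get('id')}' has {count} source elements. "
        "It must contain one (and only one)."
    )


# Hypothetical reference with the database name tagged twice.
SAMPLE = """
<ref id="bib2">
  <element-citation publication-type="data">
    <data-title>Dataset title</data-title>
    <source>Gene Expression Omnibus</source>
    <source>NCBI</source>
  </element-citation>
</ref>
"""

print(source_count_problem(SAMPLE))
```

A reference with no `source` at all fails the same test, with a count of 0.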

Melissa37 commented 3 years ago

err-elem-cit-data-18

typo: deleteing should be deleting

fred-atherden commented 3 years ago

Should we be getting rid of assigning authority for these now (before we move data references from the DAS)? What do they add? They aren't gathered by crossref (since I presume Graham's tools will only be pulling those with the specific-use attribute out, and in any case, it no longer relies on assigning-authority), and they aren't displayed on our site, PubMed etc. - only in the PDF (which is typically just a duplication of the database name or potentially a source of confusion).

JGilbert-eLife commented 3 years ago

Non-mandatory fields are required if they exist.

Could this be unbolded? I did a double take, thinking the terms underneath in bold were linked to that statement

Sure - I've made this consistently roman on other pages.

naushinthomson commented 3 years ago

Re: Melissa's comment about the accession vs website field requirements, I agree the description of accession seems to say that you need both accession and website whereas the table seems to suggest one or the other is okay.

I'm also confused by the wording of:

Website: A URL is permitted as an alternative to an accession or DOI, if the latter are not available (note that this is a URL tagged separately from an accession number, rather than the URL for an accession number mentioned above).

I know we covered this in today's meeting but is there any way to make the wording clearer?

naushinthomson commented 3 years ago

Is it possible to clarify the difference between Publisher and Authority? That always trips me up in Kriya!

naushinthomson commented 3 years ago

I think that's all from me otherwise! :)

fred-atherden commented 3 years ago

Just checked and in its current state the Schematron (final-err-elem-cit-data-13-1) will fire if there isn't a pub-id, even if there is an ext-link.

Obviously in the future, or now, we want to permit just an ext-link for general database references. Would Kriya parse it out correctly if there was just a Website field? Or would it output a pub-id with an xlink:href but no text and no other or empty attributes? If Kriya parses it fine, then happy to change it immediately for dataset refs in the ref list.

fred-atherden commented 3 years ago

I've just tested and it looks like Kriya parses it correctly in the XML :tada:

<ref id="bib89">
    <element-citation publication-type="data">
        <person-group person-group-type="author">
            <name>
                <surname>Test</surname>
                <given-names>A</given-names>
            </name>
            <name>
                <surname>Test</surname>
                <given-names>B</given-names>
            </name>
        </person-group>
        <year iso-8601-date="2020">2020</year>
        <data-title>Dataset title</data-title>
        <source>Gene Expression Omnibus</source>
        <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/"
            >https://www.ncbi.nlm.nih.gov/geo/</ext-link>
    </element-citation>
</ref>

Now just checking the PDF and Continuum...

fred-atherden commented 3 years ago

As expected, PDF display is fine.

fred-atherden commented 3 years ago

No problems on Continuum (not sure why the (authors) text is present, but that's a question for another day):

https://continuumtestpreview--journal.elifesciences.org/articles/54172v1#bib44

Will update the Schematron now to support either pub-id or ext-link.
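The relaxed rule described here, where a data reference passes if it has either a `pub-id` or an `ext-link`, could be sketched in Python as follows. This only illustrates the intended logic, not the updated Schematron test. The function name is hypothetical, and the `xmlns:xlink` declaration is added to the test reference from this thread so that it parses as a standalone fragment.

```python
import xml.etree.ElementTree as ET


def has_identifier_or_link(ref_xml):
    """True if a data reference has at least one pub-id or ext-link child."""
    ref = ET.fromstring(ref_xml.strip())
    citation = ref.find("element-citation[@publication-type='data']")
    if citation is None:
        return True  # not a data reference, so nothing to check here
    return (
        citation.find("pub-id") is not None
        or citation.find("ext-link") is not None
    )


# The test reference from this thread, trimmed to the relevant elements.
EXT_LINK_ONLY = """
<ref id="bib89" xmlns:xlink="http://www.w3.org/1999/xlink">
  <element-citation publication-type="data">
    <data-title>Dataset title</data-title>
    <source>Gene Expression Omnibus</source>
    <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/"
      >https://www.ncbi.nlm.nih.gov/geo/</ext-link>
  </element-citation>
</ref>
"""

print(has_identifier_or_link(EXT_LINK_ONLY))  # True
```

Under the earlier behaviour, where only a `pub-id` satisfied the test, this same reference would have been flagged.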

bcollins14 commented 3 years ago

Nothing more from me. I have just echoed (placed eyes) on two of Naushin's messages above. Thanks!

JGilbert-eLife commented 3 years ago

Is it possible to clarify the difference between Publisher and Authority? That always trips me up in Kriya!

We're dropping assigning authority entirely, so this shouldn't matter. Publisher is the database name - I've corrected the list and table.

JGilbert-eLife commented 3 years ago

OK, think I've nailed down the issue with the table.

Thanks @FAtherden-eLife - should I await revised Schematron messages?

fred-atherden commented 3 years ago

Sorry should have said, changes are already made to the following tests:

New test added:

I've just updated the spreadsheets.

JGilbert-eLife commented 3 years ago

@Melissa37 @naushinthomson @bcollins14 @FAtherden-eLife I've revised this page - are you happy with the edited version particularly re Website inclusion?

Thanks!

Melissa37 commented 3 years ago

Sounds good to me, thanks!

fred-atherden commented 3 years ago

:+1: from me - thanks

bcollins14 commented 3 years ago

Looking good, ta!

JGilbert-eLife commented 3 years ago

@FAtherden-eLife Exeter have approved this page, so the links can be added to the schematron if they're not already in there. Thanks!

fred-atherden commented 3 years ago

Thanks @JGilbert-eLife, just noting that I've removed pre-err-elem-cit-data-17-1 and final-err-elem-cit-data-17-1 from the Schematron and GitBook page, since these are duplicates of err-elem-cit-data-13-1 (after some slight tweaking).