Open ThiloteE opened 2 years ago
JabRef 5.4--2021-12-20--ab44182 Windows 10 10.0 amd64 Java 16.0.2 JavaFX 17.0.1+1
When fetching the entry via JabRef's Import by DOI dialogue, the article number is fetched, but it does not end up in the number field. Instead, it replaces the page range in the pages field.
@Article{Jerrentrup_2018,
author = {Andreas Jerrentrup and Tobias Mueller and Ulrich Glowalla and Meike Herder and Nadine Henrichs and Andreas Neubauer and Juergen R. Schaefer},
date = {2018-03},
journaltitle = {{PLOS} {ONE}},
title = {Teaching medicine with the help of {\textquotedblleft}Dr. House{\textquotedblright}},
doi = {10.1371/journal.pone.0193972},
editor = {Thanh G Phan},
number = {3},
pages = {e0193972},
volume = {13},
publisher = {Public Library of Science ({PLoS})},
}
@ThiloteE The DOI importer gets BibTeX back from the DOI. Some publishers provide weird BibTeX data. This is known. We are not living in a perfect world where we have accurate data.
curl --location --request GET 'https://dx.doi.org/10.1371/journal.pone.0193972' \
--header 'Accept: application/x-bibtex'
powershell:
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Accept", "application/x-bibtex")
$response = Invoke-RestMethod 'https://dx.doi.org/10.1371/journal.pone.0193972' -Method 'GET' -Headers $headers
$response | ConvertTo-Json
results in
@article{Jerrentrup_2018,
doi = {10.1371/journal.pone.0193972},
url = {https://doi.org/10.1371%2Fjournal.pone.0193972},
year = 2018,
month = {mar},
publisher = {Public Library of Science ({PLoS})},
volume = {13},
number = {3},
pages = {e0193972},
author = {Andreas Jerrentrup and Tobias Mueller and Ulrich Glowalla and Meike Herder and Nadine Henrichs and Andreas Neubauer and Juergen R. Schaefer},
editor = {Thanh G Phan},
title = {Teaching medicine with the help of {\textquotedblleft}Dr. House{\textquotedblright}},
journal = {{PLOS} {ONE}}
}
moewew clarified: "Moving the issue number to issue and article number to number would not be my preference, because the issue number is traditionally number and the article number is eid in biblatex"
The long answer: https://github.com/plk/biblatex/issues/726#issuecomment-1010264258
Therefore:
pages field and move those to eid, if eid is empty.
SP denotes "Start Page" according to https://en.wikipedia.org/w/index.php?title=RIS_(file_format)&oldid=1017778965
And of course add options to cleanup actions to trigger the move back and forth manually.
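The move from pages to eid suggested here could look roughly like the following Python sketch. It is not JabRef code: the function name, the plain-dict entry representation and the "looks like a page range" heuristic are all assumptions made for illustration.

```python
import re

# Assumption: a genuine pages value is a number or a numeric range such as
# 12-34, 12--34 or 12–34. Anything else is treated as an article number.
PAGE_RANGE = re.compile(r"^\d+(\s*[-\u2013]{1,2}\s*\d+)?$")


def move_article_number_to_eid(entry: dict) -> dict:
    """Return a copy of entry with a non-page-like pages value moved to eid.

    Nothing is moved if eid is already filled or pages looks like real pages.
    """
    result = dict(entry)
    pages = result.get("pages", "").strip()
    eid = result.get("eid", "").strip()
    if pages and not eid and not PAGE_RANGE.match(pages):
        result["eid"] = pages
        del result["pages"]
    return result


# Field values taken from the entry in the original post.
entry = {"number": "3", "pages": "e0193972", "volume": "13"}
cleaned = move_article_number_to_eid(entry)
print(cleaned)
```

The heuristic is deliberately conservative: an entry with an ordinary page range, or one that already has an eid, comes back unchanged.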
Some fields exist that are not present in Bibtex, but that could be fetched for Biblatex
For me, this is a huge problem with bibtex data. I usually conduct searches at the source, and then export results to capture the metadata I need. Usually, the choice is between RIS (or a variant such as Natbib), Bibtex, CSV and sometimes a plain text report (with or without field names).
Most of the formats are impoverished compared to the source data, and bibtex is particularly anemic, so I usually end up with RIS or Natbib. I start with the richest format and transform it using regex to create importable data. For example, the RN, MH, and OT fields in PubMed records all import as keywords in Jabref (or in other reference managers), so I modify all the values in advance by adding designators to differentiate the merged keywords from each other.
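A minimal sketch of that kind of regex preprocessing, in Python. The designator format (the tag followed by a colon) and the sample record lines are assumptions for illustration, not the poster's actual scheme. The pattern assumes tagged lines of the form "MH  - value".

```python
import re

# A tagged line: one of the three tags, optional padding, a hyphen, the value.
TAGGED_LINE = re.compile(r"^(RN|MH|OT)\s*-\s*(.+)$")


def add_designators(record_text: str) -> str:
    """Prefix RN, MH and OT values with their tag so they stay distinguishable
    after a reference manager merges them all into one keywords field."""
    out = []
    for line in record_text.splitlines():
        m = TAGGED_LINE.match(line)
        if m:
            tag, value = m.groups()
            line = f"{tag}  - {tag}:{value}"
        out.append(line)
    return "\n".join(out)


# Hand-written sample lines, not real PubMed output.
sample = "\n".join([
    "TI  - Example title",
    "MH  - Humans",
    "OT  - medical education",
])
print(add_designators(sample))
```

Lines with other tags pass through untouched, so the record can still be imported as before.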
Somehow fetch the (article-) number, move it to the number field and move the issue-number from the number field into the issue field.
Data in the 'wrong' field is inconvenient, for sure, but not as bad as missing data. Being able to acquire the data easily (even retaining nonstandard names from the source) would be a big improvement.
Also, in my experience with this specific example, providers use these nonstandard 'page numbers' for electronic articles that have (or will have) another value as their unique identifier. Moving the ePage to an identifier field can create a confusing mess when other records from the same source contain a different data type (the 'real' identifier ) in the target field. Plus, this is only one of many field-mismatch scenarios. The problem also applies to reversal of full and abbreviated journal names, original versus translated titles, and whether "supplement" resides with volume/issue/number or stands alone.
Actually, after an import from a fetcher, a conversion to biblatex/bibtex is automatically performed. In #8361 this was also implemented for ID fetching (e.g. DOI).
The only thing which is probably missing is when you import/open from a file. This refs #8298
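To illustrate what such a conversion involves, compare the two entries earlier in the thread: journal becomes journaltitle, and year plus month become date. The Python sketch below mimics only those two steps. It is a rough illustration under assumed names, not JabRef's actual conversion code.

```python
MONTHS = {
    "jan": "01", "feb": "02", "mar": "03", "apr": "04",
    "may": "05", "jun": "06", "jul": "07", "aug": "08",
    "sep": "09", "oct": "10", "nov": "11", "dec": "12",
}


def to_biblatex(fields: dict) -> dict:
    """Return a copy of fields with journal/year/month renamed biblatex-style."""
    result = dict(fields)
    if "journal" in result and "journaltitle" not in result:
        result["journaltitle"] = result.pop("journal")
    if "year" in result and "date" not in result:
        year = str(result.pop("year"))
        month = MONTHS.get(str(result.get("month", "")).lower())
        if month:
            # Only drop month when it was understood and folded into date.
            result.pop("month")
            result["date"] = f"{year}-{month}"
        else:
            result["date"] = year
    return result


# Field values taken from the BibTeX returned for the DOI in this thread.
fetched = {"journal": "{PLOS} {ONE}", "year": "2018", "month": "mar", "volume": "13"}
print(to_biblatex(fetched))
```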
Hey, there is only a little JabRef can do and somebody would need to take an interest and start doing it.
Options would be:
Of course, the best would be option 3.
What users can do meanwhile:
Moving the ePage to an identifier field can create a confusing mess when other records from the same source contain a different data type (the 'real' identifier ) in the target field.
Can you provide an example? I fail to understand. This sentence is too complicated for me xD
Hey, there is only a little JabRef can do and somebody would need to take an interest and start doing it.
This is definitely a systemic problem, not a JabRef issue per se.
- JabRef fetches as much data as possible --> e.g. If we ASSUME RIS provides more data, JabRef should prioritize fetching from RIS.
Sounds like a lot of work for little gain. Compared to BibTeX, RIS does have the advantage of more data fields, but RIS records have limitations of their own. For instance, RIS records often use author initials, whereas BibTeX records from the same source often include full author names. BibTeX and BibLaTeX are also far more consistent than the RIS pseudostandard.
- JabRef is Bib(La)TeX native --> JabRef continues to fetch BibTeX data
Easy conversion as supported already is a very reasonable status quo.
- Providers of Bibliographic data provide more data --> e.g. switching from BibTeX to BibLaTeX standard. … Of course, the best would be option 3.
If only.
Moving the ePage to an identifier field can create a confusing mess when other records from the same source contain a different data type (the 'real' identifier ) in the target field.
Can you provide an example? I fail to understand. This sentence is too complicated for me xD
Not too complicated; just nonsensical. I failed to notice that the example in the original post was about identifiable data that belonged in another field. This is obviously a great reason to move the data.
Another idea:
Fetch both RIS and BibTeX, then convert RIS to BibTeX. Let the duplicate detection compare the two and in case they differ, let the user decide which fields and field content to keep.
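The comparison step of this idea can be sketched in a few lines of Python. This assumes both sources have already been converted into plain field dictionaries. The function name and the field names of the converted RIS entry are hypothetical, and JabRef's real duplicate detection and merge dialog work differently.

```python
def diff_entries(from_bibtex: dict, from_ris: dict) -> dict:
    """Map each field whose values differ to a (bibtex_value, ris_value) pair.

    A value of None means the field is missing on that side.
    """
    conflicts = {}
    for field in sorted(set(from_bibtex) | set(from_ris)):
        a = from_bibtex.get(field)
        b = from_ris.get(field)
        if a != b:
            conflicts[field] = (a, b)
    return conflicts


# Values from this thread; the RIS-side field names are an assumed conversion.
bibtex_entry = {"volume": "13", "number": "3", "pages": "e0193972"}
ris_entry = {"volume": "13", "number": "3", "eid": "e0193972"}

for field, (bib_value, ris_value) in diff_entries(bibtex_entry, ris_entry).items():
    print(f"{field}: BibTeX={bib_value!r} RIS={ris_value!r}")
```

Only the differing fields are reported, which is what a user would need to see in order to decide what to keep.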
Emerged in https://github.com/JabRef/jabref/issues/8372
via:
The Problem
Bibliographic data is usually provided and formatted by major providers (e.g. Reed-Elsevier, Taylor & Francis, Wiley-Blackwell, Springer, Sage, Crossref) in a BibTeX-conformant format. When users of JabRef try to fetch bibliographic metadata from the web, some fields exist in RIS* that are not present in BibTeX, but that could be fetched for Biblatex-conformant datasets.
* substitute RIS with your standard of choice
How to reproduce
https://journals.plos.org/plosone/article/citation?id=10.1371/journal.pone.0193972
RIS data provides the article-number (e0193972) and the issue (3):
Whereas Bibtex only provides number (3):
Edit: Here is the mapping to avoid confusion about what relates to what:
* Some providers of bibliographic metadata put the article-number (= Biblatex eid) into the Bibtex pages field or the RIS SP field, because article-numbers do not exist in these standards. This is probably because prior to the digital age, there was no need to come up with article-numbers. Page and issue-number were enough to identify an article. Nowadays webpages may not have proper page-numbers, but may still contain multiple articles.
It is important to note that this was just an example.
Desired solution
When JabRef users fetch bibliographic metadata from the web, somehow fetch as many fields for the entry as possible. Take other standards apart from BibTeX/Biblatex into account.
Example A) Fetch Bibtex/Biblatex data. Fetch the RIS IS field and move the contained data to the Bibtex/Biblatex number field, if the Bibtex/Biblatex number field is empty.
Example B) We assume RIS provides more data than BibTex/Biblatex --> Always fetch RIS data and convert to Bibtex/Biblatex
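Example A could be sketched in Python as below. The RIS text is a hand-written sample using the values mentioned in this issue, not the provider's actual output. The helper names are hypothetical, and the parser assumes the common "TAG  - value" line layout and keeps only the first value per tag.

```python
import re

# Two-character tag, two spaces, hyphen, space, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - (.*)$")


def parse_ris(text: str) -> dict:
    """Collect tag -> value for one RIS record (first value per tag wins)."""
    fields = {}
    for line in text.splitlines():
        m = RIS_LINE.match(line)
        if m:
            fields.setdefault(m.group(1), m.group(2).strip())
    return fields


def fill_number_from_ris(entry: dict, ris_text: str) -> dict:
    """Copy the RIS IS value into number, but only if number is empty."""
    result = dict(entry)
    issue = parse_ris(ris_text).get("IS", "")
    if issue and not result.get("number", "").strip():
        result["number"] = issue
    return result


ris_sample = "TY  - JOUR\nVL  - 13\nIS  - 3\nSP  - e0193972\nER  - "
entry = {"volume": "13", "pages": "e0193972"}
print(fill_number_from_ris(entry, ris_sample))
```

An entry that already has a number is returned unchanged, which matches the "if the number field is empty" condition in Example A.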
Additional context
library > library properties > library mode
Biblatex offers more fine-grained fields, as well as fields that do not exist in Bibtex.
Conformity with Biblatex