JabRef / jabref

Graphical Java application for managing BibTeX and biblatex (.bib) databases
https://devdocs.jabref.org
MIT License

ISI returns double bracketed bibtex fields #3974

Closed. nilsbecker closed 3 years ago

nilsbecker commented 6 years ago

JabRef version:

JabRef 4.1 Mac OS X 10.13.4 x86_64 Java 1.8.0_151

Steps to reproduce:

  1. select shortyear in the preferences for key autogeneration
  2. autogenerate a key for a paper from the 2000s
  3. key ends with a single-digit year

this is undesired and different from the behavior in earlier versions of jabref. double-digit years make keys easier to process.

stefan-kolb commented 6 years ago

I used

@Article{Kolb00,
  author = {Stefan Kolb},
  year   = {2000},
}

and got Kolb00

That's what I expected. I tested the described behavior with the latest master version, but cannot reproduce it. Can you post a minimal reproducible example for us?

nilsbecker commented 6 years ago

hmm. i noticed it with several autogenerated keys yesterday from an import from ISI web of science. it may have to do with the weird bracketing? here is an example

@Article{lloyd3,
  author      = {Lloyd, Alison C.},
  title       = {{The Regulation of Cell Size}},
  journal     = {{CELL}},
  year        = {{2013}},
  volume      = {{154}},
  number      = {{6}},
  pages       = {{1194-1205}},
  month       = {{SEP 12}},
  issn        = {{0092-8674}},
  abstract    = {{An adult animal consists of cells of vastly different size and activity,
   but the regulation of cell size remains poorly understood. Recent
   studies uncovering some of the signaling pathways important for
   size/growth control, together with the identification of diseases
   resulting from aberrations in these pathways, have renewed interest in
   this field. This Review will discuss our current understanding of how a
   cell sets its size, how it can adapt its size to a changing environment,
   and how these processes are relevant to human disease.}},
  doi         = {{10.1016/j.cell.2013.08.053}},
  groups      = {cell_cycle_paper},
  owner       = {nbecker},
  times-cited = {{78}},
  timestamp   = {2018.04.24},
  unique-id   = {{ISI:000324239300010}},
}

Siedlerchr commented 6 years ago

The extra curly braces normally indicate that the field is to be treated as is, e.g. for corporate authors or titles. Either our importer is broken or the BibTeX data from the fetcher is wrong.

stefan-kolb commented 6 years ago

So where did you get this entry from? Can you give us the initial entry and tell us how you imported it?

nilsbecker commented 6 years ago

ok. go to isiknowledge.com (this may require institutional subscription). search for "title" and the exact title "the regulation of cell size". in the result page select the appropriate entry. on the entry page select 'save to other file formats' then 'bibtex' and you get a file for download named 'savedrecs.bib'. open that file with jabref, and you get the entry posted above.

stefan-kolb commented 6 years ago

Ok, thank you. I just did this and the result is the following entry.

@article{ ISI:000324239300010,
Author = {Lloyd, Alison C.},
Title = {{The Regulation of Cell Size}},
Journal = {{CELL}},
Year = {{2013}},
Volume = {{154}},
Number = {{6}},
Pages = {{1194-1205}},
Month = {{SEP 12}},
Abstract = {{An adult animal consists of cells of vastly different size and activity,
   but the regulation of cell size remains poorly understood. Recent
   studies uncovering some of the signaling pathways important for
   size/growth control, together with the identification of diseases
   resulting from aberrations in these pathways, have renewed interest in
   this field. This Review will discuss our current understanding of how a
   cell sets its size, how it can adapt its size to a changing environment,
   and how these processes are relevant to human disease.}},
Publisher = {{CELL PRESS}},
Address = {{600 TECHNOLOGY SQUARE, 5TH FLOOR, CAMBRIDGE, MA 02139 USA}},
Type = {{Review}},
Language = {{English}},
Affiliation = {{Lloyd, AC (Reprint Author), UCL, MRC Lab Mol Cell Biol, Gower St, London WC1E 6BT, England.
   Lloyd, Alison C., UCL, MRC Lab Mol Cell Biol, London WC1E 6BT, England.
   Lloyd, Alison C., UCL, UCL Canc Inst, London WC1E 6BT, England.}},
DOI = {{10.1016/j.cell.2013.08.053}},
ISSN = {{0092-8674}},
Keywords-Plus = {{SKELETAL-MUSCLE HYPERTROPHY; SACCHAROMYCES-CEREVISIAE;
   PROTEIN-SYNTHESIS; ORGAN SIZE; TRANSCRIPTION FACTORS; SIGNALING PATHWAY;
   UBIQUITIN LIGASES; MAMMALIAN-CELLS; GROWTH-CONTROL; CYCLE}},
Research-Areas = {{Biochemistry \& Molecular Biology; Cell Biology}},
Web-of-Science-Categories  = {{Biochemistry \& Molecular Biology; Cell Biology}},
Author-Email = {{alison.lloyd@ucl.ac.uk}},
Funding-Acknowledgement = {{Cancer Research, UK}},
Funding-Text = {{I apologize for citing many reviews rather than primary works due to
   space constraints. I would like to thank Sinead Roberts, Martin Raff,
   Nic Tapon, Ewa Paluch, Rob de Bruin, and Jody Rosenblatt for useful
   comments about the manuscript. My research is funded by a program grant
   from Cancer Research, UK.}},
Number-of-Cited-References = {{81}},
Times-Cited = {{78}},
Usage-Count-Last-180-days = {{3}},
Usage-Count-Since-2013 = {{79}},
Journal-ISO = {{Cell}},
Doc-Delivery-Number = {{215VQ}},
Unique-ID = {{ISI:000324239300010}},
OA = {{gold}},
DA = {{2018-04-25}},
}

So this is actually an error of the Web of Science exporter. Not sure if we can contact them and do anything about that. Most of the fields must not be double bracketed.

nilsbecker commented 6 years ago

the author name seems to be the only field that is not bracketed. weird.

nilsbecker commented 6 years ago

I noticed that the cleanup dialog contains an option to remove outer braces. Doing this on import would repair the entry. However, since keys are generated based on the publication year, it would be necessary to deal with this before autogenerating keys(!)

A better solution would be if isi could fix their export.
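To illustrate why the order matters, here is a minimal Python sketch. It is not JabRef's code: `strip_outer_braces` and `naive_short_year_key` are hypothetical names, and the key pattern (surname plus the last two characters of the year, with braces dropped) is only an assumption about a mechanism that would produce a key like `lloyd3` from the entry above. The field values are assumed to be what a parser sees after removing the delimiting braces, so `year = {{2013}}` arrives as `{2013}`. Escaped braces such as `\{` are not handled.

```python
def strip_outer_braces(value: str) -> str:
    """Remove one pair of braces if, and only if, it wraps the entire value."""
    value = value.strip()
    if not (value.startswith("{") and value.endswith("}")):
        return value
    depth = 0
    for i, ch in enumerate(value):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            # The opening brace was closed before the end of the value,
            # so it does not enclose everything (e.g. "{A} and {B}").
            if depth == 0 and i < len(value) - 1:
                return value
    # Only strip when the braces are balanced.
    return value[1:-1] if depth == 0 else value


def naive_short_year_key(author: str, year: str) -> str:
    """Hypothetical key pattern: lower-cased surname + last two characters of the year."""
    surname = author.split(",")[0].strip().lower()
    # Taking the last two characters first and dropping braces afterwards
    # loses a digit when the year is still wrapped in braces.
    short_year = year[-2:].replace("{", "").replace("}", "")
    return surname + short_year


author = "Lloyd, Alison C."
raw_year = "{2013}"  # parsed value of: year = {{2013}}

key_without_cleanup = naive_short_year_key(author, raw_year)
key_with_cleanup = naive_short_year_key(author, strip_outer_braces(raw_year))
print(key_without_cleanup, key_with_cleanup)
```

Under these assumptions the uncleaned year yields a single-digit suffix, while stripping the outer braces before generating the key yields the expected two digits.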

stefan-kolb commented 6 years ago

@nilsbecker You are right that the cleanup option solves the problem. But as you mentioned, this needs to be fixed on the ISI side, as their entries seem to be simply wrong.

nilsbecker commented 6 years ago

i clicked my way to the FAQ section of the subsidiary that runs the ISI service (Clarivate), where they offer an explanation:

https://support.clarivate.com/WebOfScience/s/article/Web-of-Science-Double-braces-around-entries-in-the-BibTeX-export-format-in-Web-of-Science-Core-Collection-and-Inspec?language=en_US

in short, they say that they cannot do proper bibtex capitalization in titles and abstracts since the information needed to do that is not available to them. therefore they opt for forcing of all capitalization as present in their database.

it might be possible to argue about this. in any case, it's not an excuse for double-bracketing all the fields that are not title or abstract. note that the author field is exported without double brackets presently.

nilsbecker commented 6 years ago

another data point: using zotero one can obtain the bibliographic information from the WOS website. presumably this works without any bibtex export step. after that, i can export the entry from zotero to jabref. the entry above then ends up as the following:

@article{lloyd_regulation_2013,
    title = {The {Regulation} of {Cell} {Size}},
    volume = {154},
    issn = {0092-8674},
    doi = {10.1016/j.cell.2013.08.053},
    abstract = {An adult animal consists of cells of vastly different size and activity, but the regulation of cell size remains poorly understood. Recent studies uncovering some of the signaling pathways important for size/growth control, together with the identification of diseases resulting from aberrations in these pathways, have renewed interest in this field. This Review will discuss our current understanding of how a cell sets its size, how it can adapt its size to a changing environment, and how these processes are relevant to human disease.},
    language = {English},
    number = {6},
    journal = {Cell},
    author = {Lloyd, Alison C.},
    month = sep,
    year = {2013},
    note = {WOS:000324239300010},
    keywords = {cycle, growth-control, mammalian-cells, organ size, protein-synthesis, saccharomyces-cerevisiae, signaling pathway, skeletal-muscle hypertrophy, transcription factors, ubiquitin ligases},
    pages = {1194--1205}
}

Note that there are specific capitalization choices in the title. I can't say if these come from some heuristic implemented in the bibtex export of zotero, or if they are somehow present in the WOS data already.
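For what it's worth, one very simple heuristic would reproduce exactly this title: wrap every word after the first that contains an uppercase letter in braces. The Python sketch below only shows that such a rule is sufficient to explain the output. Whether Zotero's exporter actually works this way is an assumption, and `protect_capitals` is a hypothetical name.

```python
def protect_capitals(title: str) -> str:
    """Brace every word after the first that contains an uppercase letter."""
    protected = []
    for index, word in enumerate(title.split(" ")):
        if index > 0 and any(ch.isupper() for ch in word):
            # Braces tell BibTeX styles to keep this word's capitalization.
            protected.append("{" + word + "}")
        else:
            protected.append(word)
    return " ".join(protected)


print(protect_capitals("The Regulation of Cell Size"))
```

The first word is left alone because BibTeX styles keep the leading capital of a title anyway. Lowercase words such as "of" contain nothing that needs protecting.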

koppor commented 6 years ago

Refs https://twitter.com/JabRef_org/status/1012695298240442368

Siedlerchr commented 5 years ago

As I do not have an account and can't access it atm, is this issue resolved?

nilsbecker commented 5 years ago

Hm. I repeated the search for the article mentioned above, from ISI, and they have not changed their capitalization on export. So the "ISI bug" remains.

My original report was a result of this ISI bug -- autogenerating keys gives undesired results when the imported bibtex file has incorrect capitalization enforced by extra brackets.

I think as a workaround / data-sanitizing step, it would make sense for jabref to auto-remove any enclosing brackets on bibtex import in fields like: year, month, volume, pages, issn, doi, number. I.e. fields where one can be reasonably sure that a global enclosing curly brace is wrong.
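A rough Python sketch of what I mean, not a proposal for the actual JabRef implementation. `SAFE_FIELDS`, `unwrap` and `sanitize_entry` are made-up names, the entry is modelled as a plain dict of already-parsed field values (so `year = {{2013}}` shows up as `{2013}`), and escaped braces are ignored.

```python
# Fields where an enclosing pair of braces is almost certainly unintended.
SAFE_FIELDS = {"year", "month", "volume", "pages", "issn", "doi", "number"}


def unwrap(value: str) -> str:
    """Strip enclosing brace pairs for as long as they wrap the whole value."""
    current = value.strip()
    while len(current) >= 2 and current[0] == "{" and current[-1] == "}":
        inner = current[1:-1]
        depth = 0
        for ch in inner:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
            if depth < 0:
                # The first brace closes early, e.g. "{A} and {B}": leave as is.
                return current
        if depth != 0:
            return current  # unbalanced braces: do not touch
        current = inner.strip()
    return current


def sanitize_entry(fields: dict) -> dict:
    """Return a copy of the entry with braces removed only in SAFE_FIELDS."""
    return {
        name: unwrap(value) if name.lower() in SAFE_FIELDS else value
        for name, value in fields.items()
    }


entry = {
    "author": "Lloyd, Alison C.",
    "title": "{The Regulation of Cell Size}",
    "journal": "{CELL}",
    "year": "{2013}",
    "pages": "{1194-1205}",
    "doi": "{10.1016/j.cell.2013.08.053}",
}
cleaned = sanitize_entry(entry)
print(cleaned["year"], cleaned["pages"], cleaned["title"])
```

Title, journal and abstract are deliberately left untouched, since braces there can be a legitimate way to protect capitalization.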

The real solution is of course that clarivate fix their export -- nothing seems to have happened on that front except for a friendly but noncommittal retweet (see tweet above).

github-actions[bot] commented 3 years ago

This issue has been inactive for half a year. Since JabRef is constantly evolving this issue may not be relevant any longer and it will be closed in two weeks if no further activity occurs.

As part of an effort to ensure that the JabRef team is focusing on important and valid issues, we would like to ask if you could update the issue if it still persists. This could be in the following form:

Thank you for your contribution!