JabRef / jabref

Graphical Java application for managing BibTeX and biblatex (.bib) databases
https://devdocs.jabref.org
MIT License

v5.2 High CPU/RAM when saving a large library and field formatters (normalize date) are enabled #7265

Open alfureu opened 3 years ago

alfureu commented 3 years ago

JabRef 5.2--2020-12-24--6a2a512 Windows 10 10.0 amd64 Java 14.0.2

This issue is present on the v5.2 stable.

Steps to reproduce the behavior:

  1. Create a larger file with 3000+ entries
  2. Try to save it...

I cannot help but say it is really disappointing to see JabRef v5.2 still hogging resources when saving a large(r) library. I have 3000 entries, and whenever I hit Ctrl+S the computer fans start to ramp up and the status of JabRef in Task Manager goes to "Not responding". This goes on for minutes (approx. 5) until things settle (who knows what the problem is with saving a text file)!

This is really upsetting in a stable version, as it makes serious work almost impossible. I am sorry, but default CPU usage of around 20% on an 8-Core i7 7700HQ with 32 GB RAM and a 1 TB NVMe SSD is simply unacceptable.

alfureu commented 3 years ago

Update: in case anybody wants to hunt some bugs, I disabled "Enable field formatters" in the Library properties and things got a bit better. I believe we all acknowledge that this is sub-optimal.

Ali96kz commented 3 years ago

@DOFfactory could you provide your file?

alfureu commented 3 years ago

I do not have it with me anymore. However, it happens with any file. I had multiple field formatters enabled just to make every entry properly unified; I disabled them, and since then things have run smoothly.

alfureu commented 3 years ago

image

Ali96kz commented 3 years ago

I am working on it; I found a problem.

Ali96kz commented 3 years ago

The problem seems to be in org.jabref.model.entry.Date#parse(java.lang.String). It may throw a lot of exceptions, but I can't reproduce it with my files. If anyone has the same problem, please attach your .bib file.
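
To illustrate why a parser that throws many exceptions can become slow on a large library, here is a minimal sketch. It is not JabRef's actual Date#parse code: the class FallbackDateParser, its pattern list, and the exception counter are all hypothetical. It tries several java.time formatters in turn and counts how many DateTimeParseExceptions are thrown before one pattern matches.

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.TemporalAccessor;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; not taken from the JabRef code base.
final class FallbackDateParser {

    // Assumed pattern list, tried from most to least specific.
    static final List<DateTimeFormatter> FORMATTERS = List.of(
            DateTimeFormatter.ofPattern("uuuu-MM-dd"),
            DateTimeFormatter.ofPattern("uuuu-MM"),
            DateTimeFormatter.ofPattern("uuuu"));

    // Running total of exceptions raised by failed parse attempts.
    static long exceptionsThrown = 0;

    // Returns the first successful parse, or empty if no pattern matches.
    static Optional<TemporalAccessor> parse(String text) {
        for (DateTimeFormatter formatter : FORMATTERS) {
            try {
                return Optional.of(formatter.parse(text));
            } catch (DateTimeParseException e) {
                // Each mismatch costs one exception, including its stack trace.
                exceptionsThrown++;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        for (String sample : List.of("2010-05-17", "2010-05", "2010", "2010-May")) {
            long before = exceptionsThrown;
            boolean parsed = parse(sample).isPresent();
            System.out.println(sample + " parsed=" + parsed
                    + " exceptions=" + (exceptionsThrown - before));
        }
    }
}
```

In a scheme like this, every value that does not match the first pattern costs at least one exception, so a library with thousands of entries and mixed date formats could plausibly raise many thousands of exceptions per save.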

alfureu commented 3 years ago

I think you might be right, as I usually scrape large amounts of literature from multiple databases (Scopus, SpringerLink, WOS, etc.) with Zotero's browser addon, then I export them into the corresponding bib file(s). Now, what I noticed is that various databases provide a variety of date formats, e.g. 2010, 2010-05, 2010-May; some provide separate year + month fields.

I think it has been highlighted earlier in some other bug reports that there should be some automatic process for unifying the essential/required fields, which is partially provided by the field formatters (but not ideally). IMHO the date field should be one of these, meaning an irregular date field should be mapped either to the year field if it contains only 2010, or to the year and month fields if it contains 2010-05 (while the day is missing). Even an automated 2010-00-00 would be better than what we have now, no?

Personally I would prefer using only the year field in the main required fields section (using biblatex format), but I guess the decision was made somewhere along the line in favor of the date field, which is now causing issues with the overall stability of JabRef when used with field formatters.
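
As a rough sketch of how the date formats mentioned above (2010, 2010-05, 2010-May) could be unified without relying on exceptions, the following uses a single regular expression. The class DateFieldNormalizer and its method names are hypothetical and this is not how JabRef's formatter is implemented; it only assumes hyphen-separated values with English month names.

```java
import java.time.Month;
import java.time.YearMonth;
import java.util.Locale;
import java.util.Optional;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not taken from the JabRef code base.
final class DateFieldNormalizer {

    // year, optionally "-month" (digits or English name), optionally "-day"
    private static final Pattern DATE =
            Pattern.compile("(\\d{4})(?:-(\\d{1,2}|[A-Za-z]{3,9})(?:-(\\d{1,2}))?)?");

    // Maps "5", "05", "May" or "MAY" to 5; empty if the token is not a month.
    private static OptionalInt monthNumber(String token) {
        if (Character.isDigit(token.charAt(0))) {
            int value = Integer.parseInt(token);
            return (value >= 1 && value <= 12) ? OptionalInt.of(value) : OptionalInt.empty();
        }
        String upper = token.toUpperCase(Locale.ROOT);
        for (Month month : Month.values()) {
            String full = month.name(); // e.g. "JANUARY"
            if (upper.equals(full) || upper.equals(full.substring(0, 3))) {
                return OptionalInt.of(month.getValue());
            }
        }
        return OptionalInt.empty();
    }

    // Returns "yyyy", "yyyy-MM" or "yyyy-MM-dd"; empty if the input is not recognized.
    static Optional<String> normalize(String raw) {
        Matcher matcher = DATE.matcher(raw.trim());
        if (!matcher.matches()) {
            return Optional.empty();
        }
        int year = Integer.parseInt(matcher.group(1));
        StringBuilder iso = new StringBuilder(matcher.group(1));
        if (matcher.group(2) != null) {
            OptionalInt month = monthNumber(matcher.group(2));
            if (!month.isPresent()) {
                return Optional.empty();
            }
            iso.append(String.format(Locale.ROOT, "-%02d", month.getAsInt()));
            if (matcher.group(3) != null) {
                int day = Integer.parseInt(matcher.group(3));
                // Reject days that do not exist in the given month.
                if (!YearMonth.of(year, month.getAsInt()).isValidDay(day)) {
                    return Optional.empty();
                }
                iso.append(String.format(Locale.ROOT, "-%02d", day));
            }
        }
        return Optional.of(iso.toString());
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"2010", "2010-05", "2010-May", "May 2010"}) {
            System.out.println(sample + " -> " + normalize(sample).orElse("(unrecognized)"));
        }
    }
}
```

Unrecognized values are reported as empty rather than thrown, so a caller could leave such a field untouched instead of paying for an exception per entry.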

calixtus commented 3 years ago

Hi @DOFfactory, we just merged a PR that should hopefully speed up the parsing of dates. Would you be so kind as to test the current main-branch build (we just renamed the master branch) and check whether it eases the problem? Please make a backup of your data before testing a development build.

Thanks!

ThiloteE commented 2 years ago

meta-issue: #8906

ThiloteE commented 2 years ago

To do: