Open falschgeldkind opened 2 years ago
Would it be possible for you to provide the PDF that you imported (as long as there are no copyright issues)? You can also send the file to the JabRef maintainers in a private e-mail (web@jabref.org).
Well, the error message says the problem is a brace: org.jabref.logic.exporter.SaveException: Problems saving: java.io.IOException: Error in field 'AUTHOR of entry SINGLEFINandBUFFETINGALLEVIATION0656': Braces don't match. Field value: FOR SINGLE{FIN and BUFFETING ALLEVIATION
@ThiloteE Sorry due to copyright issues I cannot share the PDF file
But @Siedlerchr is correct.
Maybe you could just check the new entries after the OCR step for any braces and discard them? Maybe even leave that field empty (because it probably is fucked up OCR and the value does not make any sense anyway)
I also have made the observation that newline characters are sometimes included in the field values and lead to problems
Maybe you could share the metadata that gets imported into JabRef before saving then. There should not be any copyright on that, right?
I assume there is a brace too many in one of the fields, therefore JabRef detects this and gives the error. If you have many entries and routinely stumble upon odd number of braces in your PDFs, then you should check your workflow and the programs you use to create them. Instead of trying to repair wrong syntax, please make sure to also prevent wrong syntax being created in the first place.
{ and } are JabRef special characters and denote the beginning and end of a field. Imagine JabRef automatically removing an odd number of braces: text that comes after or before the brace that was removed will still be there, but not within braces, therefore will be out of place.
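To make the "braces don't match" condition concrete, here is a minimal sketch of such a check in Python. It is not JabRef's actual implementation; the function name braces_balanced is made up for illustration, and backslash-escaped braces are not treated specially.

```python
def braces_balanced(value: str) -> bool:
    """Return True if every '{' in value is closed by a later '}'."""
    depth = 0
    for ch in value:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:  # closing brace without a preceding opener
                return False
    return depth == 0


# The field value from the error message above has an opener that is never closed.
print(braces_balanced("FOR SINGLE{FIN and BUFFETING ALLEVIATION"))  # False
print(braces_balanced("{Africa} in International Politics"))        # True
```

A check like this only reports the problem; it does not decide which brace is the wrong one.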
I just did a short test of what JabRef does when there are no braces. Below is an example, but keep in mind that this is WRONG syntax and should be strictly avoided:
@Book{aii,
title = {Africa in International Politics},
groups = test,
}
after clicking on another entry in the main table, will somehow turn to:
@Book{aii,
title = {Africa in International Politics},
groups = {#test#},
}
So I guess having a mechanism that automatically removes odd number of braces upon saving would be ok, because there is a fall-back mechanism in place that tries to recreate fields?
Removing the odd number of braces would need to be done at a certain point in time, at which it is clear that the user is not working on the bibliographic entries anymore. Removing braces instantly, when users would want to ADD braces and have not finished adding the right amount of braces would be a detriment.
Btw.: JabRef's integrity check does not find {#test#}, which it probably should.
I do not create any PDFs; I'm just importing unlinked files (that already exist).
I think the OCR sometimes confuses normal braces with curly braces
Yes, the problem is that there sometimes are (curly) braces within the fields. Wouldn't it be enough to just delete all braces WITHIN fields or swap them out for normal braces?
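As a rough illustration of that idea, the following Python sketch swaps curly braces for parentheses and collapses newlines in a field value. It is an assumption about what such a clean-up step could look like, not something JabRef does; sanitize_ocr_field is a hypothetical name.

```python
import re


def sanitize_ocr_field(value: str) -> str:
    """Swap curly braces for parentheses and collapse whitespace runs (including newlines)."""
    value = value.replace("{", "(").replace("}", ")")
    return re.sub(r"\s+", " ", value).strip()


print(sanitize_ocr_field("FOR SINGLE{FIN and\nBUFFETING ALLEVIATION"))
# FOR SINGLE(FIN and BUFFETING ALLEVIATION
```

Note that this would also rewrite braces that were placed on purpose, for example to protect capitalization as in {NASA}, so it would only make sense for values that come straight out of the PDF import.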
I still don't know what OCR means.
Optical character recognition. Isn't that what the PDF import does to get the author etc. if it's not contained in the PDF metadata?
We don't know what is imported to JabRef and what it parses and what method it uses to import because you have failed to provide this data to us. Sorry :/ Troubleshooting this is not easy.
JabRef by default checks for data in the following order, when Grobid is enabled (but you also can disable Grobid):
1. Look for bibtex entry on first page of pdf
2. Look for embedded bib file
3. Grobid
4. XMP metadata
5. Attempt to find metadata on first page (not in bibtex format).
Source: https://discourse.jabref.org/t/extract-information-from-pdf-import/2899/6
If you say it parses optically, this then points to a Grobid issue: https://github.com/kermitt2/grobid/issues
the next time this problem turns up I'll make a screenshot or something like that and investigate with exiftool
All right. Got another one. There seems to be no metadata in this PDF:
exiftool -ee3 -U -G3:1 -api requestall=3 -api largefilesupport $DOCPATH/Druckschriften/DGLR/JT99_102.PDF
bley@DellOptiPlex-7010:~$
mhm I cannot believe the pdf is holding absolutely zero metadata. Try this:
1. Start the commandline on the folder holding the PDF(s)
2. exiftool.exe has to be in this folder
3. Use the following command:
exiftool -ee3 -U -G3:1 -api requestall=3 -api largefilesupport FILE
Also, please show the bibtex source tab. Sometimes what is shown in other tabs diverges from what is shown in the bibtex source.
But we already can see that there indeed is a curly brace opening, but not closing, so the immediate workaround would be to remove that curly brace or add another curly brace to close the argument and the error should be gone.
I use linux. So no exe :D
how do I get the bibtex source tab?
Ah right. My bad.
The {} biblatex source tab is one of the tabs of the entry editor in JabRef.
another one with a similar error:
bley@DellOptiPlex-7010:/home/pfisun8n/allgem0/08_Literatur/Dokumente/Druckschriften/EUCASS$ exiftool -ee3 -U -G3:1 -api requestall=3 -api largefilesupport a169.pdf
bley@DellOptiPlex-7010:/home/pfisun8n/allgem0/08_Literatur/Dokumente/Druckschriften/EUCASS$
the source tab says this:
Error in field 'AUTHOR of entry ГпуезИаНоп1310': Braces don't match. Field value: Ап ГпуезИ^аНоп and оГ Бупапнс and о( Огйегей and 81гис1иге ш Ехсйей and Зе1 изш РГУ and - Ъазес and рЬазе- and ауега§1П§ {есЬшцие.
Correct the entry, and reopen editor to display/edit source.
The immediate workaround would be to remove that curly brace or add another curly brace to close the argument, and the error should be gone.
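For illustration, that manual fix could be sketched in Python as follows. This is only a sketch under the assumption that dropping stray closing braces and appending missing ones is acceptable for the value at hand; balance_braces is a hypothetical helper, not part of JabRef.

```python
def balance_braces(value: str) -> str:
    """Drop closing braces that have no opener and append any missing closing braces."""
    out = []
    depth = 0
    for ch in value:
        if ch == "{":
            depth += 1
        elif ch == "}":
            if depth == 0:
                continue  # stray closing brace: skip it
            depth -= 1
        out.append(ch)
    return "".join(out) + "}" * depth


print(balance_braces("FOR SINGLE{FIN and BUFFETING ALLEVIATION"))
# FOR SINGLE{FIN and BUFFETING ALLEVIATION}
```

The result is syntactically valid, but for garbled OCR output the field content itself will usually still need to be corrected by hand.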
Another workaround would be to change your workflow: have the bibliographic data in JabRef first and then use Quality > automatically set file links (F7). The advantage of this method is that you can add the correct bibliographic data manually, or download or import it from somewhere else, and then just attach the PDF to the correct data. In other words: you will have less work correcting wrong bibliographic data that was parsed from the PDF, because the parsing is far from perfect, as you can see.
If you don't have bibliographic data at hand, you can also disable Grobid and import XMP metadata via file > import > ... first. Choose the following:
Afterwards you can then automatically set file links (F7). I personally use a regex to find files in my system, but you also can use the citationkey or name your files after the DOI. There are nice preferences:
Of course, if you do not have bibliographic data at hand at all and if there is no metadata attached to the pdf, the second workaround may not work well for you.
Thank you.
I do not have the bibliographic data unfortunately
JabRef version
Other (please describe below)
Operating system
GNU / Linux
Details on version and operating system
No response
Checked with the latest development build
Steps to reproduce the behaviour
JabRef version: latest (main) dev build (today), but this bug exists in previous versions as well.
Saving throws an exception after importing PDF files. It probably has to do with OCR reading some special characters like { } \n etc. Trying to save them then leads to a situation where JabRef can't decide where the entry ends, or something like that.
To reproduce:
Exception in the appendix
Appendix
...