JabRef / jabref

Graphical Java application for managing BibTeX and biblatex (.bib) databases
https://devdocs.jabref.org
MIT License

error Input byte array has wrong 4-byte ending unit #11117

Closed ilippert closed 7 months ago

ilippert commented 7 months ago

JabRef version

Other (please describe below)

Operating system

Other (please describe below)

Details on version and operating system

JabRef 5.13--2024-03-31--0d97382 Linux 6.7.9-200.fc39.x86_64 amd64 Java 21.0.2 JavaFX 22+30

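Checked with the latest development build (copy version output from About dialog)

  • I made a backup of my libraries before testing the latest development version.
  • I have tested the latest development version and the problem persists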

Steps to reproduce the behaviour

  1. open my library file, provided upon request ;)

Suggestion: provide a more intelligible error message.

Appendix

...

Log File

```
java.lang.IllegalArgumentException: Input byte array has wrong 4-byte ending unit
    at java.base/java.util.Base64$Decoder.decode0(Unknown Source)
    at java.base/java.util.Base64$Decoder.decode(Unknown Source)
    at java.base/java.util.Base64$Decoder.decode(Unknown Source)
    at org.jabref@5.13.326/org.jabref.logic.importer.fileformat.BibtexParser.parseField(Unknown Source)
    at org.jabref@5.13.326/org.jabref.logic.importer.fileformat.BibtexParser.parseEntry(Unknown Source)
    at org.jabref@5.13.326/org.jabref.logic.importer.fileformat.BibtexParser.parseAndAddEntry(Unknown Source)
    at org.jabref@5.13.326/org.jabref.logic.importer.fileformat.BibtexParser.parseFileContent(Unknown Source)
    at org.jabref@5.13.326/org.jabref.logic.importer.fileformat.BibtexParser.parse(Unknown Source)
    at org.jabref@5.13.326/org.jabref.logic.importer.fileformat.BibtexImporter.importDatabase(Unknown Source)
    at org.jabref@5.13.326/org.jabref.logic.importer.fileformat.BibtexImporter.importDatabase(Unknown Source)
    at org.jabref@5.13.326/org.jabref.logic.importer.OpenDatabase.loadDatabase(Unknown Source)
    at org.jabref@5.13.326/org.jabref.gui.importer.actions.OpenDatabaseAction.loadDatabase(Unknown Source)
    at org.jabref@5.13.326/org.jabref.gui.importer.actions.OpenDatabaseAction.lambda$openTheFile$1(Unknown Source)
    at org.jabref@5.13.326/org.jabref.gui.util.BackgroundTask$1.call(Unknown Source)
    at org.jabref@5.13.326/org.jabref.gui.util.DefaultTaskExecutor$1.call(Unknown Source)
    at org.jabref.merged.module@5.13.326/javafx.concurrent.Task$TaskCallable.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
```

koppor commented 7 months ago

Please send the bib file to me

Siedlerchr commented 7 months ago

Can you please send the file to web@jabref.org?

Can you check the file encoding in another editor? This looks like an issue with the byte order mark (BOM).


Siedlerchr commented 7 months ago

The error could be related to the check for a BibDesk file field. Do you have a field starting with bdsk-file- somewhere in your BibTeX file?

https://github.com/JabRef/jabref/blob/61120e40516312949e33b08f644077cbba166f31/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java#L745-L749
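
For context, the exception in the log is thrown by `java.util.Base64` when a field value is not well-formed Base64. The following is a minimal sketch, not JabRef's actual code, of how such a decode could be guarded so that the error names the offending field. The class name, the helper `decodeBdskField`, and the message wording are hypothetical.

```java
import java.util.Base64;
import java.util.Optional;

class BdskFieldDecoding {
    /** Hypothetical helper: decodes a bdsk-file-N value, or returns empty if it is not valid Base64. */
    static Optional<byte[]> decodeBdskField(String fieldName, String value) {
        try {
            return Optional.of(Base64.getDecoder().decode(value));
        } catch (IllegalArgumentException e) {
            // Name the failing field instead of letting the raw decoder exception escape
            System.err.println("Could not decode field '" + fieldName + "': " + e.getMessage());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeBdskField("bdsk-file-1", "AAEC").isPresent());  // well-formed input
        System.out.println(decodeBdskField("bdsk-file-1", "////=").isPresent()); // malformed input
    }
}
```

Returning an empty `Optional` lets a caller skip one broken field and continue with the rest of the entry. Whether skipping or aborting is the right policy is a design choice this sketch does not settle.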

ilippert commented 7 months ago

indeed, one of the entries that seems to create problems includes

  bdsk-file-1      = {/8LSCsAAAAiPvQfR3VzdGVyc29uLkgxOTk3IFN0dWQjMjI1Njc4LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACJWeMuc6u8AAAAAAAAAAAADAAIAAAkgAAAAAAAAAAAAAAAAAAAABlBhcGVycwAQAAgAAM6T8PsAAAARAAgAAMuczs8AAAABABAAIj70ACI01wAh1dsAB8NgAAIAUE1hY2ludG9zaCBIRDpVc2VyczoAaWxpcHBlcnQ6AERvY3VtZW50czoAUGFwZXJzOgBHdXN0ZXJzb24uSDE5OTcgU3R1ZCMyMjU2NzgucGRmAA4AVAApAEcAdQBzAHQAZQByAHMAbwBuAC4ASAAxADkAOQA3ACAAUwB0AHUAZAB5AGkAbgBnACAAdQBwACAAcgBlAHYAaQBzAGkAdABlAGQALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASAElVc2Vycy9pbGlwcGVydC9Eb2N1bWVudHMvUGFwZXJzL0d1c3RlcnNvbi5IMTk5NyBTdHVkeWluZyB1cCByZXZpc2l0ZWQucGRmAAATAAEvAAAVAAIAD///AACABtIbHB0eWiRjbGFzc25hbWVYJGNsYXNzZXNdTlNNdXRhYmxlRGF0YaMdHyBWTlNEYXRhWE5TT2JqZWN00hscIiNcTlNEaWN0aW9uYXJ5oiIgXxAPTlNLZXllZEFyY2hpdmVy0SYnVHJvb3SAAQAIABEAGgAjAC0AMgA3AEAARgBNAFUAYABnAGoAbABuAHEAcwB1AHcAhACOAMoAzwDXAs8C0QLWAuEC6gL4AvwDAwMMAxEDHgMhAzMDNgM7AAAAAAAAAgEAAAAAAAAAKAAAAAAAAAAAAAAAAAAAAz0=},

gosh, the legacy of the era of using bibdesk a decade ago.

Siedlerchr commented 7 months ago

Yes, JabRef is now able to parse BibDesk files and group(s). I improved the error handling so that it shows the related issue: https://github.com/JabRef/jabref/pull/11118

ilippert commented 7 months ago

Sorry if I misuse the ticket for this. Does this BibDesk reading by the way extend to Skim's skim note files?

Siedlerchr commented 7 months ago

I have never heard of this program before and have no idea how it works, but if it creates annotations (e.g. highlighted text or comments) in PDF files, JabRef can read these (tab "File Annotations").

[screenshot]

macOS Preview PDF:

[screenshot]
ilippert commented 7 months ago

Hah, as the wiki might put it: once upon a time, Skim was a PDF reader that featured non-PDF-compliant annotations :(

Skim was/is developed by the BibDesk team and together it worked very well.

https://en.m.wikipedia.org/wiki/Skim_(software)


Siedlerchr commented 7 months ago

According to their FAQ, they save the notes in the extended file attributes on the file system (crazy), so even transferring the files to a different computer would lose the info... However, it seems you can export them somehow:

https://sourceforge.net/p/skim-app/wiki/FAQ/

How can I save the PDF so that notes are visible in other viewers, such as Preview and Acrobat?

Save a copy of the file with the notes included in the PDF. Choose Export... from the File menu and select PDF from the File Format popup button and select the With Embedded Notes option. Notes and highlights are now visible in other viewers, such as Preview and Acrobat Reader. Alternatively, you can Print to a file. Go to the print dialog window (command+P) and choose Print to PDF. However with both techniques, you won't be able to edit the notes and highlights in the exported copy.

ilippert commented 7 months ago

Thanks, yeah. I was never able to automate that, and I transitioned away from OS X a decade ago.

Maybe relevant as a use-case story: I keep the bdsk file references in the entry just as an indicator that there might be old notes left. And if these promise a nice secret, I power on my legacy Mac to access these files. But otherwise the bdsk file references are quite nonfunctional for me.

Am so happy I switched to JabRef and plain text annotations.


ilippert commented 7 months ago

Here is another line that causes an error

  bdsk-file-1 = {////=},

and this one

  bdsk-file-1 = {YnBsaXN0MDDUAQIDBAUGJCVYJHZlcnNpb25YJG9iamVjdHNZJGFyY2hpdmVyVCR0b3ASAAGGoKgHCBMUFRYaIVUkbnVsbNMJCgsMDxJXTlMua2V5c1pOUy5vYmplY3RzViRjbGFzc6INDoACgAOiEBGABIAFgAdccmVsYXRpdmVQYXRoWWFsaWFzRGF0YV8QVi4uLy4uLy4uL1BhcGVycy9Bc2hlaW0yMDA1IFRoZSBHZW9ncmFwaHkgb2YgSW5ub3ZhdGlvbiBSZWdpb25hbCBJbm5vdmF0aW9uIFN5c3RlbXMucGRm0hcLGBlXTlMuZGF0YU8RAkoAAAAAAkoAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAM6T/wtIKwAAACI+9B9Bc2hlaW0yMDA1IFRoZSBHZW9nciMyMjQ4QzkucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIkjJw6jvRAAAAAAAAAAAAAMAAgAACSAAAAAAAAAAAAAAAAAAAAAGUGFwZXJzABAACAAAzpPw+wAAABEACAAAw6jhNAAAAAEAEAAiPvQAIjTXACHV2wAHw2AAAgBQTWFjaW50b3NoIEhEOlVzZXJzOgBpbGlwcGVydDoARG9jdW1lbnRzOgBQYXBlcnM6AEFzaGVpbTIwMDUgVGhlIEdlb2dyIzIyNDhDOS5wZGYADgCOAEYAQQBzAGgAZQBpAG0AMgAwADAANQAgAFQAaABlACAARwBlAG8AZwByAGEAcABoAHkAIABvAGYAIABJAG4AbgBvAHYAYQB0AGkAbwBuACAAUgBlAGcAaQBvAG4AYQBsACAASQBuAG4AbwB2AGEAdABpAG8AbgAgAFMAeQBzAHQAZQBtAHMALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASAGZVc2Vycy9pbGlwcGVydC9Eb2N1bWVudHMvUGFwZXJzL0FzaGVpbTIwMDUgVGhlIEdlb2dyYXBoeSBvZiBJbm5vdmF0aW9uIFJlZ2lvbmFsIElubm92YXRpb24gU3lzdGVtcy5wZGYAEwABLwAAFQACAA///wAAgAbSGxwdHlokY2xhc3NuYW1lWCRjbGFzc2VzXU5TTXV0YWJsZURhdGGjHR8gVk5TRGF0YVhOU09iamVjdNIbHCIjXE5TRGljdGlvbmFyeaIiIF8QD05TS2V5ZWRBcmNoaXZlctEmJ1Ryb290gAEACAARABoAIwAtADIANwBAAEYATQBVAGAAZwBqAGwAbgBxAHMAdQB3AIQAjgDnAOwA9ANCA0QDSQNUA10DawNvA3YDfwOEA5EDlAOmA6kDrgAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAOw},
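
The short value `////=` shows why the decoder complains about the "ending unit": Base64 text is read in units of four characters, and here a lone `=` follows a complete unit. The sketch below assumes only the standard `java.util.Base64` basic decoder; the class and method names are made up for illustration. It should reproduce the message from the log for the second input.

```java
import java.util.Base64;

class EndingUnitDemo {
    /** Returns the decoder's error message, or null if the input decodes cleanly. */
    static String decodeError(String base64) {
        try {
            Base64.getDecoder().decode(base64);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeError("////"));  // one complete 4-character unit
        System.out.println(decodeError("////=")); // stray '=' after a complete unit
    }
}
```
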
Siedlerchr commented 7 months ago

You can use this version, which prints out more warnings and errors: https://builds.jabref.org/pull/11118/merge

ilippert commented 7 months ago

Several things "wrong" with my data in this version

JabRef 5.13-PullRequest11118.330--2024-03-31--0c344a9 Linux 6.7.9-200.fc39.x86_64 amd64 Java 21.0.2 JavaFX 22+30

Please check your library file for wrong syntax.

Error occurred when parsing entry: 'Error in line 64104: Expected { or ( but received s'. 

JabRef skipped the entry.

this seems to be caused by the @ in this line

```
annote       = {read ... E.g. @supermarket: shopper learns about a new class of products in a supermarket visit. this alters the setting when she visits the supermarket next time. 152: supermarket is physically designed (by actors with specific intentions); paths of shoppers related to their expectations on how the supermarket is structured. },
```

Though, this seems to be unrelated to this issue.
ilippert commented 7 months ago

> You can use this version https://builds.jabref.org/pull/11118/merge It will print out all more warnings/errors

In fact, no, it does not print out more warnings/info. I tried to copy my roughly 7k entries into an empty library, and only around 0.5k were imported; the other 6.5k entries are seemingly ignored without any notice.

ilippert commented 7 months ago

However, the original file does generate these messages

Please check your library file for wrong syntax.

Error occurred when parsing entry: 'Could not parse Bibdesk files content (bdsk-file..) for entry @article{Haluza-delay,
  abstract = {Research on social movements has looked primarily at activists involved in campaigns. Since the environmental movement has maintained that the everyday lifestyle of the citizen is part of the environmental problem and part of the solution, we would do well to examine also these lifestyle practices and what generates them. Using tools from Bourdieu's sociological method, this ethnographic study considers how environmental ``logic of practice'' is informed by habitus. A logic of practice is the ``feel'' for living (sens pratique) generated by internalized and ``pre-logical'' dispositions (habitus) and the social field. Another approach to explaining the operations of social movements, particularly for members, is that of ``cognitive praxis.'' In this formulation by Eyerman and Jamison, social movements create new knowledge systems. This research assesses the environmental habitus of environmentally-active persons in a region, finding several common dispositions amidst the great variety of ways of being environmentally active. These individuals tried to live in environmentally responsible ways, but were keenly aware of their inconsistencies. Being different than the dominant ways of being in contemporary society, they engaged in a variety of practices to ``self-dispose'' or non-cognitively support their environmental dispositions. However, their place in contemporary society where a routinized environmental sensitivity is contrary to the dominant or mainstream logic of practice, led to increased self-awareness. Thus, an environmental habitus could be said to include reflexivity, which appears to contradict the ``pre-logical'' description of the habitus. Reflexivity is a core part of being environmentally active, and participates in developing movement identity. The paper concludes by explaining the link between sens pratique and cognitive praxis, thereby advancing social movement theory.},
  annote = {in http://disccrs.org/dissertation_abstract?abs_id=1507},
  author = {Randolph Haluza-DeLay},
  title = {Habitus and cognitive praxis among environmentalists},
  _jabref_shared = {sharedId: -1, version: 1}
}'. 

JabRef skipped the entry.
Error occurred when parsing entry: 'Could not parse Bibdesk files content (bdsk-file..) for entry @article{Anonymous:2001p1756,
  author = {Marjolein van Asselt},
  subtitle = {From Problem to Challenge},
  title = {UNCERTAINTY IN DECISION-SUPPORT},
  _jabref_shared = {sharedId: -1, version: 1}
}'. 

JabRef skipped the entry.
Error occurred when parsing entry: 'Could not parse Bibdesk files content (bdsk-file..) for entry @article{Gusterson1997,
  author = {Hugh Gusterson},
  date = {1997-05},
  doi = {10.1525/pol.1997.20.1.114},
  journaltitle = {PoLAR: Political and Legal Anthropology Review},
  number = {1},
  pages = {114--119},
  title = {Studying up revisited},
  volume = {20},
  _jabref_shared = {sharedId: -1, version: 1}
}'. 

JabRef skipped the entry.
ilippert commented 7 months ago

the seemingly final culprit

  bdsk-file-1 = {YnBsaXN0MDDUAQIDBAUGJCVYJ HZlcnNpb25YJG9iamVjdHNZJGFyY2hpdmVyVCR0b3ASAAGGoKgHCBMUFRYaIVUkbnVsbNMJCgsMDxJXTlMua2V5c1pOUy5vYmplY3RzViRjbGFzc6INDoACgAOiEBGABIAFgAdccmVsYXRpdmVQYXRoWWFsaWFzRGF0YV8QOC4uLy4uLy4uL1BhcGVycy8yMDAxQXNzZWx0VW5jZXJ0YWludHlEZWNpc2lvblN1cHBvcnQucGRm0hcLGBlXTlMuZGF0YU8RAfAAAAAAAfAAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAM6T/wtIKwAAACI+9B8yMDAxQXNzZWx0VW5jZXJ0YWludCMyMjQxNkQucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIkFtuUDgXgAAAAAAAAAAAAMAAgAACSAAAAAAAAAAAAAAAAAAAAAGUGFwZXJzABAACAAAzpPw+wAAABEACAAAuUDEPgAAAAEAEAAiPvQAIjTXACHV2wAHw2AAAgBQTWFjaW50b3NoIEhEOlVzZXJzOgBpbGlwcGVydDoARG9jdW1lbnRzOgBQYXBlcnM6ADIwMDFBc3NlbHRVbmNlcnRhaW50IzIyNDE2RC5wZGYADgBSACgAMgAwADAAMQBBAHMAcwBlAGwAdABVAG4AYwBlAHIAdABhAGkAbgB0AHkARABlAGMAaQBzAGkAbwBuAFMAdQBwAHAAbwByAHQALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASAEhVc2Vycy9pbGlwcGVydC9Eb2N1bWVudHMvUGFwZXJzLzIwMDFBc3NlbHRVbmNlcnRhaW50eURlY2lzaW9uU3VwcG9ydC5wZGYAEwABLwAAFQACAA///wAAgAbSGxwdHlokY2xhc3NuYW1lWCRjbGFzc2VzXU5TTXV0YWJsZURhdGGjHR8gVk5TRGF0YVhOU09iamVjdNIbHCIjXE5TRGljdGlvbmFyeaIiIF8QD05TS2V5ZWRBcmNoaXZlctEmJ1Ryb290gAEACAARABoAIwAtADIANwBAAEYATQBVAGAAZwBqAGwAbgBxAHMAdQB3AIQAjgDJAM4A1gLKAswC0QLcAuUC8wL3Av4DBwMMAxkDHAMuAzEDNgAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAM4},

thanks Christoph for the build with the warnings. helped a lot.
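
The value above contains a space after `YnBsaXN0MDDUAQIDBAUGJCVYJ`. If that space is really in the .bib file (and not a copy-paste artifact), it alone would make the strict decoder fail, because a space is not part of the Base64 alphabet. As an illustration only, and not a description of what JabRef does, the sketch below contrasts the strict basic decoder with the standard MIME decoder, which skips characters outside the alphabet. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class WhitespaceDemo {
    /** True if the strict (basic) decoder accepts the value. */
    static boolean strictAccepts(String value) {
        try {
            Base64.getDecoder().decode(value);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /** Decodes with the MIME decoder, which ignores characters outside the Base64 alphabet. */
    static byte[] lenientDecode(String value) {
        return Base64.getMimeDecoder().decode(value);
    }

    public static void main(String[] args) {
        String withSpace = "aGVs bG8="; // Base64 of "hello" with a space inserted
        System.out.println(strictAccepts(withSpace));
        System.out.println(new String(lenientDecode(withSpace), StandardCharsets.UTF_8));
    }
}
```

A lenient decoder hides the damage rather than reporting it, so it is a trade-off: convenient for legacy data, but it can also mask a value that was corrupted in other ways.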

ilippert commented 7 months ago

I consider this closed with the https://github.com/JabRef/jabref/pull/11118 fix as it helps the user to locate the error

Siedlerchr commented 7 months ago

Thanks for your feedback. We will check the annote field parsing as well. I guess JabRef expects that the @ belongs to a cite key, but the field should probably be parsed verbatim. We will need to check the spec as well.

We will check the copy-pasting as well. It seems there is different parsing logic/error handling involved.

A bit of background: the BibDesk file fields are Base64-encoded plist files.
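
To make that background concrete: binary property lists begin with the ASCII magic `bplist00`, so a decoded field can be sanity-checked before any further parsing. This is a rough sketch under the assumption that the fields hold binary plists, as the `YnBsaXN0MDD...` values in this thread suggest. The class and method names are hypothetical and not part of JabRef.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

class BplistCheck {
    private static final byte[] MAGIC = "bplist00".getBytes(StandardCharsets.US_ASCII);

    /** True if the Base64 text decodes to bytes that begin with the binary-plist magic. */
    static boolean looksLikeBinaryPlist(String base64) {
        byte[] decoded;
        try {
            decoded = Base64.getDecoder().decode(base64);
        } catch (IllegalArgumentException e) {
            return false; // not valid Base64 at all
        }
        return decoded.length >= MAGIC.length
                && Arrays.equals(Arrays.copyOf(decoded, MAGIC.length), MAGIC);
    }

    public static void main(String[] args) {
        // First 12 characters of a bdsk-file-1 value quoted earlier in this thread
        System.out.println(looksLikeBinaryPlist("YnBsaXN0MDDU"));
        System.out.println(looksLikeBinaryPlist("////="));
    }
}
```

A check like this only confirms the header; actually reading the file path out of the plist would need a plist parser, which is outside the scope of this sketch.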