retorquere / zotero-better-bibtex

Make Zotero effective for us LaTeX holdouts
https://retorque.re/zotero-better-bibtex/
MIT License

[Bug]: "Export unicode as plain-text" is ignored in the "keywords" field #2508

Closed miromarszal closed 1 year ago

miromarszal commented 1 year ago

Debug log ID

9CDHCRQV-refs-euc

What happened?

This happens on Windows; on Linux it works as expected. This publication contains German Unicode characters in the journal name and in the keywords. They are correctly substituted in the journal name, but they remain unconverted in the keywords.

I don't know if it really affects anything, since keywords are usually not printed in a bibliography. I only noticed it because I sync the file between a Linux and a Windows machine with git, which results in spurious diffs. (As a side note, another difference I sometimes see is letters appended to a BibTeX key on one machine and not on the other.)

retorquere commented 1 year ago

Can you copy the keyword from Zotero, paste it into Notepad, and attach the text file here? There may be encoding differences between the two platforms. BBT doesn't do anything platform-specific during exports, so these characters must be in a form that is not in my matching tables.
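To illustrate the kind of difference suspected here, the following is a minimal sketch, assuming the cause is Unicode normalization (the same letter stored precomposed versus as a base letter plus a combining mark). The `toLatex` table and the `convert` function are hypothetical stand-ins for a character lookup, not BBT's actual matching tables or code.

```javascript
// Two visually identical spellings of "ü":
const composed = '\u00FC'     // U+00FC, precomposed form (NFC)
const decomposed = 'u\u0308'  // "u" followed by U+0308 COMBINING DIAERESIS (NFD)

// Hypothetical lookup table that only knows the precomposed form.
// This is an illustration, not BBT's real table.
const toLatex = new Map([['\u00FC', '{\\"u}']])

// Replace every code point found in the table; leave the rest untouched.
function convert(text) {
  return Array.from(text, ch => toLatex.get(ch) ?? ch).join('')
}

const results = {
  sameString: composed === decomposed,               // the two strings are not equal
  composed: convert(composed),                       // found in the table
  decomposed: convert(decomposed),                   // not found, passes through unchanged
  normalized: convert(decomposed.normalize('NFC')),  // found after normalizing to NFC
}
console.log(results)
```

If this is what is going on, the two forms look the same on screen but only one of them would be matched, which would explain a field that is converted in one place and not in another.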

WRT the citekey postfixes: citekeys are only locally unique. Sometimes a local change claims a citekey that would later be assigned to an item that gets synced in, and that item then gets a postfix to keep the keys unique. To get globally unique citekeys, do select-all, right-click, Better BibTeX, pin citekeys, and then turn on https://retorque.re/zotero-better-bibtex/installation/preferences/#automatically-pin-citation-key-after

miromarszal commented 1 year ago

Thanks for the suggestion, I'll try that. So far refreshing the key worked most of the time.

You can find the offending keyword attached.

keyword.txt

retorquere commented 1 year ago

On both platforms: can you open Tools -> Developer -> Run Javascript, check the "run as async" checkbox, enter the following in the left-hand side of the window:

return Zotero.DB.valueQueryAsync('PRAGMA encoding')

and click Run.

The right side of the window will probably display "UTF-8" on Linux, but I am curious whether it will show something else on Windows. Can you also select that entry in Zotero on Windows, right-click it, export as RDF, and zip and attach it here?

retorquere commented 1 year ago

Please also run the following on both platforms:

let utf8Encode = new TextEncoder();
return { bytes: utf8Encode.encode("<paste tag here>"), chars: "<paste tag here>".split('').map(c => [c, c.charCodeAt(0)]) }

retorquere commented 1 year ago

I can't reproduce the problem, so I'll need those things for diagnosis.
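For reference, here is a sketch of what the character diagnostic requested above reports, wrapped in a function so it can also be run outside Zotero (for example in Node.js). The `inspect` helper and the two sample strings are assumptions made for illustration; they are not the keyword from this report.

```javascript
// Same diagnostic as the snippet above, wrapped in a hypothetical helper.
function inspect(tag) {
  const utf8Encode = new TextEncoder()
  return {
    // UTF-8 bytes, converted to a plain array for easy comparison
    bytes: Array.from(utf8Encode.encode(tag)),
    // each UTF-16 code unit paired with its numeric value
    chars: tag.split('').map(c => [c, c.charCodeAt(0)]),
  }
}

// Sample inputs (assumed): precomposed "ü" versus "u" + combining diaeresis.
const composedTag = inspect('\u00FC')
const decomposedTag = inspect('u\u0308')

console.log(composedTag)
console.log(decomposedTag)
```

If the outputs on the two platforms differ in the way these two samples do, the keyword is stored in different forms; if they are identical, the stored text is the same and the cause lies elsewhere.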

retorquere commented 1 year ago

Closing for inactivity

miromarszal commented 1 year ago

Sorry for the delay. I don't have access to the Windows machine on a regular basis. Today I wanted to submit all the data you asked for, and all the outputs seem to be identical on both systems. Then I checked if the problem is still there, and, lo and behold, everything is fine. I literally haven't touched that Win machine since the last time, so I have no idea what might have changed. Anyways, sorry for all that fuss.