Closed ImperialSquid closed 2 months ago
Thanks for letting me know. The problem you mentioned is an issue with Zotero and is not related to this package. I'm afraid there is nothing I can do.
Turns out, I jumped the gun and didn't do my research, whoops...
Unicode has several normalisation forms (accessible through String.prototype.normalize()). Zotero consistently normalises to NFD (canonically decomposed into a base character plus combining marks), whereas what I needed was NFC (decomposed, then recomposed into precomposed characters where possible; this is the default for normalize()).
So if anything, the two-character version, while not true to the original text, is consistent within Zotero, and my _iframeWindow.getSelection() version is a hacky workaround that skips Zotero's normalisation...
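For anyone who hits the same thing, here is a minimal sketch of the difference between the two forms, using only the standard String.prototype.normalize(). The variable names are just for illustration, and the strings are hard-coded stand-ins rather than real output from Zotero.

```javascript
// Decomposed form: "A" followed by COMBINING RING ABOVE (two code points).
const decomposed = "\u0041\u030A";
// Precomposed form: LATIN CAPITAL LETTER A WITH RING ABOVE (one code point).
const precomposed = "\u00C5";

// The two strings render identically but are not equal as raw strings.
const rawEqual = decomposed === precomposed;

// normalize() with no argument uses NFC, which composes where possible.
const toNFC = decomposed.normalize();
// NFD goes the other way and splits precomposed characters apart.
const toNFD = precomposed.normalize("NFD");

console.log(rawEqual, toNFC === precomposed, toNFD === decomposed);
// prints: false true true
```

So if a plugin needs the single-character form, calling .normalize("NFC") on whatever text it receives is probably a cleaner option than bypassing the reader's own selection API.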
Sorry for the false alarm lol I'll be sure to be more careful in the future 😅
This is probably not an issue for you to fix, but just to let you know: the current version of Reader.getSelectedText() returns text whose Unicode encoding differs from the original selection.
First reported on my plugin here
When selecting text with diacritics (e.g. "Å"), Reader.getSelectedText() incorrectly returns a string of two Unicode code points (U+0041 (LATIN CAPITAL LETTER A) and U+030A (COMBINING RING ABOVE)).
Whereas getting the selected text with something like a direct _iframeWindow.getSelection() call correctly returns a one-character string (U+00C5 (LATIN CAPITAL LETTER A WITH RING ABOVE)).
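To check which form a given string is in, something like the following sketch can list its code points. The codePoints helper is a hypothetical name made up for this example, not part of Zotero or the toolkit, and the inputs are hard-coded rather than taken from a real selection.

```javascript
// Hypothetical helper: list the code points of a string as U+XXXX labels.
function codePoints(text) {
  // Array.from iterates a string by code point, not by UTF-16 code unit.
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

const twoPart = codePoints("\u0041\u030A"); // base letter plus combining mark
const onePart = codePoints("\u00C5");       // single precomposed letter

console.log(twoPart.join(" ")); // prints: U+0041 U+030A
console.log(onePart.join(" ")); // prints: U+00C5
```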
Also passed on to Zotero here
Since this is seemingly caused by Zotero itself and your toolkit doesn't do any form of re-encoding/etc, it's probably best fixed by them. But just in case it doesn't get fixed for some reason, I thought I'd let you know.