Open YellowJacketLinux opened 8 months ago
This is basically what I recently commented on at https://tex.stackexchange.com/questions/707772/xelatex-fontspec-stylisticset-changes-underlying-unicode-characters-in-the-t#comment1759919_707772. It is rather hard to avoid unless we very fundamentally change how we output mappings to Unicode, along the lines of what we do in harf mode. We might have to consider doing that, though; in that case we might want to move it out of the mode-specific part and make parts of it generic. This will probably require rather heavy patching of the ConTeXt fontloader.
@u-fischer I'm guessing these things will become rather important from a tagpdf point of view?
@YellowJacketLinux For now you can avoid the issue by using HarfBuzz mode (by adding Renderer=HarfBuzz in fontspec).
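For illustration, a minimal sketch of that workaround might look like the following. It assumes TeX Gyre Termes is installed and findable by name, and it uses fontspec's Letters=UppercaseSmallCaps option to switch on c2sc; the sample text is only a placeholder and is not taken from the MWE.

```latex
% Compile with lualatex.
\documentclass{article}
\usepackage{fontspec}
% Renderer=HarfBuzz asks luaotfload to shape with HarfBuzz (harf mode)
% instead of the default node mode.
\setmainfont{TeX Gyre Termes}[Renderer=HarfBuzz]
\begin{document}
% Letters=UppercaseSmallCaps enables the c2sc OpenType feature.
% The text below is placeholder content for a copy/paste check.
{\addfontfeature{Letters=UppercaseSmallCaps}IHS XPS}
\end{document}
```

Copying the small-caps text out of the resulting PDF should then show whether upper-case code points are preserved.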
I can confirm the issue does not exist with Renderer=HarfBuzz.
Thank you.
I first reported this bug to fontspec https://github.com/latex3/fontspec/issues/497 but was told it's an engine issue.
I hope this is the right place.
I personally don't consider this high priority because things at least visually work.
The c2sc OpenType feature is supposed to use the small-caps variant of the lower-case letter where the upper-case letter is requested. The Unicode code point should still be that of the upper-case letter, so that copy and paste still produces an upper-case letter regardless of the font features in the document being pasted into.
See the MWE and copy/paste the strings into a text editor.
In the MWE overline example, I didn't use Greek for c2sc because TeX Gyre Termes doesn't have small-caps for Greek.
But the overline example shows how much better it is typographically to use c2sc with a nomen sacrum (especially if the small-caps are actually a little taller than x-height, although that's not shown).
If using a font with small-caps for Greek, one could even use the U+0305 combining character to make the overline, so that even the overline itself is copied and pasted. (TeX Gyre also doesn't have U+0305, but some Greek/Coptic fonts do, as both scripts historically use it frequently.) However, since LuaLaTeX is using lower-case code points with c2sc, what gets pasted would be lower-case letters and not the upper-case letters that nomina sacra traditionally use.
Even though things visually work, it's possible that the engine using lower-case code points also creates an issue for screen readers. In this use case (abbreviations) a text alt-tag should probably be used anyway, so perhaps it's not an accessibility issue here, but in some use cases it actually might be.
The MWE: