Add several unicode character codes that t1enc.dfu lacks

chbrown commented 7 years ago

I've lately run into several cases of pdflatex complaining about (and not rendering) undefined Unicode characters, because biber writes UTF-8 output by default.

AFAICT, this is because t1enc.dfu only defines UTF-8 character code translations for a small subset of the characters that LaTeX can produce. I'm not sure why it's not more comprehensive.

This may not be the proper fix. Perhaps biber can be configured, via the biblatex-sp-unified .bbx / .cbx styles, to output TeX-friendly ASCII? But I like UTF-8, and I imagine the biber developers had some reason for choosing UTF-8 as the default, so this seems like the clearest fix, assuming the .bbl file is UTF-8-encoded.

Thoughts, @fintelkai ?

chbrown commented 7 years ago

Perhaps a better intermediate solution (prior to a wholesale migration to XeLaTeX or LuaTeX) is to call biber --output-safechars sp_article_900 instead, which biber --help documents as:

Try to convert UTF-8 chars into LaTeX macros when writing the output. This can prevent unknown char errors when using PDFLaTeX and inputenc as this doesn't understand all of UTF-8. Note, it is better to switch to XeTeX or LuaTeX to avoid this situation. By default uses the --output_safecharsset "base" set of characters. The legacy option --bblsafechars is supported as an alias.
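
For reference, a manual build with that flag might look like the sketch below. The build_safechars function name and the pass order (pdflatex, biber, then two more pdflatex passes) are my assumptions, not something taken from the repo; only the --output-safechars flag and the sp_article_900 job name come from the comment above.

```shell
# Hypothetical helper: build a document while asking biber to write
# LaTeX macros instead of raw UTF-8 characters into the .bbl file.
build_safechars() {
  job="$1"                              # job name without extension
  pdflatex "$job" &&                    # first pass writes the .bcf control file for biber
  biber --output-safechars "$job" &&    # convert UTF-8 chars to LaTeX macros in the .bbl
  pdflatex "$job" &&                    # pick up the bibliography
  pdflatex "$job"                       # settle cross-references
}
```

Calling build_safechars sp_article_900 would then run the whole sequence and stop at the first failing step.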

chbrown commented 7 years ago

For now, I think 'dumbing down' biber's output is the proper solution, since our LaTeX pipeline isn't UTF-8 ready at all entry points. I've added the requisite latexmk config argument on http://info.semprag.org/install.
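
The exact argument isn't quoted in this thread. As a sketch of one way to get the same effect, assuming latexmk's $biber setting (whose default, as far as I know, is 'biber %O %S'), a project-local latexmkrc could carry the flag:

```shell
# Assumption: latexmk reads a file named latexmkrc in the working directory
# and uses $biber as the command template for its biber runs.
# Note: this overwrites any existing latexmkrc in the current directory.
cat > latexmkrc <<'EOF'
$biber = 'biber --output-safechars %O %S';
EOF
```

With that file in place, running latexmk -pdf sp_article_900 should invoke biber with --output-safechars on each bibliography pass.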

Closing this issue, but I'll leave this branch in place for reference.

semprag / tex

Add several unicode character codes that t1enc.dfu lacks #10