TeX-Live / texlive-source

source part of the TeX Live subversion repository - for issues please contact the tex-k mailing list at tug.org
257 stars 66 forks source link

get_real in cff_dict.c fails on some locales #18

Open simoncozens opened 5 years ago

simoncozens commented 5 years ago

See simoncozens/libtexpdf#8 and simoncozens/sile#573.

cff_dict.c parses real numbers using the strtod library function. However, when the locale is set to one where LC_NUMERIC sets the decimal point to something other than ., get_real fails because it is expecting the decimal point to be the period character.

We are planning to fix this in libtexpdf by providing a locale-independent strtod, but you will probably also want to fix this here too.

kberry commented 5 months ago

Hi Simon - thanks for this report from five years ago, which I'm afraid I didn't see until now. Are you still there? What did you end up doing in sile-typesetter?

I fear this also affects LuaTeX, where I think the non-thread-safe solution in the last answer on https://stackoverflow.com/questions/1994658 (one of the linked questions) would not be viable. Which is otherwise the approach that occurs to me. --thanks, karl.

simoncozens commented 5 months ago

Hi Karl. I don't think we fixed it yet, unfortunately.

kberry commented 5 months ago

In https://stackoverflow.com/questions/41794607 I see the suggestion to use the *_l functions which are said to be in BSD/GNU/Linux and with equivalents on Windows. Using them if available and falling back to the locale-aware strtod seems like a decent, if not perfect, workaround to me. Especially since the original bug is triggered (very) rarely.

I'm guessing none of us want to reimplement the actual parsing (I sure don't). Temporarily setting the locale to C has the thread problem already mentioned. I'm not seeing a better alternative. wdyt?

ebraminio commented 5 days ago

In harfbuzz after lots of playing around those locale specific libc functions, we went with having the parser https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-number-parser.rl specially as we actually had the infrastructure to ease that up, Ragel, to generate the actual C source, https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-number-parser.hh maybe that changes in the future but I actually believe it's better to not rely on such a breakable thing for that low level thing like font parsing but I guess your approach on trying locale specific functions also can be good.

kberry commented 4 days ago

Thanks for the note and code pointers. I agree it would be better in principle to have "native" parsing than relying on libc fns (I'm glad you implemented that for harfbuzz, which is a far more important case :). It might be more or less the same amount of work to reuse libc vs. implementing from scratch, depending on exactly what has to be parsed for xetex. Maybe someday I will have a chance to look into it, or someone else will come across this bug and be inspired to write a patch ... hoping ... --thanks again, karl.