tin-pot closed this issue 7 years ago
Thanks for the report.
Grrr. I will have to put together some kind of test suite covering UTF-16 soon to avoid this kind of issue. The problem is that the CommonMark test suite I currently reuse is not ready for it at all.
I didn't find this one by testing, either. The only reason I spotted it just by reading the code is that I once made the exact same error, and I remember painfully well how much time I wasted staring at operator precedence, sign extension in sub-expressions, etc. until the 0x10000 shift dawned on me - if that is any consolation ;-)
The pertinent macros to detect and decode UTF-16 surrogate code units in `md4c.c` are (7d20152c39dbf094a774bbf34a808bf689dd2b6a):

The constant `0xfc` in the first two lines should read `0xfc00`.
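To illustrate why the mask matters, here is a minimal sketch. The function names are hypothetical and are not the macros from `md4c.c`; the broken variant assumes the masked value is compared against `0xd800`, which is the start of the high-surrogate range in UTF-16.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not the macros from md4c.c. */

/* Broken (assumed shape of the bug): (u & 0xfc) is at most 0xfc,
 * so it can never equal 0xd800 and the test is always false. */
static inline int sketch_is_surrogate_hi_broken(uint16_t u)
{
    return (u & 0xfc) == 0xd800;
}

/* Masking with 0xfc00 keeps the top six bits, which identify the range. */
static inline int sketch_is_surrogate_hi(uint16_t u)   /* 0xD800..0xDBFF */
{
    return (u & 0xfc00) == 0xd800;
}

static inline int sketch_is_surrogate_lo(uint16_t u)   /* 0xDC00..0xDFFF */
{
    return (u & 0xfc00) == 0xdc00;
}
```

With the narrow mask no code unit is ever recognised as a surrogate, so surrogate pairs would presumably be treated as two independent characters.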
The cast to `WORD` seems pointless: the actual argument to these macros is always a `wchar_t` expression, which (in MSC) is promoted to 32-bit `int` without sign extension. (Furthermore, `WORD` is defined in `<windows.h>`, and is currently the only name used from there ...!) Thus the defining expression could be:

The expression to compose the Unicode code point from the two 10-bit fragments omits the bias value `0x10000`; it should read:

or - using only required parentheses -:
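For reference, a minimal sketch of the decoding arithmetic with the bias applied. The function names are hypothetical, and this is written as inline functions rather than macros, so it is not the proposed `md4c.c` code, only an illustration of the same formula.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not the macros from md4c.c. */

/* Missing bias: combines the two 10-bit fragments but forgets 0x10000,
 * so every result lands in 0x0000..0xFFFFF instead of 0x10000..0x10FFFF. */
static inline uint32_t sketch_decode_pair_no_bias(uint16_t hi, uint16_t lo)
{
    return (((uint32_t)hi & 0x3ffu) << 10) | ((uint32_t)lo & 0x3ffu);
}

/* Correct: the 20-bit value from the pair is an offset from U+10000. */
static inline uint32_t sketch_decode_pair(uint16_t hi, uint16_t lo)
{
    return 0x10000u + sketch_decode_pair_no_bias(hi, lo);
}
```

For example, the pair `0xD83D 0xDE00` encodes U+1F600; without the bias the same pair yields `0xF600`.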