Open ghost opened 1 year ago
So `utf8upr` and `utf8lwr` rely on the fact that the only codepoints we currently support for them are all symmetrically sized: their replacements are the same size as the originals. If that ever changed we'd be scunnered!
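To make the "symmetrically sized" assumption concrete, here is a minimal sketch of how one might check it for a case pair. The helpers `codepoint_size` and `same_encoded_size` are hypothetical names for illustration and are not part of utf8.h; the sketch also ignores the surrogate range for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of utf8.h): number of bytes needed to
   encode `cp` in UTF-8, or 0 if `cp` is above U+10FFFF. */
static size_t codepoint_size(uint32_t cp) {
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    if (cp < 0x110000) return 4;
    return 0;
}

/* Non-zero when `to` can overwrite `from` in place without changing
   the byte length of the string. */
static int same_encoded_size(uint32_t from, uint32_t to) {
    size_t n = codepoint_size(from);
    return n != 0 && n == codepoint_size(to);
}
```

For example, U+00E9 (é) and U+00C9 (É) both take two bytes, so an in-place swap is safe. U+023A (Ⱥ) takes two bytes while its lowercase form U+2C65 (ⱥ) takes three, so a pair like that would break the assumption if it were ever added.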
@sheredom thanks for the response. Is this documented anywhere? If not, it definitely should.
Also, what happens with the `size` argument to `utf8catcodepoint`? Is it correct that we pass the size of the new codepoint instead of the buffer's?
It isn't documented, so I'll do a PR. I think the size is fine only because, for all our replacements, the original and the new codepoint are the same size!
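As a rough illustration of why the size argument matters, the sketch below writes a codepoint only if it fits in the bytes remaining. `put_codepoint` is a hypothetical stand-in written for this example, not the library's `utf8catcodepoint`, and it does not validate surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounded encoder: writes `cp` as UTF-8 at `dst` if it fits
   in `remaining` bytes. Returns a pointer just past the written bytes,
   or NULL if there is not enough room or `cp` is above U+10FFFF. */
static unsigned char *put_codepoint(unsigned char *dst, uint32_t cp,
                                    size_t remaining) {
    if (cp < 0x80) {
        if (remaining < 1) return NULL;
        dst[0] = (unsigned char)cp;
        return dst + 1;
    }
    if (cp < 0x800) {
        if (remaining < 2) return NULL;
        dst[0] = (unsigned char)(0xC0 | (cp >> 6));
        dst[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return dst + 2;
    }
    if (cp < 0x10000) {
        if (remaining < 3) return NULL;
        dst[0] = (unsigned char)(0xE0 | (cp >> 12));
        dst[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        dst[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return dst + 3;
    }
    if (cp < 0x110000) {
        if (remaining < 4) return NULL;
        dst[0] = (unsigned char)(0xF0 | (cp >> 18));
        dst[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        dst[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        dst[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return dst + 4;
    }
    return NULL;
}
```

If the caller passes the size of the new codepoint as `remaining`, the bounds check always passes, so it protects nothing. That is harmless only while the replacement occupies exactly the same bytes as the original; passing the space actually left in the buffer would let the check catch a replacement that is larger.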
Hi, I was looking at the docs for `utf8upr`/`lwr`, and they don't seem to indicate what happens if the string passed to them doesn't have enough space for the new codepoints. I understand that letters may have different byte sizes in their upper/lowercase variants, so I was wondering whether `utf8upr`/`lwr` will allocate extra memory as required.

Looking at the code, though, it seems like they just call `utf8catcodepoint`, which AFAIK doesn't allocate additional memory. In fact, the `size` argument in that call is set to the size of the new codepoint, rather than the size of the buffer as it should be. Is this correct?