Closed golddranks closed 5 years ago
I was considering implementing the fix with a wrapping add, but there's a problem with that: `code_point` is a `u16` and `id_delta` is an `i16`, and `wrapping_add` in the stdlib isn't defined between those. In this corner case, the theoretical range of neither type is a superset of the other (`code_point` might be over 2^15 − 1 and `id_delta` might be negative), so one can't safely convert without widening.
I'll see if I can send some PRs to extend the code base, since I might be needing that functionality.
(Btw, I signed the Google Individual Contributor License Agreement.)
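To make the corner case concrete, here is a minimal sketch of the widening approach (the `glyph_delta` helper name is hypothetical, not from the actual code base): widen both operands to `i32`, where the sum can't overflow, then truncate back to 16 bits, matching the modulo-65536 arithmetic of cmap format 4.

```rust
// Hypothetical helper illustrating the widening approach discussed above.
fn glyph_delta(code_point: u16, id_delta: i16) -> u16 {
    // Widen both to i32 so the addition cannot overflow, then
    // truncate back to 16 bits (format 4 arithmetic is modulo 65536).
    ((code_point as i32 + id_delta as i32) & 0xFFFF) as u16
}

fn main() {
    // code_point above 2^15 - 1 combined with a negative id_delta:
    assert_eq!(glyph_delta(0x9000, -16), 0x8FF0);
    // Wrap-around past 0xFFFF:
    assert_eq!(glyph_delta(0xFFFF, 1), 0);
    println!("ok");
}
```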
Inside `EncodingFormat4`, `code_point` is limited to 16-bit values. I believe this computation might wrap, and that's the behavior specified by the standard. Thus, I think keeping a 16-bit type here is appropriate.
When `EncodingFormat12` is implemented, its representation of code point should definitely be a 32-bit value.
I'm happy to answer questions. Also I know @rtsuk has looked at this, I believe he's interested in correct rendering of (b/w) emoji.
Lastly, the Google CLA is not needed anymore, the ownership of this repo has now passed to me. I should probably update the READMEs and so on. But thanks!
So, I don't think there is a "wrapping conversion" between `i16` and `u16` in the stdlib, so I was forced to define one with a `transmute`. I feel like this is an omission in the stdlib, and it feels bad to introduce unsafe code for a thing like this. What do you think about the newest commit?
Are you sure the `as` panics? I just checked in the playground and it doesn't. The nomicon says it's a "no-op", which is maddeningly vague; I think they mean the bit representation doesn't change, which is what I expect.
There's another way to write this which is maybe more explicit about the intent: `((code_point as i32) + (id_delta as i32) & 0xffff) as u16`. But I think just `wrapping_add` should work.
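If it helps, the two formulations can be checked against each other; a quick sketch (the test values are arbitrary):

```rust
fn main() {
    // Arbitrary (code_point, id_delta) pairs, including the
    // problematic high-code-point / negative-delta combination.
    let cases: &[(u16, i16)] = &[(0x9000, -16), (0x0041, 100), (0xFFFF, 1), (0, -1)];
    for &(code_point, id_delta) in cases {
        // Explicit widening + mask:
        let masked = ((code_point as i32 + id_delta as i32) & 0xFFFF) as u16;
        // Bit-reinterpreting cast + wrapping_add:
        let wrapped = code_point.wrapping_add(id_delta as u16);
        assert_eq!(masked, wrapped);
    }
    println!("all equal");
}
```

Both produce the same modulo-65536 result, so the `wrapping_add` form is just the shorter spelling of the same arithmetic.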
Huh, I thought it would panic if it gets out of range in an `as` conversion. Apparently it doesn't!
Do you want me to squash?
And sorry about wasting your time; my mental model of how `as` works was wrong. I thought it would use the same logic as the other arithmetic operations (panic on overflow in debug mode).
No worries, it's good to sweat the details sometimes. I'll squash on merge as soon as CI is green. Thanks!
I'm pretty sure this was my error. Thanks for fixing it!
Hi, I encountered a bug where it crashes because of an overflow with code points larger than 2^15 on conversion from `u16` to `i16`. Big code points are pretty common with Japanese kanji, so it's a serious bug.
I'm not aware of the file-type details, but generally Unicode code points are 21 bits wide, so it strikes me as odd to use `u16` for those; maybe it's a limitation of TrueType? If it isn't, and code points outside the Basic Multilingual Plane just aren't supported yet, I can try and help refactor the code base.