TwilioDevEd / message-segment-calculator

JS-based tool to calculate and display message segmentation data for a given message body
https://twiliodeved.github.io/message-segment-calculator/
MIT License
85 stars 35 forks source link

Does not handle combining characters correctly #17

Closed mwchatten closed 2 years ago

mwchatten commented 2 years ago

Awesome tool. Managed to find one issue:

If you enter a character built with combining characters, the tool will correctly recognize that the message must be UCS2, and correctly count the number of unicode scalars, but will not give a correct message size or segment count (if you happen to be on the border of a new segment).

Example string: "é́́aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

This has an "e" with 3 accents added to it. (it does not display properly in GitHub, but it will if you paste it into the tool) There are 70 characters, and 73 unicode scalars. If you actually send this message via Twilio (or any of its competitors I've tested), it will take 2 segments.

However, this tool will claim it will only take 1 segment, and claim the message size is only 1120 bits. It will also only show the code point for the plain letter "e" in the "segments" section below the entry field.

I think the tool should be updated to handle combining characters the same way Twilio actually does in practice. The triple-accented "e" is a bit of an absurd example, but single accent combined characters are a plausible use case, even if there are usually single code point alternatives. (ex. "é" (two code points) vs "é" (one code point))

vernig commented 2 years ago

Hey @mchatweave , thanks for reporting this issue! We didn't consider combining characters when we designed this tool, so thanks for bring that up. Let me do some debugging and get back to you on this. I'm a bit busy in the next couple of days, but I'll definitely give and update by end of the week.

vernig commented 2 years ago

@mchatweave I implemented a fix for this issue. Feel free to have a look and contribute to the PR

mwchatten commented 2 years ago

Awesome! Looks functional to me, I left a comment suggesting what I think is an improvement to one of your tests (my own fault, you're just using the test I suggested above, but I came up with a better one in the interim), but functionally it seems fine.

Disclaimer: have not personally functionally tested, couldn't get this library working on my personal machine.