automerge / automerge-classic

A JSON-like data structure (a CRDT) that can be modified concurrently by different users, and merged again automatically.
http://automerge.org/
MIT License
14.75k stars, 466 forks

Support creating text with unicode chars #334

Closed: nemanja-tosic closed this 3 years ago

nemanja-tosic commented 3 years ago

Creating a text with a Unicode character results in the Unicode character being split. When used with the Rust backend, this results in an ugly runtime error.

See: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split

There is a warning in there:

Warning: When the empty string ("") is used as a separator, the string is not split by user-perceived characters (grapheme clusters) or unicode characters (codepoints), but by UTF-16 codeunits. This destroys surrogate pairs. See “How do you get a string to a character array in JavaScript?” on StackOverflow.
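To illustrate the behaviour the warning describes, here is a minimal sketch in plain JavaScript. It does not use Automerge at all; `splitByCodeUnit` and `splitByCodePoint` are hypothetical helper names chosen for this example, and it is not necessarily how the fix in this PR is implemented.

```javascript
// Illustrative only: these helpers are hypothetical and are not part of
// the Automerge codebase.
const sample = 'a😀b'; // the emoji is U+1F600, which needs a surrogate pair in UTF-16

// split('') cuts between UTF-16 code units, so the emoji ends up as
// two separate lone surrogates.
function splitByCodeUnit(str) {
  return str.split('');
}

// Array.from uses the string iterator, which walks code points and
// therefore keeps each surrogate pair together.
function splitByCodePoint(str) {
  return Array.from(str);
}

console.log(splitByCodeUnit(sample).length);  // 4
console.log(splitByCodePoint(sample).length); // 3
```

Note that iterating by code point still does not respect grapheme clusters (for example, a base letter followed by a combining mark stays as two elements), so this only addresses the surrogate-pair problem described above.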

I did not do an extensive check of the codebase for this issue, just fixed the most critical path for us at this point.

ept commented 3 years ago

I've meant to fix this but keep forgetting. Thank you for the contribution!