Closed lukewilliamboswell closed 8 months ago
@lukewilliamboswell Just checking - should I hold off on review until the tests are passing? (I saw in the description you mentioned the TODOs, but I wanted to check!)
Thank you for clarifying. I think those changes will be better suited to another PR. I suspect it is going to be a challenge: at the very least I need to learn a lot more about emoji first, and we may need to change the approach or algorithm to do it. If you have feedback on these changes, that would be most appreciated, thank you.
Update on this PR: I've rewritten the script for generating the test suite, currently called `GraphemeTestGen2.roc`. Now I can filter tests to include or exclude based on the rules (or capabilities) they are testing. This is a significant improvement, as I can now see where the gaps in the implementation are and progressively improve support for the text segmentation rules.
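As a rough illustration of what filtering tests by rule can look like (this is not the code in `GraphemeTestGen2.roc`, and it is written in Python rather than Roc), the sketch below assumes the test lines follow the layout of Unicode's `GraphemeBreakTest.txt`, where the trailing comment tags each boundary decision with a rule number in brackets such as `[9.0]`. The sample lines and the names `rules_in` and `filter_tests` are hypothetical.

```python
import re

# Sample lines in the style of Unicode's GraphemeBreakTest.txt (assumed layout):
# "÷" marks a break, "×" marks no break, and the comment after "#" tags each
# decision with the rule that produced it, e.g. [9.0] for GB9.
SAMPLE_LINES = [
    "÷ 0020 ÷ 0020 ÷\t#  ÷ [0.2] SPACE (Other) ÷ [999.0] SPACE (Other) ÷ [0.3]",
    "÷ 0020 × 0308 ÷\t#  ÷ [0.2] SPACE (Other) × [9.0] COMBINING DIAERESIS (Extend) ÷ [0.3]",
    "÷ 000D × 000A ÷\t#  ÷ [0.2] <CARRIAGE RETURN (CR)> (CR) × [3.0] <LINE FEED (LF)> (LF) ÷ [0.3]",
]

# Matches a bracketed rule number such as [9.0] or [999.0].
RULE_RE = re.compile(r"\[(\d+\.\d+)\]")


def rules_in(line):
    """Return the set of rule numbers mentioned in a test line's comment."""
    _, _, comment = line.partition("#")
    return set(RULE_RE.findall(comment))


def filter_tests(lines, include=None, exclude=()):
    """Keep lines that use at least one rule in `include` (if given)
    and none of the rules in `exclude`."""
    kept = []
    for line in lines:
        rules = rules_in(line)
        if include is not None and not rules & set(include):
            continue
        if rules & set(exclude):
            continue
        kept.append(line)
    return kept
```

For example, `filter_tests(SAMPLE_LINES, exclude=["9.0"])` drops every case that depends on rule GB9, which makes it possible to run only the cases an incomplete implementation is expected to pass.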
I've also started on a new implementation of the algorithm for text segmentation, currently called `Grapheme2.roc`.
This PR;
`roc check`, `roc test`, `roc build`, and `roc docs`
NOTE: implementation of Extended Grapheme Cluster requires the implementation of rules `GB9a`, `GB9b`, `GB9c`, which are left for a future PR.
Run Generation Scripts
To re-generate the generated files you can use
bash rebuild.sh
Tests
To run the tests for Grapheme test suite use
roc test package/GraphemeTest.roc
Examples
I tried to include an additional example that used `Grapheme.split`, but there are significant compiler bugs that prevented me from including it with this PR.
Here is a demo from the tests showing the function in use.