starwing / luautf8

a utf-8 support module for Lua and LuaJIT.
MIT License
406 stars, 67 forks

Add new 'grapheme_indices' function #47

Closed: alexdowad closed this 11 months ago

alexdowad commented 11 months ago

This new function allows luautf8 to be used for Unicode grapheme segmentation. Discussion is in #46.

Thanks to @Erutuon for suggesting the API for the new function.

I think I've covered all my bases: documentation, tests, etc.

alexdowad commented 11 months ago

Just added another test case to test.lua.

starwing commented 11 months ago

@alexdowad are there any questions about the comments?

alexdowad commented 11 months ago

> @alexdowad are there any questions about the comments?

I believe I'm done on this one, unless you notice anything when reviewing which needs to be improved.

starwing commented 11 months ago

@alexdowad there are two comments on the code; please see whether they need to be fixed.

alexdowad commented 11 months ago

> @alexdowad there are two comments on the code; please see whether they need to be fixed.

OK, I'm sorry, @starwing, I didn't understand what you meant in your previous post.

Will look at your comments. Thanks very much.

alexdowad commented 11 months ago

@starwing, I can't find your comments. Could you please share them again?

starwing commented 11 months ago

@alexdowad here: [image]

alexdowad commented 11 months ago

Thanks so much, just looking into those good comments...

alexdowad commented 11 months ago

I have amended the code to use byte_relat and to throw errors on invalid values of the start and end arguments.
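For readers unfamiliar with this kind of argument handling, here is a minimal C sketch of the idea. It is not the code from this PR: `rel_to_abs` and `check_range` are hypothetical names, the negative-position convention is assumed to match Lua's string library (where -1 means the last byte), and the validity policy is an assumption chosen for illustration. The real binding would raise a Lua error where this sketch returns 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a byte_relat-style helper: converts a
 * possibly negative 1-based byte position into an absolute one,
 * assuming the convention of Lua's string library (-1 is the last byte).
 * Positions that fall before the start of the string collapse to 0. */
static long rel_to_abs(long pos, size_t len) {
    if (pos >= 0) return pos;
    if ((size_t)0 - (size_t)pos > len) return 0;  /* too far back */
    return (long)len + pos + 1;
}

/* Hypothetical range check (assumed policy, not taken from the PR):
 * the range is accepted when 1 <= start <= len + 1 and end <= len.
 * Returns 1 and stores the absolute bounds on success, 0 on failure. */
static int check_range(long start, long end, size_t len,
                       size_t *out_start, size_t *out_end) {
    assert(out_start != NULL && out_end != NULL);
    long s = rel_to_abs(start, len);
    long e = rel_to_abs(end, len);   /* never negative after conversion */
    if (s < 1 || (size_t)s > len + 1) return 0;
    if ((size_t)e > len) return 0;
    *out_start = (size_t)s;
    *out_end = (size_t)e;
    return 1;
}
```

Under these assumptions, a start of 1 with an end of -1 covers a whole 5-byte string, while a start of 0 or an end of 6 is rejected rather than silently clamped.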

alexdowad commented 11 months ago

@starwing Your comment about the code which handles sequences of Indic "extend", "linker", and "consonant" codepoints was also correct. I have amended the code accordingly.
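The thread does not show the code in question. As background, the sketch below assumes the discussion concerns rule GB9c of UAX #29, which forbids a grapheme break before a consonant when it is preceded by a consonant followed by a run of "extend" and "linker" codepoints containing at least one linker. The enum and function names are hypothetical, and this backward scan is only one possible way to express the rule, not the approach taken in luautf8.

```c
#include <assert.h>
#include <stddef.h>

/* Indic_Conjunct_Break classes relevant to rule GB9c.
 * This enum is hypothetical and exists only for the sketch. */
typedef enum { INCB_NONE, INCB_CONSONANT, INCB_EXTEND, INCB_LINKER } incb_t;

/* Returns 1 if GB9c forbids a break before cls[i]: cls[i] must be a
 * consonant, and scanning backwards we must cross only extend/linker
 * codepoints (at least one of them a linker) before hitting a consonant. */
static int gb9c_no_break_before(const incb_t *cls, size_t n, size_t i) {
    assert(cls != NULL && i < n);
    if (cls[i] != INCB_CONSONANT) return 0;

    int seen_linker = 0;
    size_t j = i;
    while (j > 0) {
        incb_t c = cls[--j];
        if (c == INCB_LINKER) {
            seen_linker = 1;          /* the rule needs at least one linker */
        } else if (c == INCB_EXTEND) {
            continue;                 /* extends may appear anywhere in the run */
        } else {
            /* The run ended: it must start at a consonant. */
            return c == INCB_CONSONANT && seen_linker;
        }
    }
    return 0;                         /* reached the start without a consonant */
}
```

For example, the Devanagari sequence U+0915, U+094D, U+0937 classifies as consonant, linker, consonant, so the function reports no break before the final consonant and the three codepoints stay in one cluster.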

starwing commented 11 months ago

Thanks for contributing! Merged 👍