Fix copied string corrupt when contain CJK glyphs

Immediate-Mode-UI / Nuklear

A single-header ANSI C immediate mode cross-platform GUI library

https://immediate-mode-ui.github.io/Nuklear/doc/index.html

Fix copied string corrupt when contain CJK glyphs #543

Open Windmill-City opened 1 year ago

dumblob commented 1 year ago

Technically this looks correct to me. But I am not sure we want to change the API from a plain "buffer length" (which we use everywhere, so it is consistent) to a UTF-8-specific length (which we rarely use, if at all).

What do others think here?

If we settle on changing the API, could you also check all other demos to be consistent?

Windmill-City commented 1 year ago

I do not mean to change the API, but it seems the API has been misused: callers pass the glyph count instead of the byte length to this method, so I had to rename the parameter 'len' to 'glyphs'.

Windmill-City commented 1 year ago

Also, working out the byte length of the UTF-8 string for every call is more complex, so I suggest simply changing the byte length to a glyph count for easier use.

Windmill-City commented 1 year ago

I noticed that -std=c89 fails the CI in the opengl3 build. Can I change it to -std=c99 to fix this, or should I adapt my code to the C89 standard?

dumblob commented 1 year ago

I noticed that -std=c89 fails the CI in the opengl3 build. Can I change it to -std=c99 to fix this, or should I adapt my code to the C89 standard?

We are totally fine with both (though we prefer C89 :wink:).

I do not mean to change the API, but it seems the API has been misused: callers pass the glyph count instead of the byte length to this method, so I had to rename the parameter 'len' to 'glyphs'.

Just for my better understanding of your use case - can you make a one-line wrapper of the existing byte-API for your users?

Also, working out the byte length of the UTF-8 string for every call is more complex, so I suggest simply changing the byte length to a glyph count for easier use.

Which language are you using Nuklear in? In most languages I know of (except for Swift and a very few others), byte-lengths are the default for (str)len methods/functions.

Windmill-City commented 1 year ago

https://github.com/Immediate-Mode-UI/Nuklear/blob/25b84d101dd0ec66792a5b3a02996d5cf172712f/src/nuklear_edit.c#L289

Just for my better understanding of your use case - can you make a one-line wrapper of the existing byte-API for your users?

I searched for usages of the copy method and found only this one. It passes the glyph count of the selected string, not the byte length.
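For illustration only (this is not code from the PR or from Nuklear): a minimal C89 sketch of why handing a glyph count to a callback that expects a byte length corrupts the copied text. `copy_bytes` is a hypothetical stand-in for a byte-length copy callback, and `count_glyphs` counts UTF-8 code points, which this thread loosely calls "glyphs"; both names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a clipboard "copy" callback that expects a
 * BYTE length. Copies at most `len` bytes into `out`, NUL-terminates,
 * and returns the number of bytes copied. */
static size_t copy_bytes(char *out, size_t out_cap, const char *text, size_t len)
{
    assert(out != NULL && text != NULL);
    if (out_cap == 0) return 0;
    if (len > out_cap - 1) len = out_cap - 1;
    memcpy(out, text, len);
    out[len] = '\0';
    return len;
}

/* Count code points by skipping UTF-8 continuation bytes (10xxxxxx).
 * Assumes `text` holds valid UTF-8. */
static size_t count_glyphs(const char *text, size_t byte_len)
{
    size_t i, n = 0;
    for (i = 0; i < byte_len; ++i) {
        if (((unsigned char)text[i] & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

With the 7-byte string `"a\xE4\xBD\xA0\xE5\xA5\xBD"` (the letter `a` followed by two CJK characters), `count_glyphs` reports 3. Passing that 3 to `copy_bytes` as if it were a byte length copies `a` plus only the first two bytes of the next character, so the result ends in an incomplete UTF-8 sequence.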

Windmill-City commented 1 year ago

The byte length of a code point encoded in UTF-8 varies, so a 5-glyph string may take anywhere from 5 to 15 bytes when each glyph needs at most 3 bytes, as most CJK glyphs do (up to 20 with 4-byte code points). You therefore need to work out the byte length of the substring for every call to the copy method. Isn't it more complex to introduce another substring method for UTF-8 strings?
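As a rough sketch of the conversion discussed above, the following C89 helper walks a UTF-8 buffer and returns how many bytes the first `glyphs` code points occupy. Both function names are hypothetical and not part of Nuklear's API; the code assumes mostly well-formed UTF-8 and treats an unexpected lead byte as a single byte so that it always makes progress.

```c
#include <assert.h>
#include <stddef.h>

/* Length in bytes of one UTF-8 sequence, judged from its lead byte.
 * Returns 1 for invalid lead bytes so callers always advance. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;            /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx: most CJK */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;
}

/* Number of bytes occupied by the first `glyphs` code points of `text`,
 * never counting past `byte_len`. Hypothetical helper for illustration. */
static size_t utf8_glyphs_to_bytes(const char *text, size_t byte_len, size_t glyphs)
{
    size_t pos = 0;
    assert(text != NULL || byte_len == 0);
    while (glyphs > 0 && pos < byte_len) {
        size_t n = utf8_seq_len((unsigned char)text[pos]);
        if (n > byte_len - pos) n = byte_len - pos; /* truncated tail */
        pos += n;
        --glyphs;
    }
    return pos;
}
```

A thin wrapper over a byte-length API could call `utf8_glyphs_to_bytes` on the selection start and pass the result as the length, which keeps the existing byte-based signature while letting callers think in glyph counts. Whether that belongs in the library or in user code is the open question in this thread.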