sheredom / utf8.h

📚 single header utf8 string functions for C and C++
The Unlicense

utf8ncat - size wraparound bug #97

Closed by LutzenH 2 years ago

LutzenH commented 2 years ago

Hello 👋! I think I found a small bug in utf8ncat: when it is called with n (a size_t) equal to 0, it still appends all of the bytes of src to the dst buffer.

For example:

utf8_int8_t dst[12] = { 'h', 'e', 'l', 'l', 'o', '\0' };
const utf8_int8_t* src = "world";
utf8ncat(dst, src, 0);

// dst will be { 'h', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd', '\0', '\0' };

If I am not mistaken, this happens because size_t is unsigned, so the following --n wraps around to the largest size_t value when n is 0:

https://github.com/sheredom/utf8.h/blob/89f6a439f7e0acf5b07ecb924911ea74de63c1ce/utf8.h#L631

I presume this is not the intended behavior and that this is a bug.
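
To make the wraparound concrete, here is a minimal sketch of the failure mode. It is not the utf8.h source: ncat_wrapping is a hypothetical stand-in that works on plain char and only assumes a loop that decrements n after each copied byte and stops when the result is 0.

```c
#include <stddef.h>

/* Hypothetical stand-in, NOT the utf8.h implementation: appends at most n
 * bytes of src to dst, but only checks the limit after decrementing it. */
char *ncat_wrapping(char *dst, const char *src, size_t n) {
  char *d = dst;

  /* Find the terminating null byte in dst. */
  while ('\0' != *d) {
    d++;
  }

  /* Copy byte by byte. If n starts at 0, the first --n wraps around to
   * SIZE_MAX, so the limit check below effectively never fires. */
  while ('\0' != *src) {
    *d++ = *src++;
    if (0 == --n) {
      break;
    }
  }

  /* Write the new terminating null byte. */
  *d = '\0';
  return dst;
}
```

Under these assumptions, calling ncat_wrapping(dst, "world", 0) on a buffer holding "hello" leaves "helloworld" in dst, while n = 2 gives "hellowo" as expected.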

sheredom commented 2 years ago

Good find! Hopefully fixed in #98.
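
The change made in #98 is not shown in this thread. As a rough sketch of one way such a wraparound could be avoided, the loop can test n before copying and before decrementing. ncat_guarded is a hypothetical name and works on plain char; it is not the library's actual fix.

```c
#include <stddef.h>

/* Hypothetical sketch, NOT the code from #98: appends at most n bytes of
 * src to dst and copies nothing when n is 0. */
char *ncat_guarded(char *dst, const char *src, size_t n) {
  char *d = dst;

  /* Find the terminating null byte in dst. */
  while ('\0' != *d) {
    d++;
  }

  /* Check the remaining budget first, so n is never decremented below 0. */
  while ((0 != n) && ('\0' != *src)) {
    *d++ = *src++;
    n--;
  }

  /* Write the new terminating null byte. */
  *d = '\0';
  return dst;
}
```

With this shape, n = 0 leaves "hello" untouched, and any n larger than the length of src simply appends all of src.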