sheredom / utf8.h

📚 single header utf8 string functions for C and C++
The Unlicense

How to get codepoint of first character in iteration? #44

Closed felselva closed 6 years ago

felselva commented 6 years ago

This code follows the example from PR https://github.com/sheredom/utf8.h/pull/21 to iterate over a string. However, utf8codepoint seems to be for getting the pointer and the codepoint of the next character. How can I get the codepoint of the first character?

utf8_char = utf8codepoint(utf8_string, &codepoint);
while (codepoint != '\0') {
    this_char = malloc(utf8codepointsize(codepoint) + 1);
    memset(this_char, 0, utf8codepointsize(codepoint) + 1);
    memcpy(this_char, utf8_char, utf8codepointsize(codepoint));
    printf("This char: %s\n", this_char);
    utf8_char = utf8codepoint(utf8_char, &codepoint);
}

sheredom commented 6 years ago

Hey there!

So I've worded the documentation for utf8codepoint poorly: the out_codepoint parameter receives the CURRENT codepoint in the string (e.g. the codepoint that starts at utf8_string in your example above), and the pointer returned from the function points to the next location in utf8_string that contains a codepoint.

Is my explanation good enough, or do you want me to try again? If so I'll update the documentation to be clearer!
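
To make the "current codepoint, next pointer" behaviour concrete, here is a minimal sketch of the loop from the question with the two kept apart. So that it compiles without utf8.h, it uses decode_current, a hypothetical stand-in written only for this example; it is not the library's implementation and assumes valid, NUL-terminated UTF-8. print_chars is likewise a made-up name. The key point is that the bytes of a character start at the pointer that was passed in, not at the pointer that was returned.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for utf8codepoint, assumed to mirror the semantics
   described above: *out_codepoint receives the codepoint that starts at s,
   and the return value points at the following codepoint.
   Assumes valid, NUL-terminated UTF-8; performs no error checking. */
static const char *decode_current(const char *s, int *out_codepoint) {
    const unsigned char *p = (const unsigned char *)s;
    if ((p[0] & 0xF8) == 0xF0) { /* 4-byte sequence */
        *out_codepoint = ((p[0] & 0x07) << 18) | ((p[1] & 0x3F) << 12) |
                         ((p[2] & 0x3F) << 6) | (p[3] & 0x3F);
        return s + 4;
    }
    if ((p[0] & 0xF0) == 0xE0) { /* 3-byte sequence */
        *out_codepoint = ((p[0] & 0x0F) << 12) | ((p[1] & 0x3F) << 6) |
                         (p[2] & 0x3F);
        return s + 3;
    }
    if ((p[0] & 0xE0) == 0xC0) { /* 2-byte sequence */
        *out_codepoint = ((p[0] & 0x1F) << 6) | (p[1] & 0x3F);
        return s + 2;
    }
    *out_codepoint = p[0]; /* single byte (ASCII or the terminating NUL) */
    return s + 1;
}

/* Prints every character of str, first one included, and returns how many
   codepoints were visited. */
static size_t print_chars(const char *str) {
    const char *current = str; /* start of the character being decoded */
    int codepoint;
    size_t count = 0;
    const char *next = decode_current(current, &codepoint);
    while (codepoint != 0) {
        /* The character's bytes lie in [current, next). */
        printf("This char: %.*s\n", (int)(next - current), current);
        count++;
        current = next;
        next = decode_current(current, &codepoint);
    }
    return count;
}
```

Calling print_chars on a string visits the first character like any other. Printing with %.*s is a choice made for this sketch: it avoids the malloc/memset/memcpy per character in the original loop, which never freed its buffers. With the real library, the same current/next bookkeeping should apply to the pointer passed to and returned from utf8codepoint.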

felselva commented 6 years ago

Oh, thanks, that solves the problem!