sheredom / utf8.h

📚 single header utf8 string functions for C and C++
The Unlicense

A question on casting #40

Closed: MaheshVelankar closed this 6 years ago

MaheshVelankar commented 6 years ago

Well, this is not an issue, it is an elementary question/request.

You have said: "... having it as a void* forces a user to explicitly cast the utf8 string to char* such that the onus is on them not to break the code anymore!"

I am not very strong on this. Could you please give an example of how to do this cast in code, or point me to a page that has such example code?

Thanks

sheredom commented 6 years ago

So if you think about using the string.h functions in C/C++:

const char* thing = "Hello, world!";

// Find the first w character in 'thing'.
const char* w = strchr(thing, 'w');

// strchr returns NULL when nothing is found, so check before dereferencing.
if (w && 'w' == *w) {
  printf("Found w!\n");
}

If you did a straight port of this to utf8.h, you'd do:

const char* thing = "Hello, world!";

// Find the first w character in 'thing'.
// Note the extra (const char*) cast before utf8chr!
// Reading the result as a single char is only safe for ASCII:
// a non-ASCII character in 'thing' spans more than one byte.
const char* w = (const char*)utf8chr(thing, 'w');

// Like strchr, check for NULL before dereferencing.
if (w && 'w' == *w) {
  printf("Found w!\n");
}

A single utf8 codepoint can take up to 4 chars (4 bytes) of memory to store, so indexing or dereferencing a utf8 string one char at a time does not give you whole characters. That is why I've explicitly used void* everywhere.
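To make the multi-byte point concrete, here is a minimal sketch in plain C that does not use utf8.h at all. The helpers `utf8_sequence_length`, `find_encoded_codepoint` and `codepoint_equals` are hypothetical names made up for this illustration; they are not part of utf8.h, and the search simply leans on `strstr`, which is enough for valid UTF-8 input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of utf8.h): number of bytes in the UTF-8
   sequence that starts with lead byte `b`, or 0 if `b` cannot start one. */
static size_t utf8_sequence_length(unsigned char b) {
  if (b < 0x80) return 1;           /* 0xxxxxxx: plain ASCII    */
  if ((b & 0xE0) == 0xC0) return 2; /* 110xxxxx: 2-byte sequence */
  if ((b & 0xF0) == 0xE0) return 3; /* 1110xxxx: 3-byte sequence */
  if ((b & 0xF8) == 0xF0) return 4; /* 11110xxx: 4-byte sequence */
  return 0;                         /* continuation or invalid byte */
}

/* Hypothetical stand-in for a utf8chr-style search. `needle` is a
   NUL-terminated string holding one UTF-8 encoded codepoint. The result is
   a void*, so the caller has to cast before touching individual bytes. */
static void *find_encoded_codepoint(const void *haystack, const void *needle) {
  assert(haystack != NULL && needle != NULL);
  return (void *)strstr((const char *)haystack, (const char *)needle);
}

/* Compare the whole codepoint at `pos` against `needle`, not just one char.
   strncmp stops at a NUL, so it never reads past the end of either string. */
static int codepoint_equals(const void *pos, const void *needle) {
  size_t n = utf8_sequence_length(*(const unsigned char *)needle);
  return n != 0 &&
         strncmp((const char *)pos, (const char *)needle, n) == 0;
}
```

For example, in the string "na\xC3\xAFve" the character ï is stored as the two bytes 0xC3 0xAF. After casting the search result to char*, `*p` is only the first of those bytes, so a comparison like `'w' == *w` has no equivalent for ï; you have to compare every byte of the sequence, as `codepoint_equals` does.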

MaheshVelankar commented 6 years ago

Oh. That was very quick help and support.

I will use the above snippet and see how to code in other places.

Thanks a lot. -Mahesh

sheredom commented 6 years ago

Happy to help! If you have any other questions please get back in touch 😄