Open edo9300 opened 3 months ago
It's probably more efficient to use our own routines to convert between modified UTF-8 and standard UTF-8. @icculus, thoughts?
While this is technically correct, I did an audit of the call sites of these functions and none of them would have embedded nulls or characters outside the Basic Multilingual Plane, which would encode differently from standard UTF-8.
I'm going to move this out of the 3.2.0 milestone and we can revisit this later if it becomes an issue.
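For reference, the condition checked in that audit can be written as a small helper. This is only a sketch: `DiffersInModifiedUtf8` is a hypothetical name, not an SDL function, and it assumes the input is already well-formed standard UTF-8.

```cpp
#include <cstddef>

// Hypothetical helper (not part of SDL). Returns true if a standard UTF-8
// buffer contains anything that Modified UTF-8 encodes differently:
//   - an embedded NUL byte (Modified UTF-8 writes U+0000 as C0 80), or
//   - a 4-byte sequence, i.e. a lead byte in 0xF0..0xF7, which is how
//     code points outside the Basic Multilingual Plane are encoded.
// Assumes well-formed input; no validation is performed.
bool DiffersInModifiedUtf8(const char *str, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i) {
        const unsigned char c = static_cast<unsigned char>(str[i]);
        if (c == 0x00 || (c & 0xF8) == 0xF0) {
            return true;
        }
    }
    return false;
}
```

Strings for which this returns false are byte-for-byte identical in both encodings, so passing them to NewStringUTF should be safe.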
Well, it's used in places like getting/setting the clipboard, and it wouldn't be uncommon for emoji to be present there, especially on Android, in which case you'd have characters outside the BMP. I agree it's not a very pressing issue, though, since at worst you'll only get invalid characters. Anyway, since opening this issue I found a bug in the function for creating a Java string posted above, so I reworked it to use a different approach to create the new string, which is also much simpler to use:
jstring NewJavaString(JNIEnv* env, const char* str, size_t len) {
    // Copy the raw UTF-8 bytes into a Java byte[]
    jbyteArray jBuff = env->NewByteArray(static_cast<jsize>(len));
    env->SetByteArrayRegion(jBuff, 0, static_cast<jsize>(len), reinterpret_cast<const jbyte*>(str));
    jclass cls_String = env->FindClass("java/lang/String");
    // Constructor signature: String(byte[] bytes, String charsetName)
    jmethodID String_new = env->GetMethodID(cls_String, "<init>", "([BLjava/lang/String;)V");
    jstring UTF8_STRING = env->NewStringUTF("UTF-8");
    // calls: String ret = new String(jBuff, "UTF-8")
    jstring ret = static_cast<jstring>(env->NewObject(cls_String, String_new, jBuff, UTF8_STRING));
    // Release the temporary local references
    env->DeleteLocalRef(jBuff);
    env->DeleteLocalRef(UTF8_STRING);
    env->DeleteLocalRef(cls_String);
    return ret;
}
I haven't investigated this, but we might be able to use iconv on Android to do this for us:
#define JavaToUTF8(S) SDL_iconv_string("UTF-8", "JAVA", (const char *)(S), SDL_strlen(S) + 1)
#define UTF8ToJava(S) SDL_iconv_string("JAVA", "UTF-8", (const char *)(S), SDL_strlen(S) + 1)
Although that probably encodes the null byte at the end as 0xC0,0x80, so we'd need to track the length separately.
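If the iconv route doesn't pan out, a hand-written decoder that takes an explicit length would sidestep the terminator question. The following is a rough sketch, not SDL code: `ModifiedUtf8ToUtf8` and `ReadThreeByteUnit` are hypothetical names, and the input is assumed to be well-formed Modified UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decodes one 3-byte sequence (1110xxxx 10xxxxxx 10xxxxxx)
// into a 16-bit unit. Returns false if the bytes do not have that shape.
static bool ReadThreeByteUnit(const char *p, unsigned int &unit)
{
    const unsigned char b0 = static_cast<unsigned char>(p[0]);
    const unsigned char b1 = static_cast<unsigned char>(p[1]);
    const unsigned char b2 = static_cast<unsigned char>(p[2]);
    if ((b0 & 0xF0) != 0xE0 || (b1 & 0xC0) != 0x80 || (b2 & 0xC0) != 0x80) {
        return false;
    }
    unit = ((b0 & 0x0Fu) << 12) | ((b1 & 0x3Fu) << 6) | (b2 & 0x3Fu);
    return true;
}

// Hypothetical sketch: Modified UTF-8 -> standard UTF-8 with an explicit length.
std::string ModifiedUtf8ToUtf8(const char *str, std::size_t len)
{
    std::string out;
    out.reserve(len);
    std::size_t i = 0;
    while (i < len) {
        const unsigned char c = static_cast<unsigned char>(str[i]);
        // C0 80 is the Modified UTF-8 spelling of U+0000.
        if (c == 0xC0 && i + 1 < len &&
            static_cast<unsigned char>(str[i + 1]) == 0x80) {
            out += '\0';
            i += 2;
            continue;
        }
        // A surrogate pair stored as two 3-byte sequences becomes one
        // 4-byte standard UTF-8 sequence.
        unsigned int hi = 0, lo = 0;
        if (i + 5 < len && ReadThreeByteUnit(str + i, hi) &&
            ReadThreeByteUnit(str + i + 3, lo) &&
            hi >= 0xD800 && hi <= 0xDBFF && lo >= 0xDC00 && lo <= 0xDFFF) {
            const unsigned int cp = 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
            i += 6;
            continue;
        }
        // Everything else is identical in both encodings.
        out += str[i];
        ++i;
    }
    return out;
}
```

Because the result is a `std::string` with its own length, an embedded NUL in the decoded text does not truncate it.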
In both SDL_android.c and hid.cpp, SDL uses those two functions to convert from and to jstring, but the way they're used is incorrect. These functions work with the Modified UTF-8 encoding which, while almost entirely compatible with standard UTF-8, has some differences that can make the routines reject valid UTF-8 strings. A proper implementation could use Charset conversion, explicitly converting strings from/to UTF-8 byte arrays. Below are possible implementations of those conversion functions that could be used in place of GetStringUTFChars/NewStringUTF.
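Separately from the JNI-based functions, the encoding differences themselves can be illustrated in plain C++. The sketch below is an illustration only: `Utf8ToModifiedUtf8` and `AppendThreeByteUnit` are hypothetical names, and the input is assumed to be well-formed standard UTF-8. It rewrites the two cases where Modified UTF-8 diverges: U+0000 and code points above U+FFFF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends a 16-bit unit as a 3-byte sequence.
static void AppendThreeByteUnit(std::string &out, unsigned int unit)
{
    out += static_cast<char>(0xE0 | (unit >> 12));
    out += static_cast<char>(0x80 | ((unit >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (unit & 0x3F));
}

// Hypothetical sketch: standard UTF-8 -> Modified UTF-8 (no validation).
std::string Utf8ToModifiedUtf8(const char *str, std::size_t len)
{
    std::string out;
    out.reserve(len);
    std::size_t i = 0;
    while (i < len) {
        const unsigned char c = static_cast<unsigned char>(str[i]);
        if (c == 0x00) {
            // U+0000 is written as the two-byte form C0 80.
            out += "\xC0\x80";
            i += 1;
        } else if ((c & 0xF8) == 0xF0 && i + 3 < len) {
            // 4-byte sequence: decode the code point, split it into a
            // UTF-16 surrogate pair, and encode each half as 3 bytes.
            const unsigned char b1 = static_cast<unsigned char>(str[i + 1]);
            const unsigned char b2 = static_cast<unsigned char>(str[i + 2]);
            const unsigned char b3 = static_cast<unsigned char>(str[i + 3]);
            unsigned int cp = ((c & 0x07u) << 18) | ((b1 & 0x3Fu) << 12) |
                              ((b2 & 0x3Fu) << 6) | (b3 & 0x3Fu);
            cp -= 0x10000;
            AppendThreeByteUnit(out, 0xD800 + (cp >> 10));   // high surrogate
            AppendThreeByteUnit(out, 0xDC00 + (cp & 0x3FF)); // low surrogate
            i += 4;
        } else {
            // 1-, 2- and 3-byte sequences are identical in both encodings.
            out += str[i];
            i += 1;
        }
    }
    return out;
}
```

For example, U+1F600 (F0 9F 98 80 in standard UTF-8) comes out as the six bytes ED A0 BD ED B8 80, which is why a string containing emoji cannot be handed to NewStringUTF unchanged.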