GoogleCodeExporter opened 8 years ago
Has anyone succeeded in fixing this issue? I tried the change suggested in
this post, but it didn't do the trick. My problem is that I need to deserialize a
file written in Lua containing Japanese characters into a .NET string, but
after calling LuaDLL::luaL_loadBuffer the Japanese characters are replaced with '?'.
I think this is due to the Marshal::StringToHGlobalAnsi calls and the fact that
char* is used everywhere (as opposed to wchar_t*).
Any ideas or suggestions?
Cheers
Original comment by simonis...@gmail.com
on 14 Jun 2011 at 10:53
You're right that using luaL_loadbuffer as-is would garble things. The
point of my original post was that you could load Lua files encoded in UTF-8,
e.g. using loadfile in a Lua script.
If you want a full round trip you will need to change luaL_loadstring and
luaL_loadbuffer as well (I think those were the only two really needed).
For example, luaL_loadbuffer would look somewhat like this:
static int luaL_loadbuffer(IntPtr luaState, String^ buff, String^ name)
{
    // Encode the managed strings as UTF-8 and pin the resulting byte
    // arrays so the unmanaged Lua API can read them in place.
    Encoding^ enc = Encoding::UTF8;
    array<Byte>^ bytesBuff = enc->GetBytes(buff);
    pin_ptr<Byte> pBuff = &bytesBuff[0];
    array<Byte>^ bytesName = enc->GetBytes(name);
    pin_ptr<Byte> pName = &bytesName[0];
    // toState resolves luaState to the underlying lua_State*
    // (LuaInterface convention). Passing bytesBuff->Length explicitly
    // means embedded NUL bytes in the UTF-8 data are preserved.
    return ::luaL_loadbuffer(toState, (char*)pBuff, bytesBuff->Length,
                             (char*)pName);
}
Oh, by the way, I found out there is a String constructor that already does all
that encoding and Marshal::Copy work. You would change
    return Marshal::PtrToStringAnsi [...]
to
    return gcnew String((sbyte*)str, 0, (int)strlen(str), Encoding::UTF8);
Original comment by a.wiedm...@raw-consult.com
on 16 Jun 2011 at 2:51
Thanks for answering.
Actually I worked on this yesterday and here is my version of luaL_loadbuffer
that fixes this issue:
static int luaL_loadbuffer(IntPtr luaState, String^ buff, String^ name)
{
    wchar_t *cs1 = (wchar_t *) Marshal::StringToHGlobalUni(buff).ToPointer();
    char *cs2 = (char *) Marshal::StringToHGlobalAnsi(name).ToPointer();
    // First call computes the required UTF-8 buffer size (it includes the
    // terminating NUL because the input length is -1).
    size_t sizeRequired = ::WideCharToMultiByte(CP_UTF8, 0, cs1, -1, NULL, 0, NULL, NULL);
    char *szTo = new char[sizeRequired + 1];
    szTo[sizeRequired] = '\0';
    ::WideCharToMultiByte(CP_UTF8, 0, cs1, -1, szTo, (int)sizeRequired, NULL, NULL);
    // CP: fix for MBCS, changed to use cs1's length (reported by qingrui.li)
    // Note: strlen stops at the first NUL byte; see the follow-up comment below.
    int result = ::luaL_loadbuffer(toState, szTo, strlen(szTo), cs2);
    delete[] szTo;   // was leaked in the original version
    Marshal::FreeHGlobal(IntPtr(cs1));
    Marshal::FreeHGlobal(IntPtr(cs2));
    return result;
}
cheers
Original comment by simonis...@gmail.com
on 16 Jun 2011 at 4:13
I just remembered something and thought of this thread.
In UTF-8, NUL is a valid byte, and .NET strings can contain embedded NULs.
Try this:
string s = "foo" + "\u0000" + "bar";
byte[] buff = Encoding.UTF8.GetBytes(s);
for (int i = 0; i < buff.Length; i++)
{
    Console.Write("{0:X2} ", buff[i]);
}
Console.WriteLine();
Output: 66 6F 6F 00 62 61 72
Although this might not happen very often, I don't think strlen is safe to use
here. I knew there was a reason I used bytesBuff->Length :)
There is no such thing as a NUL-terminated UTF-8 string.
Original comment by a.wiedm...@raw-consult.com
on 27 Jun 2011 at 1:27
Original issue reported on code.google.com by
a.wiedm...@raw-consult.com
on 28 Apr 2010 at 3:26