Open tobil4sk opened 1 year ago
Just a little correction here, HXCPP (with smart strings enabled) never uses UTF-8 for strings. It uses ASCII and when any character in a string doesn't fit within ASCII range then the entire string is converted to UTF-16. HXCPP, however, provides a function to convert it's strings to UTF-8.
Yes, that's correct. However, an ASCII string is valid utf-8, so when using hxs_utf8 on an hxcpp string which is ASCII encoded, it just returns a pointer to the ASCII string without any conversion being necessary.
I was wondering if you point that out :) I thought you might know, my apologies for stating the obvious.
No worries, it's worth making the distinction :)
There are a few more places where wstrings are converted to utf8 still. These include most of the system functions, though these are windows only, where wchar is equal to char16_t so hl_to_utf8
/alloc_wstring
can be used there. For System::GetDirectory
, it can be modified to avoid wstrings.
The bigger issue is Font::GetFamilyName()
in text/Font.cpp
. Currently it assumes that wchar is 16 bits, which is only the case on Windows. Even if it used char16_t, the alloc_hxs_utf16()
function returns an HxString, and I can't seem to figure out to turn that into a value
.
Currently, there are a lot of redundant string conversions when using the file dialog api. This was mentioned briefly in #1622.
Here is a rough overview of the current situation:
Linux/macOS:
hxstring
->std::wstring
->std::string
, passed into tinyfiledialogs utf-8 api, thenstd::string
->std::wstring
->hxstring
hl_vstring
->std::wstring
->std::string
, passed into tinyfiledialogs utf-8 api, thenstd::string
->std::wstring
->utf-8
bytes -> nativeutf-16
hashlink string (on Haxe side viaString.fromUtf8()
)Windows:
hxstring
->std::wstring
, passed into tinyfiledialogs utf-16 api, thenstd::wstring
->hxstring
hl_vstring
->std::wstring
, passed into tinyfiledialogs utf-16 api, thenstd::wstring
->utf-8
bytes -> nativeutf-16
hashlink string (on Haxe side viaString.fromUtf8()
)This cleans things up to avoid unnecessary conversions, so now it looks like this in most cases:
Linux/macOS:
hxstring
->utf-8
[0], passed into tinyfiledialogs utf-8 api, thenutf-8
->hxstring
hl_vstring
->utf-8
, passed into tinyfiledialogs utf-8 api, thenutf-8
->utf-16
hashlink string (on Haxe side viaString.fromUtf8()
)Windows:
hxstring
->utf-16
[0], passed into tinyfiledialogs utf-16 api, thenutf-16
->hxstring
hl_vstring
, passed into tinyfiledialogs utf-16 api, thenutf-16
->utf8
->utf-16
hashlink string (on Haxe side viaString.fromUtf8()
)[0] hxcpp can use ASCII (which is compatible with utf-8) or utf-16 depending on the string, so these conversions are avoided in some cases.
We can improve Hashlink further (for Windows), by returning utf-16 strings from the apis, however, I'm not sure if this would be considered a breaking change (new lime.hdlls would be incompatible with old lime haxe files).
In #1622, we briefly discussed using the utf-8 tinyfiledialog functions on Windows to unify the api, however, I looked into this and on Windows these functions just convert back to utf-16, so there would still be extra conversions. So I think this is the cleanest solution.
I've tested on Linux and Windows with the code from #1622, and everything works well.