Open sethhall opened 4 months ago
This still needs to be done, because the decode() method was clearly built with this in mind. As a stopgap, I have a UTF-16 string reader (it converts to UTF-8 internally) implemented natively in Spicy here: https://github.com/sethhall/spicy-parsers/blob/main/unicode/utf16.spicy
Hello, I'd like to help solve this issue, but I'm having difficulty finding where exactly the problem is. Could someone please help me locate where decode() is?
Implementing the runtime part would go roughly like the following:
- Add a UTF16 Charset value here: https://github.com/zeek/spicy/blob/943dea8d284c3b6fd65426e6e22abce1669ceeb1/hilti/runtime/include/types/bytes.h#L42
- Handle Charset::UTF16 in Bytes::decode here: https://github.com/zeek/spicy/blob/943dea8d284c3b6fd65426e6e22abce1669ceeb1/hilti/runtime/src/types/bytes.cc#L105-L132
The C++ unit test for Bytes::decode should also be updated here: https://github.com/zeek/spicy/blob/943dea8d284c3b6fd65426e6e22abce1669ceeb1/hilti/runtime/src/tests/bytes.cc#L64-L83
To make this available in Spicy code it needs to be added to both HILTI as well as Spicy:
@Ethanholtking Did that help? Are you working on this?
It looks like the current implementation can only decode ASCII and UTF-8 into a string, and the library currently in use is strictly for UTF-8. To support anything with Windows roots, it would be nice to support UTF-16.
I poked around for a few minutes and found a small library that might work for decoding UTF-16 into a string type: https://github.com/nemtrif/utfcpp
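For a sense of how small the decoding step itself is, here is a minimal standalone sketch of UTF-16 to UTF-8 conversion using only the C++ standard library. It is an illustration, not the runtime's code: `appendUtf8` and `decodeUtf16LE` are hypothetical names, the input is assumed to be little-endian with no BOM, and malformed input is replaced with U+FFFD, which may not match how Bytes::decode is expected to report errors.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the HILTI runtime):
// appends the UTF-8 encoding of code point `cp` to `out`.
static void appendUtf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: decodes little-endian UTF-16 bytes into UTF-8.
// Assumption: unpaired surrogates and a trailing odd byte become U+FFFD.
std::string decodeUtf16LE(const std::string& in) {
    std::string out;
    std::size_t i = 0;

    // Reads one 16-bit little-endian code unit starting at `pos`.
    auto unit = [&in](std::size_t pos) -> std::uint32_t {
        return static_cast<unsigned char>(in[pos]) |
               (static_cast<unsigned char>(in[pos + 1]) << 8);
    };

    while (i + 1 < in.size()) {
        std::uint32_t u = unit(i);
        i += 2;

        if (u >= 0xD800 && u <= 0xDBFF) {
            // High surrogate: valid only if a low surrogate follows.
            if (i + 1 < in.size()) {
                std::uint32_t lo = unit(i);
                if (lo >= 0xDC00 && lo <= 0xDFFF) {
                    i += 2;
                    appendUtf8(out, 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00));
                    continue;
                }
            }
            appendUtf8(out, 0xFFFD);
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            // Low surrogate without a preceding high surrogate.
            appendUtf8(out, 0xFFFD);
        } else {
            appendUtf8(out, u);
        }
    }

    if (i < in.size())
        appendUtf8(out, 0xFFFD); // dangling odd byte at the end

    return out;
}
```

For example, `decodeUtf16LE(std::string("H\0i\0", 4))` yields `"Hi"`. A real Charset::UTF16 branch would also need to decide on byte order (or BOM handling) and on how invalid sequences are reported, which this sketch deliberately leaves open.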