Open richardeakin opened 8 years ago
Hm. I've been doing some stuff with emoji in JS and have learned that JS can kind of suck with Unicode in general. ES6 is better, but the day for that rewrite is not yet here.
The OSC module being used is the latest, so that's not a fix. https://github.com/TheAlphaNerd/node-osc
Do you want to do a sample app that sends what you need as an OSC blob, and we can try to tweak `_handleOsc` to deal with it?
https://github.com/stimulant/ampm/blob/master/model/network.js#L163
It would be easiest for me to troubleshoot with a binary app that has a button that sends a message that repros the problem. And another button that sends it as a blob instead of a string?
In the meantime I committed a change that should at least keep it from crashing. It's on npm too.
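For anyone following along, a guard of this kind might look roughly like the sketch below. The actual commit isn't shown in this thread, so `parseEventPayload` is a hypothetical helper name and not ampm's real code; the idea is only that a bad payload gets logged and dropped instead of throwing.

```javascript
// Hypothetical helper, not the actual ampm change: parse an OSC string
// argument as JSON and return null instead of throwing on bad input.
function parseEventPayload(raw) {
  try {
    return JSON.parse(raw);
  } catch (err) {
    // Log and drop the message so one malformed event can't take down the server.
    console.warn('Dropping unparseable event payload: ' + err.message);
    return null;
  }
}

// A well-formed payload parses normally.
const okPayload = parseEventPayload('{"Category":"category","Action":"action"}');
// A payload containing a raw control character (U+0012) is rejected, not fatal.
const badPayload = parseEventPayload('{"Category":"Writing Chicago\u0012s Story"}');
```

This keeps the server up, but the event is still lost, so it doesn't fix the underlying encoding problem.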
Ah, I didn't realize that `_handleOsc` is where the JSON is parsed before being sent to `_logEvent`.
FWIW, if I send something like "Writing Chicago\u2019s Story" as the event's Category value (with the ampm Cinder sample), it shows up inside `_handleOsc` as "Writing Chicago\u0012s Story", and JSON.parse() throws an exception: `[SyntaxError: Unexpected token ]`. So this is where it fails to parse the string. Not sure if there's a way to have JSON.parse handle UTF-16 data, but might be worth looking into that.
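A quick check in Node suggests JSON.parse itself is fine with U+2019. What it rejects is a raw control character like U+0012, since JSON doesn't allow unescaped characters in the U+0000 to U+001F range inside strings. A minimal sketch (`parsesAsJson` is just an illustrative helper, and the two payloads are made up to mirror the strings above):

```javascript
// Returns true if JSON.parse accepts the text, false if it throws.
function parsesAsJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

// The \u escapes below are resolved by the JavaScript string literal, so
// JSON.parse receives the raw characters, as it would from the network.
const withCurlyQuote = '{"Category":"Writing Chicago\u2019s Story"}';
const withControlChar = '{"Category":"Writing Chicago\u0012s Story"}';

console.log(parsesAsJson(withCurlyQuote));  // U+2019 is legal in a JSON string
console.log(parsesAsJson(withControlChar)); // a raw U+0012 is not
```

So the exception is a symptom: the string is already damaged by the time it reaches JSON.parse.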
Maybe we need to do some escaping in the ampm client library? http://stackoverflow.com/a/11654338/468472
Or escape it in `_handleOsc` before passing it to `JSON.parse`.
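To illustrate the first option: the idea would be to turn every non-ASCII character into a `\uXXXX` escape before the JSON goes over the wire, so the OSC string is plain ASCII. The real client is C++, so this JavaScript version is only a sketch of the transformation, and `toAsciiJson` is a hypothetical name:

```javascript
// Hypothetical helper: serialize to JSON, then replace every non-ASCII
// UTF-16 code unit with its \uXXXX escape so the result is pure ASCII.
function toAsciiJson(value) {
  return JSON.stringify(value).replace(/[\u007f-\uffff]/g, function (ch) {
    return '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0');
  });
}

const asciiPayload = toAsciiJson({ Category: 'Writing Chicago\u2019s Story' });
console.log(asciiPayload);

// JSON.parse turns the escape back into the original character.
const roundTripped = JSON.parse(asciiPayload);
```

Escaping on the receiving side is less likely to help, since by then the original character has already been replaced.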
@endquote to reproduce: I can trigger it on Mac with the provided Cinder sample by changing this line to the following (not sure if this will work on Windows, as you can't directly write UTF-16 in strings until vc140):
string action = "Writing Chicago\u2019s Story";
AMPMClient::get()->sendEvent( "category", action, "label", 10 );
@LRitesh hm, so it shows up as just one char off?
I've only briefly researched the js side of things, but this gist makes me think that JSON.parse can handle UTF-16 just fine (as does cinder's jsoncpp), and the problem is likely to be in node-osc's parsing, which expects ASCII chars. Here's a [vvvv forum post](https://vvvv.org/forum/encode-problem-with-strings-and-umlauts-via-udposcdecoder) with similar issues. I believe they recommend sending the data as a blob as well, base64 encoding it first. OSC can be a pain sometimes, huh.
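The base64 idea, sketched as a round trip in Node. In practice the encode step would live in the C++ client, so this only shows that the encoded text is ASCII-safe and decodes back to the same JSON; the variable names are made up for the example:

```javascript
const eventJson = JSON.stringify({ Category: 'Writing Chicago\u2019s Story' });

// Sender side: UTF-8 bytes -> base64 text, which is plain ASCII.
const encodedEvent = Buffer.from(eventJson, 'utf8').toString('base64');

// Receiver side: base64 text -> UTF-8 bytes -> string -> object.
const decodedEvent = JSON.parse(Buffer.from(encodedEvent, 'base64').toString('utf8'));

console.log(encodedEvent);
console.log(decodedEvent.Category);
```

The cost is a payload roughly a third larger and one that's no longer human-readable on the wire.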
I don't think we need to double-escape the utf-16 chars, we're able to read down the json OK from our CMS as-is, load it in the app and display it. It's just once it gets sent over OSC that things go awry.
@endquote I tried `decodeURIComponent(message[1])` before sending it to `_logEvent`, but I'm running into the same issue still.
@richardeakin I was looking at that gist as well. It does sound like the issue might be in node-osc's parsing, because it's replacing \u2019 with \u0012 before JSON.parse is even used on the string.
How do you get the Mac Cinder sample to build on Yosemite? I don't think environment variables are really a thing anymore. Do we need to update the readme? https://github.com/stimulant/ampm/blob/master/samples/Cinder/README.md
I haven't added anything related to app transport myself, but I've only done brief testing on Mac while our main target is win10.
An issue came up where we were trying to send an event for analytics that contained a utf-16 char, for example "Writing Chicago\u2019s Story". We can correctly draw this string in the application, however when we sent that to AMPM (using something like this function), it would crash the server with the following printed to console:
So, the JSON appears in node.js as malformed. We've been investigating whether this is a problem with Cinder's OSC implementation, however I don't believe it is - ci::osc just serializes the std::string as raw bytes. However, @LRitesh has pointed out that the OSC spec specifies that `OSC-string` should be only ASCII. Cinder's implementation doesn't seem to care, but perhaps the JSON parse in JavaScript does? This is the extent of my knowledge on the situation.

If it turns out to be that we just aren't allowed to send unicode chars as `OSC-string` to AMPM, one solution we've thought about is whether we could send the events / logs as `OSC-blob` instead, so they can be parsed with unicode chars and all on the other end.

Also, am I correct in thinking that analytics.js supports utf8 / 16?
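On the receiving end, handling a blob could be as small as the sketch below. It assumes the OSC library hands blob arguments to the handler as a Node `Buffer`, which I haven't verified for node-osc, and `parseBlobPayload` is a hypothetical name:

```javascript
// Hypothetical receiver-side helper: treat an OSC-blob argument as UTF-8 JSON.
// Assumes the blob arrives as a Node Buffer (not verified for node-osc).
function parseBlobPayload(blob) {
  return JSON.parse(blob.toString('utf8'));
}

// Stand-in for the bytes a client would send: UTF-8 encoded JSON text.
const blobBytes = Buffer.from('{"Category":"Writing Chicago\u2019s Story"}', 'utf8');

const blobEvent = parseBlobPayload(blobBytes);
console.log(blobEvent.Category);
```

Since the bytes are decoded explicitly as UTF-8, nothing in between gets a chance to reinterpret them as ASCII.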