stimulant / ampm

application management + performance monitoring
MIT License

sending events / logs with unicode chars #24

Open richardeakin opened 8 years ago

richardeakin commented 8 years ago

An issue came up where we were trying to send an analytics event that contained a non-ASCII (Unicode) char, for example "Writing Chicago\u2019s Story". We can draw this string correctly in the application, but when we sent it to AMPM (using something like this function), it crashed the server with the following printed to the console:

2016-06-27T21:39:30.789Z - warn: OSC messages should be JSON
C:\Users\potion-nw-wall\projects\northwestern\nw\tools\monitoring\ampm\model\logging.js:233            
this._google.event(data.Category, data.Action, data.Label, data.Value);

TypeError: Cannot read property 'Category' of null
at exports.Logging.BaseModel.extend._logEvent (C:\Users\potion-nw-wall\projects\northwestern\nw\tools\monitoring\ampm\model\logging.js:233:36)
at wrapper (C:\Users\potion-nw-wall\projects\northwestern\nw\tools\monitoring\ampm\node_modules\lodash\index.js:3095:19)
at emitOne (events.js:77:13)
at emit (events.js:169:7)
at exports.Network.BaseModel.extend._handleOsc (C:\Users\potion-nw-wall\projects\northwestern\nw\tools\monitoring\ampm\model\network.js:175:19)
at null.<anonymous> (C:\Users\potion-nw-wall\projects\northwestern\nw\tools\monitoring\ampm\model\network.js:152:22)
at wrapper (C:\Users\potion-nw-wall\projects\northwestern\nw\tools\monitoring\ampm\node_modules\lodash\index.js:3095:19)
at emitTwo (events.js:87:13)
at emit (events.js:172:7)
at Socket.<anonymous> (C:\Users\potion-nw-wall\projects\northwestern\nw\tools\monitoring\ampm\node_modules\node-osc\lib\Server.js:27:20)

[nodemon] app crashed
[nodemon] exiting
[nodemon] 1.9.2
[nodemon] to restart at any time, enter `rs`
[nodemon] ignoring: .git .nyc_output .sass-cache bower_components coverage node_modules logs C:\Users\potion-nw-wall\projects\northwestern\nw\tools\monitoring\ampm-state.json
[nodemon] watching: C:\Users\potion-nw-wall\projects\northwestern\nw\tools\monitoring\ampm.json C:\Users\potion-nw-wall\projects\northwestern\nw\tools\monitoring\ampm-restart.json
[nodemon] watching extensions: js,json
[nodemon] starting `node C:\Users\potion-nw-wall\AppData\Roaming\npm\node_modules\ampm\server.js ..\tools\monitoring\ampm.json default`
[nodemon] child pid: 11200
[nodemon] watching 2 files

Merging config: default
Merging config: default
Merging config: DESKTOP-RNH48JB
Merging config: DESKTOP-RNH48JB.default
Server starting up.
2016-06-27T21:39:32.219Z - info: App starting up.

So, the JSON arrives in node.js malformed. We've been investigating whether this is a problem with Cinder's OSC implementation, but I don't believe it is: ci::osc just serializes the std::string as raw bytes. However, @LRitesh has pointed out that the OSC spec says an OSC-string should contain only ASCII. Cinder's implementation doesn't seem to care, but perhaps the JSON parse in JavaScript does? This is the extent of my knowledge of the situation.

If it turns out that we just aren't allowed to send Unicode chars as an OSC-string to AMPM, one solution we've thought about is sending the events / logs as an OSC-blob instead, so they can be parsed, Unicode chars and all, on the other end.
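
A rough sketch of what the receiving side of the blob idea might look like. It assumes the client writes the JSON as UTF-8 bytes and that the OSC library hands the blob argument over as a Node.js Buffer, which has not been verified for node-osc. `decodeBlobEvent` is a hypothetical helper, not part of ampm.

```javascript
// Hypothetical helper: decode an OSC-blob payload that carries UTF-8 encoded JSON.
// Assumption: the blob arrives as a Node.js Buffer (not verified for node-osc).
function decodeBlobEvent(blob) {
  const text = blob.toString('utf8'); // bytes -> JS string, non-ASCII chars preserved
  return JSON.parse(text);
}

// Simulate the bytes a client would put on the wire.
const wireBytes = Buffer.from(
  JSON.stringify({ Category: 'Writing Chicago\u2019s Story' }),
  'utf8'
);

const blobEvent = decodeBlobEvent(wireBytes);
console.log(blobEvent.Category);
```

Because the blob is treated as opaque bytes, the ASCII-only restriction on OSC-string never comes into play.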

Also, am I correct in thinking that analytics.js supports utf8 / 16?

endquote commented 8 years ago

Hm. I've been doing some stuff with emoji in JS and have learned that JS can kind of suck with Unicode in general. ES6 is better, but the day for that rewrite is not yet here.

The OSC module being used is the latest, so that's not a fix. https://github.com/TheAlphaNerd/node-osc

Do you want to do a sample app that sends what you need as an OSC blob, and we can try to tweak _handleOsc to deal with it? https://github.com/stimulant/ampm/blob/master/model/network.js#L163

It would be easiest for me to troubleshoot with a binary app that has a button that sends a message that repros the problem. And another button that sends it as a blob instead of a string?

In the meantime I committed a change that should at least keep it from crashing. It's on npm too.
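
The commit itself is not shown in this thread. As a minimal sketch of the kind of guard that would avoid the crash above, assuming the handler receives the raw JSON text as a string, something like the following could work. `parseEventSafely` and `logEvent` are hypothetical names, not the actual ampm functions.

```javascript
// Hypothetical guard: return null instead of throwing on malformed JSON.
function parseEventSafely(raw) {
  try {
    const data = JSON.parse(raw);
    // JSON.parse('null') succeeds, so non-object results are rejected as well.
    return data !== null && typeof data === 'object' ? data : null;
  } catch (err) {
    return null;
  }
}

// Hypothetical stand-in for the logging step: skip bad payloads rather than crash.
function logEvent(raw) {
  const data = parseEventSafely(raw);
  if (!data) {
    console.warn('Ignoring malformed event payload');
    return false;
  }
  console.log('event:', data.Category, data.Action, data.Label, data.Value);
  return true;
}

logEvent('{"Category":"category","Action":"action","Label":"label","Value":10}');
logEvent('this is not JSON');
```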

LRitesh commented 8 years ago

Ah, I didn't realize that _handleOsc is where the JSON is parsed before being sent to _logEvent.

FWIW, if I send something like "Writing Chicago\u2019s Story" as the event's Category value (with the ampm Cinder sample), it shows up inside _handleOsc as "Writing Chicago\u0012s Story". And JSON.parse() throws an exception [SyntaxError: Unexpected token ]. So this is where it fails to parse the string. Not sure if there's a way to have JSON.parse handle UTF-16 data, but might be worth looking into that.
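
For illustration, the failure can be reproduced in isolation, assuming the mangled string really contains a raw U+0012 character as reported above. JSON does not allow unescaped control characters inside strings, so the parse fails, while the intact string parses fine.

```javascript
// U+0012 is a control character; JSON forbids raw control characters in strings.
const mangled = '{"Category":"Writing Chicago\u0012s Story"}';

let parseError = null;
try {
  JSON.parse(mangled);
} catch (err) {
  parseError = err;
}
console.log(parseError instanceof SyntaxError);

// The original character (U+2019) is legal inside a JSON string.
const intact = '{"Category":"Writing Chicago\u2019s Story"}';
const parsedIntact = JSON.parse(intact);
console.log(parsedIntact.Category);
```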

endquote commented 8 years ago

Maybe we need to do some escaping in the ampm client library? http://stackoverflow.com/a/11654338/468472

Or escape it in _handleOsc before passing it to JSON.parse.
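
A sketch of the escaping idea, shown in JavaScript only to illustrate the transformation (the real client is C++ and would need an equivalent). It rewrites every non-ASCII UTF-16 code unit as a \uXXXX escape so the serialized JSON is pure ASCII. `toAsciiJson` is a hypothetical helper.

```javascript
// Hypothetical helper: serialize to JSON using only ASCII characters.
// Each non-ASCII UTF-16 code unit becomes a \uXXXX escape; surrogate pairs
// are escaped one unit at a time, and JSON.parse recombines them.
function toAsciiJson(value) {
  return JSON.stringify(value).replace(/[\u0080-\uffff]/g, (ch) =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0')
  );
}

const asciiPayload = toAsciiJson({ Category: 'Writing Chicago\u2019s Story' });
console.log(asciiPayload);

// The receiver needs no special handling: JSON.parse understands the escapes.
const roundTripped = JSON.parse(asciiPayload);
console.log(roundTripped.Category);
```

Escaping on the sending side keeps the payload a valid OSC-string, so the server would not need to change.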

richardeakin commented 8 years ago

@endquote to reproduce: I can on Mac with the provided Cinder sample by changing this line to the following (not sure if this will work on Windows, as you can't directly write utf16 in strings until vc140):

string action = "Writing Chicago\u2019s Story";
AMPMClient::get()->sendEvent( "category", action, "label", 10 );

@LRitesh hm, so it shows up as just one char off?

I've only briefly researched the js side of things, but this gist makes me think that JSON.parse can handle UTF-16 just fine (as does cinder's jsoncpp), and the problem is likely to be in node-osc's parsing, which expects ASCII chars. Here's a vvvv forum post with similar issues: https://vvvv.org/forum/encode-problem-with-strings-and-umlauts-via-udposcdecoder. I believe they recommend sending the data as a blob as well, base64 encoding it first. OSC can be a pain sometimes, huh.
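
A minimal sketch of the base64 round trip mentioned above, assuming both sides agree that the payload is UTF-8 JSON encoded as base64. `encodeForOsc` and `decodeFromOsc` are hypothetical names, and the sender is written in JavaScript only for illustration.

```javascript
// Sender side (illustrative only; the real client is a Cinder/C++ app).
function encodeForOsc(value) {
  return Buffer.from(JSON.stringify(value), 'utf8').toString('base64');
}

// Receiver side: undo the base64 step, then parse the UTF-8 JSON.
function decodeFromOsc(base64Text) {
  return JSON.parse(Buffer.from(base64Text, 'base64').toString('utf8'));
}

const base64Wire = encodeForOsc({ Action: 'Writing Chicago\u2019s Story' });
console.log(base64Wire); // ASCII only, so it is safe for an ASCII-only transport

const decodedEvent = decodeFromOsc(base64Wire);
console.log(decodedEvent.Action);
```

The trade-off is that the payload is no longer human-readable on the wire and grows by roughly a third.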

I don't think we need to double-escape the utf-16 chars: we're able to read the JSON down from our CMS as-is, load it in the app, and display it. It's only once it gets sent over OSC that things go awry.

LRitesh commented 8 years ago

@endquote I tried decodeURIComponent(message[1]) before sending it to _logEvent, but I'm still running into the same issue.

@richardeakin I was looking at that gist as well. It does sound like the issue might be in node-osc's parsing, because \u2019 is replaced with \u0012 before JSON.parse is even called on the string.

endquote commented 8 years ago

How do you get the Mac Cinder sample to build on Yosemite? I don't think environment variables are really a thing anymore. Do we need to update the readme? https://github.com/stimulant/ampm/blob/master/samples/Cinder/README.md

richardeakin commented 8 years ago

I haven't added anything related to app transport myself, but I've only done brief testing on Mac while our main target is win10.