snowplow / snowplow-actionscript3-tracker

Snowplow event tracker for ActionScript 3.0. Add analytics to your Flash Player 9+, Flash Lite 4 and AIR games, apps and widgets
http://snowplowanalytics.com

Make sure a custom context sent without base64 encoding is URL-encoded #12

Closed yalisassoon closed 9 years ago

yalisassoon commented 9 years ago

I can see lines in the bad rows bucket that fail because characters such as { appear unencoded in the querystring when a custom context is sent into Snowplow without being base64 encoded:

{
    "line": "2015-01-18\t12:03:39\t-\t37\t212.179.221.252\tGET\t212.179.221.252\t/i\t200\t-\tMozilla%2F5.0+%28Windows+NT+6.1%3B+WOW64%3B+Trident%2F7.0%3B+rv%3A11.0%29+like+Gecko\tti_ca=Food&ti_qu=1&tv=as3-0.1.0&aid=cloudfront&co={\"schema\":\"iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0\",\"data\":[{\"schema\":\"iglu:com.snowplowanalytics.snowplow/link_click/jsonschema/1-0-1\",\"data\":{\"targetUrl\":\"http://www.google.com\"}},{\"schema\":\"iglu:com.snowplowanalytics.snowplow/flash_context/jsonschema/1-0-0\",\"data\":{\"hasLocalStorage\":true,\"stageSize\":{\"height\":893,\"width\":1131},\"version\":\"WIN%2011,1,102,63\",\"isDebugger\":true,\"playerType\":\"ActiveX\",\"hasScriptAccess\":true}}]}&tna=AF003&ti_id=order-8&ti_sk=no_sku&ti_nm=Big%20Order&e=ti&ti_cu=USD&dtm=1421582629744&ti_pr=34&p=pc&eid=4fc4b461-fb1f-c833-0c6e-a689bd4a9cd5&cv=clj-0.9.0-tom-0.1.0&nuid=8829f62e-5376-49ab-9bcf-09793ffd2645\t-\t-\t-\t-\t-",
    "errors": [
        "Exception extracting name-value pairs from querystring [ti_ca=Food&ti_qu=1&tv=as3-0.1.0&aid=cloudfront&co={\"schema\":\"iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0\",\"data\":[{\"schema\":\"iglu:com.snowplowanalytics.snowplow/link_click/jsonschema/1-0-1\",\"data\":{\"targetUrl\":\"http://www.google.com\"}},{\"schema\":\"iglu:com.snowplowanalytics.snowplow/flash_context/jsonschema/1-0-0\",\"data\":{\"hasLocalStorage\":true,\"stageSize\":{\"height\":893,\"width\":1131},\"version\":\"WIN%2011,1,102,63\",\"isDebugger\":true,\"playerType\":\"ActiveX\",\"hasScriptAccess\":true}}]}&tna=AF003&ti_id=order-8&ti_sk=no_sku&ti_nm=Big%20Order&e=ti&ti_cu=USD&dtm=1421582629744&ti_pr=34&p=pc&eid=4fc4b461-fb1f-c833-0c6e-a689bd4a9cd5&cv=clj-0.9.0-tom-0.1.0&nuid=8829f62e-5376-49ab-9bcf-09793ffd2645] with encoding [UTF-8]: [Illegal character in query at index 68: http://localhost/?ti_ca=Food&ti_qu=1&tv=as3-0.1.0&aid=cloudfront&co={\"schema\":\"iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0\",\"data\":[{\"schema\":\"iglu:com.snowplowanalytics.snowplow/link_click/jsonschema/1-0-1\",\"data\":{\"targetUrl\":\"http://www.google.com\"}},{\"schema\":\"iglu:com.snowplowanalytics.snowplow/flash_context/jsonschema/1-0-0\",\"data\":{\"hasLocalStorage\":true,\"stageSize\":{\"height\":893,\"width\":1131},\"version\":\"WIN%2011,1,102,63\",\"isDebugger\":true,\"playerType\":\"ActiveX\",\"hasScriptAccess\":true}}]}&tna=AF003&ti_id=order-8&ti_sk=no_sku&ti_nm=Big%20Order&e=ti&ti_cu=USD&dtm=1421582629744&ti_pr=34&p=pc&eid=4fc4b461-fb1f-c833-0c6e-a689bd4a9cd5&cv=clj-0.9.0-tom-0.1.0&nuid=8829f62e-5376-49ab-9bcf-09793ffd2645]"
    ]
}
natilivni commented 9 years ago

Does this not mean that when using HTTP GET, base64Encoded must always be set to true? If so, what is the purpose of allowing users to set it in the tracker constructor? Shouldn't it be based on the HTTP verb set in the emitter?

alexanderdean commented 9 years ago

When using HTTP GET, if base64Encoded is set to true, then we send the context as cx={{base64-encoded JSON}}. If base64Encoded is set to false, then we send the context as co={{uri-encoded JSON}}. Either way, we do some prep work to ensure that the querystring doesn't get corrupted...
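
The two branches described above can be sketched in Python. This is only an illustration of the cx/co choice, not the tracker's actual ActionScript code: the function name build_context_param is hypothetical, and percent-encoding every reserved character is an assumption about what "uri-encoded" means here.

```python
import base64
import json
from urllib.parse import quote, unquote


def build_context_param(context, base64_encoded):
    """Return the (name, value) pair for the context on a GET querystring.

    Hypothetical helper: only the parameter names cx and co come from the thread.
    """
    raw = json.dumps(context, separators=(",", ":"))
    if base64_encoded:
        # URL-safe Base64; the "=" padding is kept in this sketch.
        value = base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")
        return "cx", value
    # safe="" percent-encodes every reserved character, including { } " : / [ ]
    return "co", quote(raw, safe="")


context = {
    "schema": "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0",
    "data": [],
}

for flag in (True, False):
    name, value = build_context_param(context, flag)
    print(name + "=" + value)
```

Either way the value contains no raw braces or quotes, so the querystring stays parseable on the collector side.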

natilivni commented 9 years ago

got it.

natilivni commented 9 years ago

Do the characters []/ also need encoding?

alexanderdean commented 9 years ago

For Base64 encoding, we use the URL-safe variant. For URI encoding, we just use the standard.
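
To see why the URL-safe variant matters, here is a small Python sketch (not tracker code). The two input bytes are chosen purely for illustration because they map onto the two alphabet positions where the standard and URL-safe alphabets differ.

```python
import base64

# Two bytes picked so the output uses alphabet positions 62 and 63.
data = b"\xfb\xff"

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")

print(standard)  # uses "+" and "/", which are unsafe inside a querystring
print(url_safe)  # uses "-" and "_" in the same positions
```

The standard alphabet would put "+" (read as a space by many querystring decoders) and "/" into the value, so the URL-safe alphabet avoids a second round of escaping.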

alexanderdean commented 9 years ago

It looks like RFC 1738 frowns on { and } as well (section 2.2): http://tools.ietf.org/html/rfc1738

So I'm guessing that the Java URI decoder is failing on those, so we need to URI encode those too.

For Base64-url-safe, the doco is here: https://tools.ietf.org/html/rfc4648#page-7
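
A quick way to check the point about { and }, and the earlier question about [ ] and /, is to scan a value for characters that RFC 1738 section 2.2 lists as unsafe, before and after percent-encoding. This is a Python sketch under assumptions: unsafe_chars is a hypothetical helper, and the set below is only a subset of the RFC's unsafe list (space, #, % and ~ are left out).

```python
from urllib.parse import quote

# Subset of the characters RFC 1738 section 2.2 calls unsafe.
UNSAFE = set('{}|\\^[]`"<>')


def unsafe_chars(text):
    """Return the sorted unsafe characters that appear raw in text."""
    return sorted(set(text) & UNSAFE)


raw = '{"data":[{"targetUrl":"http://www.google.com"}]}'
encoded = quote(raw, safe="")  # safe="" also encodes "/" and ":"

print(unsafe_chars(raw))      # braces, brackets and quotes are present
print(unsafe_chars(encoded))  # nothing unsafe remains
print(encoded)
```

With safe="" the brackets and slashes are percent-encoded along with the braces, so a strict URI parser has nothing left to reject.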

natilivni commented 9 years ago

fixed in commit 00fa373e76701447a307fc9051f624236f85769d

natilivni commented 9 years ago

After we fix a bug, should we assign it to someone for testing or should we simply close it?

alexanderdean commented 9 years ago

For the 0.1.0 release, closing is fine.