snowplow / snowplow-actionscript3-tracker

Snowplow event tracker for ActionScript 3.0. Add analytics to your Flash Player 9+, Flash Lite 4 and AIR games, apps and widgets
http://snowplowanalytics.com

Nature of detectSignature method #20

Closed natilivni closed 8 years ago

natilivni commented 9 years ago

I am trying to rewrite the detectSignature method from JavaScript in ActionScript. However, some of the fields used for this signature are sometimes unavailable in the Flash environment. Is it best practice, in this case, to return null for the signature, or to try to build one from the pieces of data that I am still able to get?

alexanderdean commented 9 years ago

Hey @natilivni - that's a good question! I would say we should build a signature from as many pieces of data as are available. I think the question then is whether we use the existing signature field (confusing as it's not the same signature as the JS Tracker's) or create a new field in the flash_context? @yalisassoon thoughts?
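To make the "use whatever is available" idea concrete, here is a minimal sketch in Python (not ActionScript). The field names are hypothetical, and CRC32 is only a stand-in hash chosen because it ships with the standard library; it is not claimed to be the hash the JS Tracker uses, so values will not match JS Tracker fingerprints.

```python
import zlib


def detect_signature(fields):
    """Hash whichever fingerprint fields are available.

    `fields` maps a (hypothetical) field name to its value, or to None
    when the Flash environment cannot supply it. Returns None only when
    no field at all is available.
    """
    # Sort by name so the result does not depend on dictionary order.
    available = [
        f"{name}={value}"
        for name, value in sorted(fields.items())
        if value is not None
    ]
    if not available:
        return None
    # CRC32 is an illustrative stand-in, not the JS Tracker's hash.
    return zlib.crc32(";".join(available).encode("utf-8"))


# Hypothetical inputs: the time zone could not be read in this environment.
partial = {"playerVersion": "MAC 11,7,700,225", "stageSize": "500x375", "timezone": None}
signature = detect_signature(partial)
```

A missing field is simply skipped, so a field reported as None and a field that is absent produce the same signature.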

natilivni commented 9 years ago

For now, I am just implementing two separate signatures, one for cookies and the other for shared objects.

natilivni commented 9 years ago

Please note the following three new subject params that I added in commit 8715a620301dc4b3d0509efd348a0c3a52ade8f9:

https://github.com/snowplow/snowplow-actionscript3-tracker/blob/master/snowplow-as3-tracker/src/com/snowplowanalytics/snowplow/tracker/Parameter.as#L46-49

They are the flash shared object storage equivalent of their corresponding cookie based parameters. @alexanderdean, please look at the naming convention and let me know if any changes should be made.

alexanderdean commented 9 years ago

Hi @natilivni - these three fields look great, however we have a freeze on adding new fields to the Tracker Protocol (preferring custom contexts instead), so these fields will need to go into the flash_context instead. Suggested names:

These are based on the names found in the atomic.events, tweaked to make them a bit more JSON-idiomatic. @yalisassoon can you review these names and make any changes?

Also: I wasn't 100% sure about referring to domain in the property names but a quick read of the documentation suggests that Flash's shared object(s) are domain-specific, so quite analogous to first-party cookies...

natilivni commented 9 years ago

Thanks @alexanderdean, I have changed the field names and moved them to the flash_context payload in commit 8787b36922814894cef27147ca9473b75fcce8a4.

@ephraimtabackman, can you please update the schema and push the patch to iglu?

ephraimtabackman commented 9 years ago

I updated the schema and pushed to branch feature/flash_context in my fork. Please pull.

alexanderdean commented 9 years ago

Done, https://github.com/snowplow/iglu-central/pull/120 This will be released later today...

natilivni commented 9 years ago

Follow-up question: can someone please provide me with a signature that is known to be correct in the JavaScript environment, along with the data conditions used to produce it? I want to write some unit tests for the new code. Same question for domainUserId. Thanks!

alexanderdean commented 9 years ago

Hey @fblundun - do we have any unit tests that check known browser fingerprints in known browser environments (e.g. in Selenium/SauceLabs)?

A separate point I forgot to mention, apologies @natilivni - we are planning on moving the JS Tracker's domainUserId to be a UUID. It's just simpler than creating a custom ID. The ticket is here: https://github.com/snowplow/snowplow-javascript-tracker/issues/274

We might as well start the Flash domain user ID as a UUID too...

fblundun commented 9 years ago

There are some browser fingerprints used for SauceLabs tests here: https://github.com/snowplow/snowplow-javascript-tracker/blob/master/tests/functional/detectors.js#L49-L61

natilivni commented 9 years ago

@alexanderdean, I think we should coordinate that for the same milestone. It is important for users that the JavaScript and ActionScript trackers produce the same results for shared fields, since they are often both part of the same web application. Therefore, until the JavaScript tracker actually moves to UUID, we should keep this one the same. Is the timetable for that milestone short enough that it can be released in conjunction with the ActionScript tracker?

alexanderdean commented 9 years ago

Hey @natilivni - the JavaScript Tracker is blocked on moving to UUID because the domain_userid field in Redshift is too short. The ActionScript Tracker's domainUserid in the flash_context doesn't have the same limitation (because it will go into a new flash_context table that we create).

However I totally see your point - if you are able to set both the duid (i.e. cookies are functional) and the flash_context.domainUserid from the Flash Tracker, we should want to set them to the same value. So I've created a ticket here: https://github.com/snowplow/snowplow-actionscript3-tracker/issues/21
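A rough sketch of that rule in Python: reuse an ID if one already exists, otherwise mint a UUID, and send the one result as both duid and flash_context.domainUserid. The function name, its parameters, and the cookie-first preference order are all assumptions made for illustration; the linked ticket is where the actual behaviour gets decided.

```python
import uuid


def resolve_domain_user_id(cookie_duid=None, stored_duid=None):
    """Return the single ID to use for both duid and flash_context.domainUserid.

    cookie_duid: ID read from the first-party cookie, if cookies work.
    stored_duid: ID read from the Flash shared object, if one was saved.
    Both parameters and the preference order are hypothetical.
    """
    for candidate in (cookie_duid, stored_duid):
        if candidate:
            return candidate
    # Nothing stored yet: start a new ID as a random (version 4) UUID.
    return str(uuid.uuid4())


# Cookies are unavailable here, so the shared object value is reused.
domain_user_id = resolve_domain_user_id(cookie_duid=None, stored_duid="existing-id")
```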

natilivni commented 9 years ago

@fblundun, do the comments in the signature test represent the full userAgent that is used in creating that signature? For example when the code says:

    1627396823, // Firefox 27.0 Mac
    972065378, // Firefox 27.0 XP

Does that mean that the value 1627396823 should be generated by the userAgent "Firefox 27.0 Mac" ?

fblundun commented 9 years ago

That means that the value 1627396823 is the user fingerprint generated when SauceLabs runs the tests in Firefox 27.0 on Mac. The value will be influenced by which browser plugins are present. You can see the tests here: https://saucelabs.com/u/snowplow

natilivni commented 9 years ago

I see. In that case, it is still difficult for me to test the ActionScript algorithm I wrote, since my local environment is different from the one on SauceLabs. Maybe we should create a task to run the ActionScript tests in SauceLabs? @alexanderdean, what do you think?

alexanderdean commented 9 years ago

Hey @natilivni - I think it would be cool to run a subset of the ActionScript tests (ones that care about the DOM I guess) in SauceLabs. It might be quite fiddly, but definitely cool. Ticket is here:

https://github.com/snowplow/snowplow-actionscript3-tracker/issues/22

Definitely non-essential though...

natilivni commented 9 years ago

I have run the tests a few times, and I see that things are collecting in the "bad" bucket. Can you guys help me make sense of the results? Also, can you think of any other tests we can run for this new functionality?

alexanderdean commented 9 years ago

Hey @natilivni - okay so you are getting some failures like this:

{"schema":"iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-0","data":[{"test":"testMaxBuffer","mehh":["somebar","somebar"]}]}

This is an easy fix - just send through valid events rather than these dummy payloads. You need to make sure to set these properties: "e", "p", "tv".
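A quick way to catch this in a test before anything reaches the collector is sketched below in Python. It checks only the three properties named above; the helper name is made up, and a real event may well need more than these three to pass enrichment.

```python
# The three properties named above; this list is not claimed to be exhaustive.
REQUIRED_FIELDS = ("e", "p", "tv")


def missing_required_fields(event):
    """Return the required property names that are absent or empty in `event`."""
    return [name for name in REQUIRED_FIELDS if not event.get(name)]


# The dummy payload from the failing test, and a minimal valid-looking event.
dummy_event = {"test": "testMaxBuffer", "mehh": ["somebar", "somebar"]}
valid_event = {"e": "pv", "p": "pc", "tv": "as3-0.1.0"}

for candidate in (dummy_event, valid_event):
    print(missing_required_fields(candidate))
```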

More concerning is this payload:

{
  "data": [
    {
      "eid": "e500c9fc-cb53-1210-3b69-a73e6098cfdb",
      "ti_qu": "1",
      "e": "ti",
      "schema": "iglu:com.snowplowanalytics.snowplow\/contexts\/jsonschema\/1-0-0",
      "ti_sk": "no_sku",
      "ti_cu": "USD",
      "ti_id": "order-8",
      "hasScriptAccess": false,
      "p": "pc",
      "aid": "cloudfront",
      "dtm": "1421072735697",
      "tv": "as3-0.1.0",
      "data": [
        {
          "data": {
            "someContextKey": "testTrackPageView2"
          },
          "schema": "iglu:com.snowplowanalytics.snowplow\/example\/jsonschema\/1-0-0"
        }
      ],
      "tna": "AF003",
      "hasLocalStorage": true,
      "version": "MAC 11,7,700,225",
      "ti_pr": "34",
      "ti_nm": "Big Order",
      "isDebugger": true,
      "playerType": "StandAlone",
      "ti_ca": "Food",
      "stageSize": "500x375"
    }
  ],
  "schema": "iglu:com.snowplowanalytics.snowplow\/payload_data\/jsonschema\/1-0-0"
}

This has a few things wrong with it:

  1. The "schema" field following "e" and the "data" field following "tv" should be in a JSON object {} envelope, stringified, and either base64 encoded or URI-encoded. If base64-encoded, they should be assigned to a field "cx"; if URI-encoded, they should be assigned to a field "co".
  2. The fields "hasLocalStorage", "isDebugger" etc should live inside a self-describing JSON, flash_context, which in turn lives inside a stringified JSON inside the array contained within co or cx

Think of the POST payload as containing an array of events which are each identical to a GET payload, except that every &name=value pair becomes a "name": "value" property inside the JSON.
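The two points above can be sketched in Python as follows. The contexts and payload_data schema URIs are the ones that appear in this thread; the flash_context schema URI is an assumption (check Iglu Central for the published one), the helper names are made up, and URL-safe Base64 is assumed for "cx". Only the "cx" variant is shown.

```python
import base64
import json

CONTEXTS_SCHEMA = "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0"
PAYLOAD_SCHEMA = "iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-0"
# Assumed URI, for illustration only.
FLASH_CONTEXT_SCHEMA = "iglu:com.snowplowanalytics.snowplow/flash_context/jsonschema/1-0-0"


def build_event(fields, contexts):
    """Copy the flat name/value fields and pack the contexts into "cx"."""
    # Wrap the context array in the contexts envelope, then stringify it.
    envelope = json.dumps({"schema": CONTEXTS_SCHEMA, "data": contexts})
    event = dict(fields)
    # URL-safe Base64 is an assumption about the encoding expected for "cx".
    event["cx"] = base64.urlsafe_b64encode(envelope.encode("utf-8")).decode("ascii")
    return event


def build_post_payload(events):
    """Wrap a list of events in the payload_data envelope."""
    return {"schema": PAYLOAD_SCHEMA, "data": events}


# Flash-specific properties go inside the flash_context, not at the top level.
flash_context = {
    "schema": FLASH_CONTEXT_SCHEMA,
    "data": {"hasLocalStorage": True, "isDebugger": True,
             "playerType": "StandAlone", "stageSize": "500x375"},
}
event = build_event({"e": "ti", "p": "pc", "tv": "as3-0.1.0", "ti_id": "order-8"},
                    [flash_context])
payload = build_post_payload([event])

# Round trip: decoding "cx" gives back the contexts envelope.
decoded = json.loads(base64.urlsafe_b64decode(event["cx"]))
```

After this, the event itself carries only flat string properties plus "cx", and the Flash properties are recoverable by decoding "cx".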

You can see a valid POST payload here:

https://github.com/snowplow/snowplow/blob/master/3-enrich/scala-hadoop-enrich/src/test/scala/com.snowplowanalytics.snowplow.enrich.hadoop/good/CljTomcatTp2MultiEventsSpec.scala#L39

Here it is Base64-decoded:

{
  "data": [
    {
      "e": "pv",
      "tv": "py-0.5.0",
      "p": "pc",
      "dtm": "1410184736783",
      "url": "http:\/\/www.example.com",
      "cx": "eyJkYXRhIjogW3siZGF0YSI6IHsib3NUeXBlIjogIk9TWCIsICJhcHBsZUlkZnYiOiAic29tZV9hcHBsZUlkZnYiLCAib3BlbklkZmEiOiAic29tZV9JZGZhIiwgImNhcnJpZXIiOiAic29tZV9jYXJyaWVyIiwgImRldmljZU1vZGVsIjogImxhcmdlIiwgIm9zVmVyc2lvbiI6ICIzLjAuMCIsICJhcHBsZUlkZmEiOiAic29tZV9hcHBsZUlkZmEiLCAiYW5kcm9pZElkZmEiOiAic29tZV9hbmRyb2lkSWRmYSIsICJkZXZpY2VNYW51ZmFjdHVyZXIiOiAiQW1zdHJhZCJ9LCAic2NoZW1hIjogImlnbHU6Y29tLnNub3dwbG93YW5hbHl0aWNzLnNub3dwbG93L21vYmlsZV9jb250ZXh0L2pzb25zY2hlbWEvMS0wLTAifSwgeyJkYXRhIjogeyJsb25naXR1ZGUiOiAxMCwgImJlYXJpbmciOiA1MCwgInNwZWVkIjogMTYsICJhbHRpdHVkZSI6IDIwLCAiYWx0aXR1ZGVBY2N1cmFjeSI6IDAuMywgImxhdGl0dWRlTG9uZ2l0dWRlQWNjdXJhY3kiOiAwLjUsICJsYXRpdHVkZSI6IDd9LCAic2NoZW1hIjogImlnbHU6Y29tLnNub3dwbG93YW5hbHl0aWNzLnNub3dwbG93L2dlb2xvY2F0aW9uX2NvbnRleHQvanNvbnNjaGVtYS8xLTAtMCJ9XSwgInNjaGVtYSI6ICJpZ2x1OmNvbS5zbm93cGxvd2FuYWx5dGljcy5zbm93cGxvdy9jb250ZXh0cy9qc29uc2NoZW1hLzEtMC0wIn0=",
      "eid": "a91a31bc-593b-481f-869a-9d822000ee86"
    },
    {
      "se_ca": "my_category",
      "se_ac": "my_action",
      "e": "se",
      "tv": "py-0.5.0",
      "p": "pc",
      "dtm": "1410184736784",
      "cx": "eyJkYXRhIjogW3siZGF0YSI6IHsib3NUeXBlIjogIk9TWCIsICJhcHBsZUlkZnYiOiAic29tZV9hcHBsZUlkZnYiLCAib3BlbklkZmEiOiAic29tZV9JZGZhIiwgImNhcnJpZXIiOiAic29tZV9jYXJyaWVyIiwgImRldmljZU1vZGVsIjogImxhcmdlIiwgIm9zVmVyc2lvbiI6ICIzLjAuMCIsICJhcHBsZUlkZmEiOiAic29tZV9hcHBsZUlkZmEiLCAiYW5kcm9pZElkZmEiOiAic29tZV9hbmRyb2lkSWRmYSIsICJkZXZpY2VNYW51ZmFjdHVyZXIiOiAiQW1zdHJhZCJ9LCAic2NoZW1hIjogImlnbHU6Y29tLnNub3dwbG93YW5hbHl0aWNzLnNub3dwbG93L21vYmlsZV9jb250ZXh0L2pzb25zY2hlbWEvMS0wLTAifSwgeyJkYXRhIjogeyJsb25naXR1ZGUiOiAxMCwgImJlYXJpbmciOiA1MCwgInNwZWVkIjogMTYsICJhbHRpdHVkZSI6IDIwLCAiYWx0aXR1ZGVBY2N1cmFjeSI6IDAuMywgImxhdGl0dWRlTG9uZ2l0dWRlQWNjdXJhY3kiOiAwLjUsICJsYXRpdHVkZSI6IDd9LCAic2NoZW1hIjogImlnbHU6Y29tLnNub3dwbG93YW5hbHl0aWNzLnNub3dwbG93L2dlb2xvY2F0aW9uX2NvbnRleHQvanNvbnNjaGVtYS8xLTAtMCJ9XSwgInNjaGVtYSI6ICJpZ2x1OmNvbS5zbm93cGxvd2FuYWx5dGljcy5zbm93cGxvdy9jb250ZXh0cy9qc29uc2NoZW1hLzEtMC0wIn0=",
      "eid": "00da9e62-8a1c-4ffc-92a5-bc0d4fb797a9"
    },
    {
      "se_ca": "another_category",
      "se_ac": "another_action",
      "e": "se",
      "tv": "py-0.5.0",
      "p": "pc",
      "dtm": "1410184736784",
      "cx": "eyJkYXRhIjogW3siZGF0YSI6IHsib3NUeXBlIjogIk9TWCIsICJhcHBsZUlkZnYiOiAic29tZV9hcHBsZUlkZnYiLCAib3BlbklkZmEiOiAic29tZV9JZGZhIiwgImNhcnJpZXIiOiAic29tZV9jYXJyaWVyIiwgImRldmljZU1vZGVsIjogImxhcmdlIiwgIm9zVmVyc2lvbiI6ICIzLjAuMCIsICJhcHBsZUlkZmEiOiAic29tZV9hcHBsZUlkZmEiLCAiYW5kcm9pZElkZmEiOiAic29tZV9hbmRyb2lkSWRmYSIsICJkZXZpY2VNYW51ZmFjdHVyZXIiOiAiQW1zdHJhZCJ9LCAic2NoZW1hIjogImlnbHU6Y29tLnNub3dwbG93YW5hbHl0aWNzLnNub3dwbG93L21vYmlsZV9jb250ZXh0L2pzb25zY2hlbWEvMS0wLTAifSwgeyJkYXRhIjogeyJsb25naXR1ZGUiOiAxMCwgImJlYXJpbmciOiA1MCwgInNwZWVkIjogMTYsICJhbHRpdHVkZSI6IDIwLCAiYWx0aXR1ZGVBY2N1cmFjeSI6IDAuMywgImxhdGl0dWRlTG9uZ2l0dWRlQWNjdXJhY3kiOiAwLjUsICJsYXRpdHVkZSI6IDd9LCAic2NoZW1hIjogImlnbHU6Y29tLnNub3dwbG93YW5hbHl0aWNzLnNub3dwbG93L2dlb2xvY2F0aW9uX2NvbnRleHQvanNvbnNjaGVtYS8xLTAtMCJ9XSwgInNjaGVtYSI6ICJpZ2x1OmNvbS5zbm93cGxvd2FuYWx5dGljcy5zbm93cGxvdy9jb250ZXh0cy9qc29uc2NoZW1hLzEtMC0wIn0=",
      "eid": "115d56c5-ce0a-4fbe-a64f-830bc21effb8"
    }
  ],
  "schema": "iglu:com.snowplowanalytics.snowplow\/payload_data\/jsonschema\/1-0-0"
}

natilivni commented 9 years ago

Thanks @alexanderdean !

I think that those are old entries. None of the tests send the example schema any more. The entries that we were curious about are from this week or the end of last week.

alexanderdean commented 9 years ago

Ah! Sorry for the confusion. I've taken a look:

To turn the question around: what are you expecting to see in Redshift that you are missing?

alexanderdean commented 8 years ago

Closed