atom / telemetry

sends usage metrics to GitHub's internal analytics pipeline
MIT License
11 stars 12 forks source link

Unify format of timing and custom events #23

Closed rafeca closed 5 years ago

rafeca commented 5 years ago

Description of the Change

This PR removes the metadata sub-object from the timing events to make it consistent with the custom events so they are easier to process on the backend.

To illustrate, this is the current format of a custom event:

{
  "date": "2019-01-01T14:30:38.330Z",
  "eventType": "file",
  "cm1": 71,
  "el": "text.plain.null-grammar",
  "cm2": 95,
  "aiid": "stable",
  "cd2": "x64",
  "cd3": "x64",
  "t": "event",
  "vp": "1920x1080",
  "ea": "open",
  "ec": "file",
  "sr": "1920x1080"
}

And this is the current format of a timing event:

{
  "date": "2019-01-01T12:47:00.218Z",
  "durationInMilliseconds": 2118,
  "eventType": "load",
  "metadata": {
    "cd2": "x64",
    "cd3": "x64",
    "cm1": 47,
    "cm2": 71,
    "vp": "1920x1035",
    "ec": "shell",
    "aiid": "stable",
    "sr": "1920x1080"
  }
},

This change basically makes timing events share the same structure as the custom ones:

{
  "date": "2019-03-19T12:47:00.218Z",
  "durationInMilliseconds": 2118,
  "eventType": "load",
  "cd2": "x64",
  "cd3": "x64",
  "cm1": 47,
  "cm2": 71,
  "vp": "1920x1035",
  "ec": "shell",
  "aiid": "stable",
  "sr": "1920x1080"
},

Alternate Designs

We could have changed the custom events format (it's nice to have a metadata object with a free format and keep the top object with a fixed format), but since custom events are widely used already this change would mess up with the current metrics that we're logging.

Benefits

Possible Drawbacks

Changing the format of events is not good, since this means that we'll either have to support both formats (for old and new app versions) or we're going to not be able to process events from old apps.

Good thing is that we were not handling timing events until this same week, so if there's a good moment to do this change it is now 😄.

Also, so far we only have two types of timing events, but on next Atom version we're planning to have a few more, so the earlier that we do this change the better.

Applicable Issues

N/A

/cc @telliott27

telliott27 commented 5 years ago

Do you anticipate adding additional metadata later? One benefit of keeping the metadata object the way it is is I can convert that to a MAP object in Airflow DAG which will make querying it faster than raw JSON but also allow you to add additional elements to the metadata object without needing to update the Airflow DAG and underlying table.

rafeca commented 5 years ago

Do you anticipate adding additional metadata later? One benefit of keeping the metadata object the way it is is I can convert that to a MAP object in Airflow DAG which will make querying it faster than raw JSON but also allow you to add additional elements to the metadata object without needing to update the Airflow DAG and underlying table.

Good point! Yes we'll probably want to add additional metadata... And I agree it's a good idea to use MAP objects instead of creating new columns every time we want to add a new field...

I can leave the timings events then as they are at the moment, and then we can think about whether we ever want to migrate the custom_events to have their fields inside a metadata field like timings... sounds good?

rafeca commented 5 years ago

I'm gonna close this PR without merging and we'll keep the metadata object, which makes things easier to extend