edgee-cloud / edgee

The full-stack edge platform for your edge oriented applications.
https://www.edgee.cloud
Apache License 2.0

feat: New data layer, to collect all types of events from the edge #100

Closed SachaMorard closed 1 month ago

SachaMorard commented 1 month ago

Checklist

Description of Changes

Preparing edgee v0.4.0

The __EDGEE_CONTEXT__ script tag becomes __EDGEE_DATA_LAYER__, moving towards standard naming. At the same time, the interface has changed radically to give developers greater control over what is captured at the edge.

Here is an example of a full and complex data layer:

<script id="__EDGEE_DATA_LAYER__" type="application/json">
  {
    "data_collection": {
      "events": [
        {
          "type": "page",
          "data": {
            "name": "With Edgee",
            "category": "demo",
            "title": "With Edgee",
            "url": "https://demo.edgee.app/with-edgee.html",
            "path": "/with-edgee.html",
            "search": "?ok",
            "keywords": [
              "demo",
              "tag manager",
              "edgee"
            ],
            "properties": {
              "section": "political",
              "pv": 1
            }
          },
          "components": {
            "all": true,
            "google_analytics": true,
            "amplitude": true,
            "facebook_capi": false
          }
        },
        {
          "type": "track",
          "data": {
            "name": "button click",
            "properties": {
              "color": "blue",
              "category": "test",
              "label": "button click"
            }
          },
          "components": {
            "all": true,
            "google_analytics": true,
            "amplitude": true,
            "facebook_capi": true
          }
        },
        {
          "type": "user",
          "data": {
            "user_id": "12345",
            "anonymous_id": "12345",
            "properties": {
              "email": "me@example.com",
              "name": "John Doe"
            }
          }
        }
      ],
      "components": {
        "all": true,
        "google_analytics": true,
        "amplitude": true,
        "facebook_capi": false
      },
      "context": {
        "page": {
          "name": "With Edgee",
          "category": "demo",
          "title": "With Edgee",
          "url": "https://demo.edgee.app/with-edgee.html",
          "path": "/with-edgee.html",
          "search": "?ok",
          "keywords": [
            "demo",
            "tag manager",
            "edgee"
          ],
          "properties": {
            "section": "political",
            "pv": 1
          }
        },
        "user": {
          "user_id": "12345",
          "anonymous_id": "12345",
          "properties": {
            "email": "me@example.com",
            "name": "John Doe"
          }
        }
      }
    }
  }
</script>
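For readers who want to experiment with the tag outside of Edgee, here is a minimal sketch of how the JSON payload could be pulled out of an HTML page. It is not Edgee's implementation: the `DataLayerExtractor` class and `extract_data_layer` function are hypothetical names, and the only detail taken from this PR is the `__EDGEE_DATA_LAYER__` script id.

```python
import json
from html.parser import HTMLParser


class DataLayerExtractor(HTMLParser):
    """Collects the text inside <script id="__EDGEE_DATA_LAYER__"> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "script" and dict(attrs).get("id") == "__EDGEE_DATA_LAYER__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it
        if self._inside:
            self._chunks.append(data)

    @property
    def raw_json(self):
        return "".join(self._chunks)


def extract_data_layer(html):
    """Return the parsed data layer as a dict, or None if the tag is absent or empty."""
    parser = DataLayerExtractor()
    parser.feed(html)
    parser.close()
    raw = parser.raw_json.strip()
    return json.loads(raw) if raw else None


page = """<html><head>
<script id="__EDGEE_DATA_LAYER__" type="application/json">
  {"data_collection": {"events": [{"type": "page", "data": {"name": "With Edgee"}}]}}
</script>
</head></html>"""

layer = extract_data_layer(page)
```

With the sample page above, `layer["data_collection"]["events"][0]["type"]` is `"page"`, and a page without the tag yields `None`.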

In the data layer, a first data_collection node is available:

Each event can have the following fields:

If data_collection.context is set, Edgee will use it to populate the event.data and event.context fields. If data_collection.components is set, Edgee will use it to populate the event.components field.
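The fallback behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `resolve_events` is a hypothetical name, and it assumes that values set on an event take precedence over the shared ones and that a `page` or `user` event without `data` falls back to the matching `context` node.

```python
import copy


def resolve_events(data_collection):
    """Return events with missing data/context/components filled from the shared nodes."""
    context = data_collection.get("context", {})
    default_components = data_collection.get("components")
    resolved = []
    for event in data_collection.get("events", []):
        event = copy.deepcopy(event)  # never mutate the caller's payload
        # Assumption: a "page" or "user" event without data uses context.page / context.user
        if "data" not in event and event.get("type") in context:
            event["data"] = copy.deepcopy(context[event["type"]])
        # The shared context is copied into every event that does not carry its own
        if context and "context" not in event:
            event["context"] = copy.deepcopy(context)
        # Assumption: per-event components win over data_collection.components
        if default_components is not None and "components" not in event:
            event["components"] = dict(default_components)
        resolved.append(event)
    return resolved


data_collection = {
    "events": [
        {"type": "page"},
        {"type": "track", "data": {"name": "button click"}, "components": {"all": False}},
    ],
    "components": {"all": True},
    "context": {"page": {"name": "With Edgee"}, "user": {"user_id": "12345"}},
}

events = resolve_events(data_collection)
```

Here the page event inherits its `data` from `context.page` and its `components` from the shared node, while the track event keeps its own `components` and only receives the shared `context`.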

IMPORTANT OTHER CHANGES:

Related Issues

No issue

alexcasalboni commented 1 month ago

@SachaMorard a couple of doubts:

  1. what is the purpose of data_collection.events[].context? If something is relevant only for a specific event, then it should probably be used inside its data field, right? I cannot think of a case where we need single-event context
  2. If I define a user event, then is that also used as context for the other events, as if it was defined in data_collection.context.user? In other words, when should a customer trigger a user event vs using the user context?

SachaMorard commented 1 month ago

@alexcasalboni my answers below:

  1. what is the purpose of data_collection.events[].context? If something is relevant only for a specific event, then it should probably be used inside its data field, right? I cannot think of a case where we need single-event context

You're right, I don't see a case where it would be useful. But behind the scenes, we need to copy this context into each event to facilitate event processing. So it's there and accessible, but not useful for the developer. You can safely leave it out of the documentation.

  2. If I define a user event, then is that also used as context for the other events, as if it was defined in data_collection.context.user? In other words, when should a customer trigger a user event vs using the user context?

No, it is not. A developer has to declare a user type event when they want to send a dedicated user event, to store something specific about the user (a name change, for example). The context.user has to be used when a developer wants to attach any other event to a user. Normally, analytics technologies can assign all events to a user, as long as a “user” event is sent at some point.
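To make the distinction concrete, here is a hypothetical pair of payloads written as Python dicts. The field names come from the example in the PR description; the variable names and the specific values (such as the changed name) are made up for illustration.

```python
import json

# Case 1: an ordinary event attached to a known user through the shared context.
# No "user" event is emitted; the track event simply carries the user identity.
attach_event_to_user = {
    "data_collection": {
        "events": [{"type": "track", "data": {"name": "button click"}}],
        "context": {"user": {"user_id": "12345"}},
    }
}

# Case 2: a dedicated "user" event, sent because something about the user changed.
dedicated_user_event = {
    "data_collection": {
        "events": [
            {
                "type": "user",
                "data": {"user_id": "12345", "properties": {"name": "Jane Doe"}},
            }
        ]
    }
}

# Either dict can be serialized into the __EDGEE_DATA_LAYER__ script tag body
case_1_json = json.dumps(attach_event_to_user, indent=2)
case_2_json = json.dumps(dedicated_user_event, indent=2)
```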

alexcasalboni commented 1 month ago

I would suggest we define some kind of official JSON Schema for the new Data Layer.

This would allow us to validate schemas automatically and even provide customers with a schema validator (like somewhere in the docs).

Here's an example:

data-layer-json-schema.json

And a browser-based validator: https://www.jsonschemavalidator.net/s/EplejUWa
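Until a schema is agreed on, the kind of check it would automate can be approximated by hand. The sketch below is not the linked schema and does not use a JSON Schema library; it is a standard-library-only structural check whose rules (three event types, boolean component flags, object-valued `data`) are assumptions inferred from the example in the PR description. `validate_data_layer` and `_check_components` are hypothetical names.

```python
EVENT_TYPES = ("page", "track", "user")  # the three types used in the PR example


def _check_components(components, path):
    """Every component flag is assumed to be a boolean."""
    if not isinstance(components, dict):
        return [f"{path} must be an object"]
    return [
        f"{path}.{name} must be a boolean"
        for name, value in components.items()
        if not isinstance(value, bool)
    ]


def validate_data_layer(layer):
    """Return a list of human-readable problems; an empty list means the check passed."""
    if not isinstance(layer, dict):
        return ["data layer must be a JSON object"]
    dc = layer.get("data_collection")
    if not isinstance(dc, dict):
        return ["data_collection must be an object"]
    events = dc.get("events", [])
    if not isinstance(events, list):
        return ["data_collection.events must be an array"]

    errors = []
    for i, event in enumerate(events):
        path = f"data_collection.events[{i}]"
        if not isinstance(event, dict):
            errors.append(f"{path} must be an object")
            continue
        if event.get("type") not in EVENT_TYPES:
            errors.append(f"{path}.type must be one of {', '.join(EVENT_TYPES)}")
        if not isinstance(event.get("data", {}), dict):
            errors.append(f"{path}.data must be an object")
        errors.extend(_check_components(event.get("components", {}), f"{path}.components"))
    errors.extend(_check_components(dc.get("components", {}), "data_collection.components"))
    return errors


valid = {
    "data_collection": {
        "events": [
            {"type": "track", "data": {"name": "button click"}, "components": {"all": True}}
        ]
    }
}
invalid = {"data_collection": {"events": [{"type": "click", "components": {"all": "yes"}}]}}

valid_errors = validate_data_layer(valid)
invalid_errors = validate_data_layer(invalid)
```

Returning a list of messages rather than raising on the first problem makes it easier to show a customer everything that is wrong with a payload in one pass.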

alexcasalboni commented 1 month ago

Also, I know we're rushing a bit, but this would be a great opportunity to add a few unit tests to make sure the parsing/validating/merging is working as expected, and that we're correctly handling what should happen in case of invalid JSON structures, etc. 😄
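As a starting point, such tests could look like the sketch below. It is written against a hypothetical `parse_data_layer` function, not the project's real parser, and it assumes that unusable input should yield `None` rather than an error; the actual behaviour for invalid structures is still to be decided in this thread.

```python
import json
import unittest


def parse_data_layer(raw):
    """Hypothetical parser: return the data_collection dict, or None if input is unusable."""
    try:
        layer = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None
    if not isinstance(layer, dict):
        return None
    data_collection = layer.get("data_collection")
    return data_collection if isinstance(data_collection, dict) else None


class ParseDataLayerTests(unittest.TestCase):
    def test_valid_payload_is_returned(self):
        raw = '{"data_collection": {"events": [{"type": "page"}]}}'
        self.assertEqual(parse_data_layer(raw), {"events": [{"type": "page"}]})

    def test_malformed_json_is_ignored(self):
        # Truncated JSON, as could happen with a broken template
        self.assertIsNone(parse_data_layer('{"data_collection": '))

    def test_non_object_top_level_is_ignored(self):
        self.assertIsNone(parse_data_layer("[]"))

    def test_non_object_data_collection_is_ignored(self):
        self.assertIsNone(parse_data_layer('{"data_collection": "oops"}'))

    def test_missing_input_is_ignored(self):
        self.assertIsNone(parse_data_layer(None))
```

The suite can be run with `python -m unittest` once the block is saved to a file.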