rudderlabs / community-user-transformations

MIT License

Properties Schema Validation #14

Open ifoukarakis opened 1 year ago

ifoukarakis commented 1 year ago

Contact Details

ioannis.foukarakis@gmail.com

Language

Javascript

Category

Data Security & Governance

Description

Properties Schema Validation

Description

Asserts the event's properties follow the constraints of a JSON schema.

One of the most common challenges when gathering data is getting all involved stakeholders to agree on the format, structure, and semantics of the data. One popular solution is to apply "data contracts" to ensure that different systems or components communicate effectively and accurately. This transformation uses the JSON Schema specification to assert that an event's properties obey the agreed constraints.

Events that fail to comply with the agreed constraints are currently dropped, but the logic can be modified to re-route them to a DLQ-like (dead-letter queue) destination.
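The re-routing idea could look roughly like the sketch below. Instead of filtering invalid events out, it tags them so that whatever consumes them downstream can separate them. The helper name `tagInvalidEvents`, the `contract_violation` property, and the `hasProductId` predicate are hypothetical and not part of this transformation's code; in the transformation itself, the predicate would be `event => contracts.validateProperties(event)`.

```javascript
// Hypothetical helper: keep every event, but mark the ones that fail validation
// so a DLQ-like destination can pick them up. All names are illustrative only.
function tagInvalidEvents(events, isValid) {
    return events.map(event => {
        if (isValid(event)) return event;
        // Copy the event rather than mutating the input.
        return {
            ...event,
            properties: { ...event.properties, contract_violation: true }
        };
    });
}

// Stand-in predicate for this example: require a string product_id.
const hasProductId = event =>
    typeof (event.properties || {}).product_id === "string";

const tagged = tagInvalidEvents(
    [
        { event: "Add To Cart", properties: { product_id: "SKU-10001" } },
        { event: "Add To Cart", properties: { name: "The Two Towers" } }
    ],
    hasProductId
);

console.log(tagged.map(e => e.properties.contract_violation === true));
// prints [ false, true ]
```

Tagging keeps the decision about what to do with invalid events outside the validation step, at the cost of sending more events downstream.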

Deployment

  1. Add the contents of jsonschema.js to a library (instructions). IMPORTANT: Make sure that the library is named jsonschema.
  2. Add the code from the code block to a new JavaScript transformation (instructions).
  3. Make sure that the configuration matches your expectations (see following section).
  4. Connect the new transformation to a destination (instructions).

Notes/Troubleshooting

Configuration

Register schemas for events

Schemas can be loaded either from URLs or from JSON objects. For example, the following validation function registers two schemas for two events from two different URLs:

export async function transformBatch(events, metadata) {
    // Create a registry for schemas
    const contracts = new Contracts();
    // Register schema for event "Add To Cart" from a URL
    await contracts.registerSchemaFromURL("Add To Cart", "https://raw.githubusercontent.com/ifoukarakis/tests/main/product.json");
    // Register a different schema for event "User Registered" from a URL
    await contracts.registerSchemaFromURL("User Registered", "https://raw.githubusercontent.com/ifoukarakis/tests/main/person.json");
    // Register more events here.

    return events.filter(event => contracts.validateProperties(event))
}

In the following example, a single schema is registered from a JSON Object:

const productSchema = {
    "$id": "https://example.com/product.schema.json",
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "Product",
    "type": "object",
    "properties": {
        "product_id": {
            "type": "string",
            "description": "The product's ID."
        },
        "name": {
            "type": "string",
            "description": "The product's name."
        },
        "price": {
            "type": "string",
            "pattern": "^(0|([1-9]+[0-9]*))(\\.[0-9]{1,2})?$",
            "minLength": 1,
            "description": "The product's price.",
            "examples": [
                "0",
                "0.00",
                "0.05",
                "19.95",
                "255.5",
                "120000"
            ]
        }
    },
    "required": ["product_id", "name", "price"]
}

export async function transformBatch(events, metadata) {
    const contracts = new Contracts(true);
    await contracts.registerSchemaFromJSON("Add To Cart", productSchema);

    // Register more events here.
    return events.filter(event => contracts.validateProperties(event))
}

Note: embedding the JSON schemas in the transformation's code should help improve performance, but might reduce the readability of the code. Another approach would be to move the schemas into a separate library file.
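One way to keep the transformation readable while still embedding schemas is to collect them in a single map keyed by event name and register them in a loop. The sketch below assumes such a map could live in its own library file. `schemasByEvent`, `registerAll`, and the abbreviated schemas (including the `email` property) are hypothetical; the `contracts` argument only needs a `registerSchemaFromJSON(event, schema)` method like the one in the code block.

```javascript
// Hypothetical map of event name -> JSON schema (schemas abbreviated for brevity).
const schemasByEvent = {
    "Add To Cart": {
        type: "object",
        properties: { product_id: { type: "string" } },
        required: ["product_id"]
    },
    "User Registered": {
        type: "object",
        properties: { email: { type: "string" } },
        required: ["email"]
    }
};

// Register every schema in the map on a Contracts-like object.
async function registerAll(contracts, schemas) {
    for (const [eventName, schema] of Object.entries(schemas)) {
        await contracts.registerSchemaFromJSON(eventName, schema);
    }
    return contracts;
}
```

Inside `transformBatch`, this would be called as `await registerAll(new Contracts(), schemasByEvent);` before filtering the events.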

Drop events in case there's no schema registered

Simply pass false as the argument to the Contracts constructor:

const contracts = new Contracts(false);
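To make the effect of the flag concrete, the following self-contained sketch mirrors the fallback logic of `validateProperties` from the code block, with plain predicate functions standing in for registered schemas. `keepEvent` and the `validators` map are hypothetical names used only for illustration.

```javascript
// Mirrors the fallback in Contracts.validateProperties: use the registered
// validator if there is one, otherwise fall back to the flag.
function keepEvent(validators, allowUnregisteredEvents, event) {
    const validate = validators[event.event];
    if (validate) return validate(event.properties);
    return allowUnregisteredEvents;
}

// Stand-in validator: only "Add To Cart" is registered.
const validators = {
    "Add To Cart": props => typeof props.product_id === "string"
};

const unregistered = {
    event: "Product clicked",
    properties: { product_id: "SKU-10001" }
};

console.log(keepEvent(validators, true, unregistered));  // prints true (event kept)
console.log(keepEvent(validators, false, unregistered)); // prints false (event dropped)
```

Events that do have a registered validator are unaffected by the flag; only the fallback changes.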

Developing

A sample GitHub project for writing transformations is available at https://github.com/ifoukarakis/rudderstack-transformations.

Code Block

import { Schema } from 'jsonschema';

/*
Class responsible for managing contracts.
*/
export class Contracts {
    /**
     * Create a new contracts instance.
     * 
     * @param {Boolean} allowUnregisteredEvents whether to allow unknown events or not.
     */
    constructor(allowUnregisteredEvents=true) {
        this.schemas = {};
        this.allowUnregisteredEvents = allowUnregisteredEvents;
    }

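    /** Register a schema for an event from a JSON object. */
    async registerSchemaFromJSON(event, schema) {
        this.schemas[event] = new Schema(schema);
    }

    /**
     * Register a schema for an event by downloading it from a URL.
     * Assumes the runtime's fetch resolves to the parsed JSON body.
     */
    async registerSchemaFromURL(event, url) {
        const response = await fetch(url);
        this.schemas[event] = new Schema(response);
    }

    /** Check an event's properties against the schema registered for its name. */
    validateProperties(event) {
        const schema = this.schemas[event.event];
        if (schema) return schema.validate(event.properties);

        // No schema registered for this event: fall back to the configured default.
        return this.allowUnregisteredEvents;
    }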
}

const productSchema = {
    "$id": "https://example.com/product.schema.json",
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "Product",
    "type": "object",
    "properties": {
        "product_id": {
            "type": "string",
            "description": "The product's ID."
        },
        "name": {
            "type": "string",
            "description": "The product's name."
        },
        "price": {
            "type": "string",
            "pattern": "^(0|([1-9]+[0-9]*))(\\.[0-9]{1,2})?$",
            "minLength": 1,
            "description": "The product's price.",
            "examples": [
                "0",
                "0.00",
                "0.05",
                "19.95",
                "255.5",
                "120000"
            ]
        }
    },
    "required": ["product_id", "name", "price"]
}

export async function transformBatch(events, metadata) {  // eslint-disable-line no-unused-vars
    // Replace following line with const contracts = new Contracts(false); if you want to consider unregistered events as invalid.
    const contracts = new Contracts(true);
    // Register events here
    await contracts.registerSchemaFromJSON("Add To Cart", productSchema);

    // Register more events here.
    return events.filter(event => contracts.validateProperties(event))
}

Input Payload for testing

Multiple test cases follow. See the `_comment` field in each event for details.

[
  {
    "_comment": "valid event",
    "anonymousId": "8d872292709c6fbe",
    "channel": "mobile",
    "context": {
      "app": {
        "build": "1",
        "name": "AMTestProject",
        "namespace": "com.rudderstack.android.rudderstack.sampleAndroidApp",
        "version": "1.0"
      },
      "device": {
        "id": "8d872292709c6fbe",
        "manufacturer": "Google",
        "model": "AOSPonIAEmulator",
        "name": "generic_x86_arm",
        "type": "android"
      },
      "library": {
        "name": "com.rudderstack.android.sdk.core",
        "version": "1.0.2"
      },
      "locale": "en-US",
      "network": {
        "carrier": "Android",
        "bluetooth": false,
        "cellular": true,
        "wifi": true
      },
      "os": {
        "name": "Android",
        "version": "9"
      },
      "screen": {
        "density": 420,
        "height": 1794,
        "width": 1080
      },
      "timezone": "Asia/Kolkata",
      "traits": {
        "address": {
          "city": "Kolkata",
          "country": "India",
          "postalcode": "700096",
          "state": "West bengal",
          "street": "Park Street"
        },
        "age": "30",
        "anonymousId": "8d872292709c6fbe",
        "birthday": "2020-05-26",
        "createdat": "18th March 2020",
        "description": "Premium User for 3 years",
        "email": "identify@test.com",
        "firstname": "John",
        "userId": "sample_user_id",
        "lastname": "Sparrow",
        "name": "John Sparrow",
        "id": "sample_user_id",
        "phone": "9112340345",
        "username": "john_sparrow"
      },
      "userAgent": "Dalvik/2.1.0 (Linux; U; Android 9; AOSP on IA Emulator Build/PSR1.180720.117)"
    },
    "event": "Add To Cart",
    "integrations": {
      "All": true
    },
    "messageId": "1590431830915-73bed370-5889-436d-9a9e-0c0e0c809d06",
    "properties": {
      "product_id": "SKU-10001",
      "name": "The Fellowship Of The Ring",
      "price": "58.00"
    },
    "originalTimestamp": "2020-05-25T18:37:10.917Z",
    "type": "track",
    "userId": "sample_user_id"
  },
  {
    "_comment": "invalid price field - 3 decimal places",
    "anonymousId": "8d872292709c6fbe",
    "channel": "mobile",
    "context": {
      "app": {
        "build": "1",
        "name": "AMTestProject",
        "namespace": "com.rudderstack.android.rudderstack.sampleAndroidApp",
        "version": "1.0"
      },
      "device": {
        "id": "8d872292709c6fbe",
        "manufacturer": "Google",
        "model": "AOSPonIAEmulator",
        "name": "generic_x86_arm",
        "type": "android"
      },
      "library": {
        "name": "com.rudderstack.android.sdk.core",
        "version": "1.0.2"
      },
      "locale": "en-US",
      "network": {
        "carrier": "Android",
        "bluetooth": false,
        "cellular": true,
        "wifi": true
      },
      "os": {
        "name": "Android",
        "version": "9"
      },
      "screen": {
        "density": 420,
        "height": 1794,
        "width": 1080
      },
      "timezone": "Asia/Kolkata",
      "traits": {
        "address": {
          "city": "Kolkata",
          "country": "India",
          "postalcode": "700096",
          "state": "West bengal",
          "street": "Park Street"
        },
        "age": "30",
        "anonymousId": "8d872292709c6fbe",
        "birthday": "2020-05-26",
        "createdat": "18th March 2020",
        "description": "Premium User for 3 years",
        "email": "identify@test.com",
        "firstname": "John",
        "userId": "sample_user_id",
        "lastname": "Sparrow",
        "name": "John Sparrow",
        "id": "sample_user_id",
        "phone": "9112340345",
        "username": "john_sparrow"
      },
      "userAgent": "Dalvik/2.1.0 (Linux; U; Android 9; AOSP on IA Emulator Build/PSR1.180720.117)"
    },
    "event": "Add To Cart",
    "integrations": {
      "All": true
    },
    "messageId": "1590431830915-73bed370-5889-436d-9a9e-0c0e0c809d06",
    "properties": {
      "product_id": "SKU-10001",
      "name": "The Fellowship Of The Ring",
      "price": "58.202"
    },
    "originalTimestamp": "2020-05-25T18:37:10.917Z",
    "type": "track",
    "userId": "sample_user_id"
  },
  {
    "_comment": "missing product_id property",
    "anonymousId": "8d872292709c6fbe",
    "channel": "mobile",
    "context": {
      "app": {
        "build": "1",
        "name": "AMTestProject",
        "namespace": "com.rudderstack.android.rudderstack.sampleAndroidApp",
        "version": "1.0"
      },
      "device": {
        "id": "8d872292709c6fbe",
        "manufacturer": "Google",
        "model": "AOSPonIAEmulator",
        "name": "generic_x86_arm",
        "type": "android"
      },
      "library": {
        "name": "com.rudderstack.android.sdk.core",
        "version": "1.0.2"
      },
      "locale": "en-US",
      "network": {
        "carrier": "Android",
        "bluetooth": false,
        "cellular": true,
        "wifi": true
      },
      "os": {
        "name": "Android",
        "version": "9"
      },
      "screen": {
        "density": 420,
        "height": 1794,
        "width": 1080
      },
      "timezone": "Asia/Kolkata",
      "traits": {
        "anonymousId": "8d872292709c6fbe"
      },
      "userAgent": "Dalvik/2.1.0 (Linux; U; Android 9; AOSP on IA Emulator Build/PSR1.180720.117)"
    },
    "event": "Add To Cart",
    "integrations": {
      "All": true
    },
    "messageId": "1590431830915-73bed370-5889-436d-9a9e-0c0e0c809d06",
    "properties": {
      "name": "The Two Towers",
      "price": "45.00"
    },
    "originalTimestamp": "2020-05-25T18:37:10.917Z",
    "type": "track",
    "userId": "sample_user_id"
  },
  {
    "_comment": "event with no registered schema",
    "anonymousId": "8d872292709c6fbe",
    "channel": "mobile",
    "context": {
      "app": {
        "build": "1",
        "name": "AMTestProject",
        "namespace": "com.rudderstack.android.rudderstack.sampleAndroidApp",
        "version": "1.0"
      },
      "device": {
        "id": "8d872292709c6fbe",
        "manufacturer": "Google",
        "model": "AOSPonIAEmulator",
        "name": "generic_x86_arm",
        "type": "android"
      },
      "library": {
        "name": "com.rudderstack.android.sdk.core",
        "version": "1.0.2"
      },
      "locale": "en-US",
      "network": {
        "carrier": "Android",
        "bluetooth": false,
        "cellular": true,
        "wifi": true
      },
      "os": {
        "name": "Android",
        "version": "9"
      },
      "screen": {
        "density": 420,
        "height": 1794,
        "width": 1080
      },
      "timezone": "Asia/Kolkata",
      "traits": {
        "address": {
          "city": "Kolkata",
          "country": "India",
          "postalcode": "700096",
          "state": "West bengal",
          "street": "Park Street"
        },
        "age": "30",
        "anonymousId": "8d872292709c6fbe",
        "birthday": "2020-05-26",
        "createdat": "18th March 2020",
        "description": "Premium User for 3 years",
        "email": "identify@test.com",
        "firstname": "John",
        "userId": "sample_user_id",
        "lastname": "Sparrow",
        "name": "John Sparrow",
        "id": "sample_user_id",
        "phone": "9112340345",
        "username": "john_sparrow"
      },
      "userAgent": "Dalvik/2.1.0 (Linux; U; Android 9; AOSP on IA Emulator Build/PSR1.180720.117)"
    },
    "event": "Product clicked",
    "integrations": {
      "All": true
    },
    "messageId": "1590431830915-73bed370-5889-436d-9a9e-0c0e0c809d06",
    "properties": {
      "product_id": "SKU-10001"
    },
    "originalTimestamp": "2020-05-25T18:37:10.917Z",
    "type": "track",
    "userId": "sample_user_id"
  }
]
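As a rough cross-check of the expected outcomes, the sketch below applies the `price` pattern and the `required` list from `productSchema` to the `properties` of the four test events by hand. It is not the jsonschema library's validation, only an approximation of the two constraints these test cases exercise. `checkAddToCart`, `testEvents`, and `expectedKept` are hypothetical names, and unregistered events are assumed to be allowed (`new Contracts(true)`).

```javascript
// Pattern and required list copied from productSchema.
const pricePattern = new RegExp("^(0|([1-9]+[0-9]*))(\\.[0-9]{1,2})?$");
const required = ["product_id", "name", "price"];

// Approximation: every required property must be a string and the price
// must match the pattern. The real schema validator does more than this.
function checkAddToCart(props) {
    const allPresent = required.every(key => typeof props[key] === "string");
    return allPresent && pricePattern.test(props.price);
}

// Only the fields relevant to validation, taken from the four test events.
const testEvents = [
    { event: "Add To Cart", properties: { product_id: "SKU-10001", name: "The Fellowship Of The Ring", price: "58.00" } },
    { event: "Add To Cart", properties: { product_id: "SKU-10001", name: "The Fellowship Of The Ring", price: "58.202" } },
    { event: "Add To Cart", properties: { name: "The Two Towers", price: "45.00" } },
    { event: "Product clicked", properties: { product_id: "SKU-10001" } }
];

// Events without a registered schema are kept, as with new Contracts(true).
const expectedKept = testEvents.map(e =>
    e.event === "Add To Cart" ? checkAddToCart(e.properties) : true
);

console.log(expectedKept); // prints [ true, false, false, true ]
```

With `new Contracts(false)`, the last entry would be expected to flip to `false`, since "Product clicked" has no registered schema.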

License

gitcommitshow commented 1 year ago

Thank you for contributing to RudderStack Transformations. Your submission will be reviewed soon. Do follow the transformations-challenge channel on the RudderStack Slack community for updates on the challenge.