snowplow / snowbridge

For replicating streams across clouds, accounts and regions

400 Bad Request while sending data to Google Tag Manager #294

Closed nhammad closed 11 months ago

nhammad commented 11 months ago

I have set up server-side GTM following this guide: https://aws-solutions-library-samples.github.io/advertising-marketing/using-google-tag-manager-for-server-side-website-analytics-on-aws.html

Now I have two links to access my GTM server:

Primary Server-Side Container (analytics.example.com)

Preview Server Container (preview-analytics.example.com)

I am using these parameters in the aws_ecs_task_definition setup:

PreviewContainer:

  container_definitions    = <<TASK_DEFINITION
  [
  {
    "name": "preview",
    "image": "gcr.io/cloud-tagging-10302018/gtm-cloud-image",
    "environment": [
      {
        "name": "PORT",
        "value": "80"
      },
      {
        "name": "RUN_AS_PREVIEW_SERVER",
        "value": "true"
      },
      {
        "name": "CONTAINER_CONFIG",
        "value": "${var.CONTAINER_CONFIG}"
      }
    ],

PrimaryContainer:

{
    "name": "primary",
    "image": "gcr.io/cloud-tagging-10302018/gtm-cloud-image",
    "environment": [
      {
        "name": "PORT",
        "value": "80"
      },
      {
        "name": "PREVIEW_SERVER_URL",
        "value": "${var.PREVIEW_SERVER_URL}"
      },
      {
        "name": "CONTAINER_CONFIG",
        "value": "${var.CONTAINER_CONFIG}"
      }
    ]

I am trying to send data to GTM using Snowbridge, which runs in a Docker container on an EC2 instance. Snowbridge reads data from a Kinesis data stream and forwards it to GTM.

https://docs.snowplow.io/docs/destinations/forwarding-events/snowbridge/configuration/targets/http/google-tag-manager/

config.hcl.tmpl

source {
  use "kinesis" {
    stream_name = "${stream_name}"
    region      = "${region}"
    app_name    = "${app_name}"

    role_arn = "${role_arn}"
    read_throttle_delay_ms = 500

    # Maximum concurrent goroutines (lightweight threads) for message processing (default: 50)
    concurrent_writes = 50
  }
}

target {
  use "http" {
    url                        = "https://analytics.xx/com.snowplowanalytics.snowplow/enriched"
    request_timeout_in_seconds = 60
    content_type               = "application/json"

    # this line is optional, in case you want to send events to GTM Preview Mode
    headers                    = "{\"x-gtm-server-preview\": \"AAAAAAAXXXX==\"}"
  }
}

transform {
  use "spEnrichedToJson" {}
}

This works as expected, and I can see incoming data in Preview Mode. As I understand it, data is still sent to the primary (non-preview) container; it is just additionally forwarded to the preview server when this header is set.

As the next step, I want to remove the Preview Mode option. For this, I removed the following environment variable:

      {
        "name": "RUN_AS_PREVIEW_SERVER",
        "value": "true"
      },

Up to this point, Snowbridge was still sending data. However, as soon as I remove the optional header (i.e., x-gtm-server-preview) from the config.hcl.tmpl file, I get errors in CloudWatch while sending data:

level=warning msg="Retrying func (attempts: 2): target.Write: Error sending http requests: 1 error occurred:\n\t* Got response status: 400 Bad Request\n\n"
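For reference, the target block with the header removed would look roughly like the sketch below. It is derived from the config shown earlier, keeps the same placeholder URL, and is not copied from the running deployment.

```hcl
target {
  use "http" {
    # Same placeholder URL as in the config above
    url                        = "https://analytics.xx/com.snowplowanalytics.snowplow/enriched"
    request_timeout_in_seconds = 60
    content_type               = "application/json"

    # The optional x-gtm-server-preview header line is omitted here,
    # so requests reach the primary container without a preview header.
  }
}
```

The only difference from the working config is the missing headers line; the source and transform blocks are unchanged.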

I also tried changing that environment variable for the PreviewContainer to false.

It appears that Snowbridge can send data when Preview Mode is enabled (i.e., the header is set) but not otherwise. What could be the possible reasons for this? The header is meant to be optional.