An end-to-end solution to collect, ingest, analyze, and visualize clickstream data inside your web and mobile applications.
This solution collects, ingests, analyzes, and visualizes clickstream events from your websites and mobile applications. Clickstream data is critical for online business analytics use cases such as user behavior analysis, customer data platforms, and marketing analysis. It reveals patterns in how users interact with a website or application, helping businesses understand user navigation, preferences, and engagement levels so they can drive product innovation and optimize marketing investments.
With this solution, you can quickly configure and deploy a data pipeline that fits your business and technical needs. It provides purpose-built software development kits (SDKs) that automatically collect common events and easy-to-use APIs to report custom events, enabling you to easily send your customers’ clickstream data to the data pipeline in your AWS account. The solution also offers pre-assembled dashboards that visualize key metrics about user lifecycle, including acquisition, engagement, activity, and retention, and adds visibility into user devices and geographies. You can combine user behavior data with business backend data to create a comprehensive data platform and generate insights that drive business growth.
For more information, refer to the solution documentation.
Clickstream Analytics on AWS provides client-side SDKs that make it easier to report events to the data pipeline created by the solution. Currently, the solution supports the following platforms:
See the SDK samples repository for samples of each kind of SDK.
Follow the implementation guide to deploy the solution using an AWS CloudFormation template.
npm install -g pnpm@8.15.3
pnpm install && pnpm projen && pnpm nx build @aws/clickstream-base-lib
npx cdk bootstrap
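The install and bootstrap commands above assume that `npm`, `pnpm`, and `npx` are already on your PATH. The following is a minimal pre-flight sketch, not part of the repository; the function name `missing_tools` is hypothetical and the list of tools is inferred only from the commands shown above.

```shell
# Hypothetical helper (not shipped with the solution).
# Prints the name of every tool passed as an argument that is not found on PATH,
# one per line. Prints nothing when all tools are available.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example usage: stop early if anything needed by the commands above is missing.
#   missing="$(missing_tools npm pnpm npx)"
#   [ -z "$missing" ] || { echo "Missing tools: $missing" >&2; exit 1; }
```

Checking up front avoids a half-finished `pnpm install && pnpm projen && ...` chain that fails midway with a less obvious error.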
# deploy the web console of the solution
npx cdk deploy cloudfront-s3-control-plane-stack-global --parameters Email=<your email> --require-approval never
# deploy the ingestion server with s3 sink
# 1. check stack name in src/main.ts for other stacks
# 2. check the stack for required CloudFormation parameters
npx cdk deploy ingestion-server-s3-stack --parameters ...
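When a stack needs several CloudFormation parameters, the CDK CLI accepts the `--parameters` option multiple times, once per `KEY=VALUE` pair. The sketch below shows one possible way to assemble those flags; the function name `build_param_flags` and the parameter names in the usage comment are hypothetical placeholders, so check the stack itself for the parameters it actually requires.

```shell
# Hypothetical helper (not shipped with the solution).
# Turns KEY=VALUE arguments into repeated "--parameters KEY=VALUE" flags,
# one flag per line. Returns non-zero if an argument is not in KEY=VALUE form.
build_param_flags() {
  local pair
  for pair in "$@"; do
    case "$pair" in
      ?*=*) printf -- '--parameters %s\n' "$pair" ;;
      *) echo "expected KEY=VALUE, got: $pair" >&2; return 1 ;;
    esac
  done
}

# Example usage with placeholder parameter names (assumptions, not real ones):
#   npx cdk deploy ingestion-server-s3-stack $(build_param_flags ParamA=valueA ParamB=valueB)
```

Because the usage line relies on shell word splitting, this sketch only suits values without spaces; quote each flag by hand if a value contains whitespace.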
# update the existing data modeling Redshift stack Clickstream-DataModelingRedshift-xxx
bash e2e-deploy.sh -n modelRedshiftStackName -s Clickstream-DataModelingRedshift-xxx
# update the existing web console
bash e2e-deploy.sh -n standardControlPlaneStackName -s <stack name of existing web console> -c
pnpm test
Add http://localhost:3000/signin into Allowed callback URLs, then go to the folder src/control-plane/local:
cd src/control-plane/local
# run backend server local
bash start.sh -s backend
# run frontend server local
bash start.sh -s frontend
cd src/data-pipeline/etl-common
./gradlew clean build install
cd src/data-pipeline/spark-etl
# build with unit tests
./gradlew clean build
# or only build jar and skip all unit tests
./gradlew clean build -x test -x :coverageCheck
# check the jar file
ls -l ./build/libs/spark-etl-*.jar
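In a script, `ls -l` output is awkward to check, so it may be more convenient to test for the jar directly. The following is a minimal sketch, assuming the jar lands in `./build/libs` with the `spark-etl-*.jar` naming shown above; the function name `find_etl_jars` is hypothetical.

```shell
# Hypothetical helper (not shipped with the solution).
# Prints every spark-etl jar found in the given directory (default: ./build/libs).
# Returns 0 if at least one jar exists, 1 otherwise.
find_etl_jars() {
  local dir="${1:-./build/libs}" found=1 jar
  for jar in "$dir"/spark-etl-*.jar; do
    # When nothing matches, the glob stays literal and this test skips it.
    [ -f "$jar" ] || continue
    echo "$jar"
    found=0
  done
  return "$found"
}

# Example usage after the Gradle build:
#   find_etl_jars || { echo "spark-etl jar was not built" >&2; exit 1; }
```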
See CONTRIBUTING for more information.
This project is licensed under the Apache-2.0 License.
After cloning the repository into your local development environment, and before running the initialization script, you will see the following file structure in your editor:
├── CHANGELOG.md [Change log file]
├── CODE_OF_CONDUCT.md [Code of conduct file]
├── CONTRIBUTING.md [Contribution guide]
├── LICENSE [LICENSE for this solution]
├── NOTICE.txt [Notice for 3rd-party libraries]
├── README.md [Read me file]
├── buildspec.yml
├── cdk.json
├── codescan-prebuild-custom.sh
├── deployment [shell scripts for packaging distribution assets]
│ ├── build-open-source-dist.sh
│ ├── build-s3-dist-1.sh
│ ├── build-s3-dist.sh
│ ├── cdk-solution-helper
│ ├── post-build-1
│ ├── run-all-test.sh
│ ├── solution_config
│ ├── test
│ ├── test-build-dist.sh
│ └── test-deploy-tag-images.sh
├── docs [document]
│ ├── en
│ ├── index.html
│ ├── mkdocs.base.yml
│ ├── mkdocs.en.yml
│ ├── mkdocs.zh.yml
│ ├── site
│ ├── test-deploy-mkdocs.sh
│ └── zh
├── examples [example code]
│ ├── custom-plugins
│ └── standalone-data-generator
├── frontend [frontend source code]
│ ├── README.md
│ ├── build
│ ├── config
│ ├── esbuild.ts
│ ├── node_modules
│ ├── package.json
│ ├── public
│ ├── scripts
│ ├── src
│ └── tsconfig.json
├── package.json
├── sonar-project.properties
├── src [all backend source code]
│ ├── alb-control-plane-stack.ts
│ ├── analytics
│ ├── base-lib
│ ├── cloudfront-control-plane-stack.ts
│ ├── common
│ ├── control-plane
│ ├── data-analytics-redshift-stack.ts
│ ├── data-modeling-athena-stack.ts
│ ├── data-pipeline
│ ├── data-pipeline-stack.ts
│ ├── data-reporting-quicksight-stack.ts
│ ├── ingestion-server
│ ├── ingestion-server-stack.ts
│ ├── kafka-s3-connector-stack.ts
│ ├── main.ts
│ ├── metrics
│ ├── metrics-stack.ts
│ └── reporting
├── test [test code]
│ ├── analytics
│ ├── common
│ ├── constants.ts
│ ├── control-plane
│ ├── data-pipeline
│ ├── ingestion-server
│ ├── jestEnv.js
│ ├── metrics
│ ├── reporting
│ ├── rules.ts
│ └── utils.ts
├── tsconfig.dev.json
├── tsconfig.json