Streaming data use cases follow a similar pattern: data flows from data producers through streaming storage and data consumers to storage destinations. Sources continuously generate data, which is delivered via the ingest stage to the stream storage layer, where it's durably captured and made available for stream processing. The stream processing layer processes the data in the stream storage layer and sends the processed information to a specified destination.
The challenge with these use cases is the setup time and effort developers need to create the resources and establish the best practices required by the streaming data services (such as access control, logging capabilities, and data integrations).
The Streaming Data Solution for Amazon Kinesis and Streaming Data Solution for Amazon MSK automatically configure the AWS services necessary to easily capture, store, process, and deliver streaming data. They provide common streaming data patterns for you to choose from that can serve as a starting point for solving your use case or to improve existing applications. You can try out new service combinations to implement common streaming data use cases, or use the solutions as the basis for your production environment.
AWS CDK Solutions Constructs make it easier to consistently create well-architected applications. All AWS Solutions Constructs are reviewed by AWS and use best practices established by the AWS Well-Architected Framework. This solution uses the following AWS CDK Constructs:
├── deployment
│ └── cdk-solution-helper [Lightweight helper that cleans up synthesized templates from the CDK]
├── source
│ ├── bin [Entrypoint of the CDK application]
│ ├── docs [Architecture diagrams for each solution]
│ ├── labs [Templates for the Amazon MSK Labs]
│ ├── kinesis [Demo applications for the KPL and Apache Flink]
│ ├── lambda [Custom resources for features not supported by CloudFormation]
│ ├── lib [Constructs for the components of the solution]
│ ├── patterns [Stack definitions]
│ └── test [Unit tests]
You can launch this solution with one click from the solution home pages:
Please ensure you test the templates before updating any production deployments.
To customize the solution, follow the steps below:
Note: The commands listed below will build all patterns. To include only one, modify the CDK entrypoint file at source/bin/streaming-data-solution.ts
git clone https://github.com/aws-solutions/streaming-data-solution-for-amazon-kinesis-and-amazon-msk
cd ./deployment
chmod +x ./run-unit-tests.sh
./run-unit-tests.sh
Note: In order to compile the solution, the build-s3-dist.sh script will install the AWS CDK.
ARTIFACT_BUCKET=my-bucket-name # S3 bucket name where customized code will reside
SOLUTION_NAME=my-solution-name # customized solution name
VERSION=my-version # version number for the customized code
cd ./deployment
chmod +x ./build-s3-dist.sh
./build-s3-dist.sh $ARTIFACT_BUCKET $SOLUTION_NAME $VERSION
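The build script receives the three variables as unquoted positional arguments, so an empty or unset variable shifts the remaining arguments one position to the left. A minimal sketch of a guard you could run first; `require_build_vars` is a hypothetical helper name and is not part of the solution:

```shell
# Hypothetical helper (not part of the solution): fail early when any of the
# three variables passed to build-s3-dist.sh is unset or empty.
require_build_vars() {
  missing=0
  [ -n "${ARTIFACT_BUCKET:-}" ] || { echo "ARTIFACT_BUCKET is not set" >&2; missing=1; }
  [ -n "${SOLUTION_NAME:-}" ]   || { echo "SOLUTION_NAME is not set" >&2; missing=1; }
  [ -n "${VERSION:-}" ]         || { echo "VERSION is not set" >&2; missing=1; }
  # Exit status 0 only when all three variables have a value.
  return "$missing"
}
```

One way to use it is to chain it in front of the build: `require_build_vars && ./build-s3-dist.sh "$ARTIFACT_BUCKET" "$SOLUTION_NAME" "$VERSION"`.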
Why doesn't the solution use CDK deploy? This solution includes a few Lambda functions, and by default CDK deploy will not install any dependencies (it'll only zip the contents of the path specified in fromAsset). In future releases, we'll look into leveraging bundling assets using Docker.
In addition to that, there are also some extra components (such as the demo applications for the KPL and Kinesis Data Analytics) that are implemented in Java, and the build-s3 script takes care of packaging them.
When creating the bucket for solution assets it is recommended to:
Note: The created bucket name must have the region where the solution is being deployed as a suffix (for example, mybucket-name-us-east-1).
aws s3 sync ./global-s3-assets s3://$ARTIFACT_BUCKET-us-east-1/$SOLUTION_NAME/$VERSION --acl bucket-owner-full-control
aws s3 sync ./regional-s3-assets s3://$ARTIFACT_BUCKET-us-east-1/$SOLUTION_NAME/$VERSION --acl bucket-owner-full-control
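The naming rule from the note above and the destination shared by both `aws s3 sync` commands can be captured in two small functions. This is only a sketch: `regional_bucket` and `asset_prefix` are hypothetical names, and the 63-character check reflects the general S3 bucket name length limit rather than anything specific to this solution.

```shell
# Hypothetical helpers (not part of the solution).

# Print the bucket name with the deployment region as a suffix,
# e.g. "mybucket-name" and "us-east-1" give "mybucket-name-us-east-1".
regional_bucket() {
  name="$1-$2"
  # S3 bucket names are limited to 63 characters, suffix included.
  if [ "${#name}" -gt 63 ]; then
    echo "bucket name too long: $name" >&2
    return 1
  fi
  printf '%s\n' "$name"
}

# Print the S3 URI used as the sync destination:
# s3://<bucket>-<region>/<solution name>/<version>
asset_prefix() {
  bucket=$(regional_bucket "$1" "$2") || return 1
  printf 's3://%s/%s/%s\n' "$bucket" "$3" "$4"
}
```

With these defined, a sync command could be written as `aws s3 sync ./global-s3-assets "$(asset_prefix "$ARTIFACT_BUCKET" us-east-1 "$SOLUTION_NAME" "$VERSION")" --acl bucket-owner-full-control`, which keeps the region in a single place if you deploy somewhere other than us-east-1.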
This solution collects anonymous operational metrics to help AWS improve the quality and features of the solution. For more information, including how to disable this capability, please see the implementation guide for the Streaming Data Solution for Amazon Kinesis and the implementation guide for the Streaming Data Solution for Amazon MSK.
Updating
, and you might see some errors when CloudFormation tries to delete resources such as AWS::KinesisAnalyticsV2::ApplicationCloudWatchLoggingOption and Custom::VpcConfiguration (a custom resource that configures the application to connect to a virtual private cloud).
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.