If you are beginning your journey with Senzing, please start with the Senzing Quick Start guides.
You are in the Senzing Garage where projects are "tinkered" on. Although this GitHub repository may help you understand an approach to using Senzing, it's not considered to be "production ready" and is not considered to be part of the Senzing product. Heck, it may not even be appropriate for your application of Senzing!
Populate a queue with records to be consumed by stream-loader.
The stream-producer.py Python script reads files of different formats
(JSON, CSV, Parquet, Avro) and publishes their records to a queue (RabbitMQ, Kafka, AWS SQS).
The senzing/stream-producer
Docker image is a wrapper for use in Docker formations (e.g., docker-compose, Kubernetes).
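Conceptually, each file-to-queue subcommand is the same read-parse-publish loop. The sketch below illustrates that pattern in plain Python, using an in-memory `queue.Queue` as a stand-in for Kafka, RabbitMQ, or SQS; it is an illustration of the idea, not stream-producer's actual implementation, and the record fields are only examples.

```python
import json
import queue
import tempfile

def produce(input_path, message_queue):
    """Read a JSON-lines file; publish each record as one queue message."""
    count = 0
    with open(input_path) as input_file:
        for line in input_file:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)              # one JSON record per line
            message_queue.put(json.dumps(record))  # publish as a message
            count += 1
    return count

# Demo: a temporary two-record input file and an in-memory queue.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    f.write('{"DATA_SOURCE": "TEST", "RECORD_ID": "1"}\n')
    f.write('{"DATA_SOURCE": "TEST", "RECORD_ID": "2"}\n')
    sample_path = f.name

message_queue = queue.Queue()
sent = produce(sample_path, message_queue)
```

In the real tool, the "put on queue" step is replaced by a publish call to the chosen backend (Kafka, RabbitMQ, or SQS), selected by the subcommand name.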
To see all of the subcommands, run:

```console
$ ./stream-producer.py --help
usage: stream-producer.py [-h]
                          {avro-to-kafka,avro-to-rabbitmq,avro-to-sqs,avro-to-sqs-batch,avro-to-stdout,csv-to-kafka,csv-to-rabbitmq,csv-to-sqs,csv-to-sqs-batch,csv-to-stdout,gzipped-json-to-kafka,gzipped-json-to-rabbitmq,gzipped-json-to-sqs,gzipped-json-to-sqs-batch,gzipped-json-to-stdout,json-to-kafka,json-to-rabbitmq,json-to-sqs,json-to-sqs-batch,json-to-stdout,parquet-to-kafka,parquet-to-rabbitmq,parquet-to-sqs,parquet-to-sqs-batch,parquet-to-stdout,websocket-to-kafka,websocket-to-rabbitmq,websocket-to-sqs,websocket-to-sqs-batch,websocket-to-stdout,sleep,version,docker-acceptance-test}
                          ...

Queue messages. For more information, see https://github.com/Senzing/stream-producer

positional arguments:
  {avro-to-kafka,avro-to-rabbitmq,avro-to-sqs,avro-to-sqs-batch,avro-to-stdout,csv-to-kafka,csv-to-rabbitmq,csv-to-sqs,csv-to-sqs-batch,csv-to-stdout,gzipped-json-to-kafka,gzipped-json-to-rabbitmq,gzipped-json-to-sqs,gzipped-json-to-sqs-batch,gzipped-json-to-stdout,json-to-kafka,json-to-rabbitmq,json-to-sqs,json-to-sqs-batch,json-to-stdout,parquet-to-kafka,parquet-to-rabbitmq,parquet-to-sqs,parquet-to-sqs-batch,parquet-to-stdout,websocket-to-kafka,websocket-to-rabbitmq,websocket-to-sqs,websocket-to-sqs-batch,websocket-to-stdout,sleep,version,docker-acceptance-test}
                        Subcommands (SENZING_SUBCOMMAND):
    avro-to-kafka       Read Avro file and send to Kafka.
    avro-to-rabbitmq    Read Avro file and send to RabbitMQ.
    avro-to-sqs         Read Avro file and print to AWS SQS.
    avro-to-stdout      Read Avro file and print to STDOUT.
    csv-to-kafka        Read CSV file and send to Kafka.
    csv-to-rabbitmq     Read CSV file and send to RabbitMQ.
    csv-to-sqs          Read CSV file and print to SQS.
    csv-to-stdout       Read CSV file and print to STDOUT.
    gzipped-json-to-kafka
                        Read gzipped JSON file and send to Kafka.
    gzipped-json-to-rabbitmq
                        Read gzipped JSON file and send to RabbitMQ.
    gzipped-json-to-sqs
                        Read gzipped JSON file and send to AWS SQS.
    gzipped-json-to-stdout
                        Read gzipped JSON file and print to STDOUT.
    json-to-kafka       Read JSON file and send to Kafka.
    json-to-rabbitmq    Read JSON file and send to RabbitMQ.
    json-to-sqs         Read JSON file and send to AWS SQS.
    json-to-stdout      Read JSON file and print to STDOUT.
    parquet-to-kafka    Read Parquet file and send to Kafka.
    parquet-to-rabbitmq
                        Read Parquet file and send to RabbitMQ.
    parquet-to-sqs      Read Parquet file and print to AWS SQS.
    parquet-to-stdout   Read Parquet file and print to STDOUT.
    sleep               Do nothing but sleep. For Docker testing.
    version             Print version of program.
    docker-acceptance-test
                        For Docker acceptance testing.

optional arguments:
  -h, --help            show this help message and exit
```
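The json-to-* subcommands expect a JSON-lines input file: one complete JSON record per line. The snippet below writes a minimal sample; DATA_SOURCE and RECORD_ID are standard Senzing record fields, but the values (and the NAME_FULL field) are illustrative only.

```python
import json

# Write a sample JSON-lines input file: one Senzing-style record per line.
records = [
    {"DATA_SOURCE": "TEST", "RECORD_ID": "1", "NAME_FULL": "Jane Smith"},
    {"DATA_SOURCE": "TEST", "RECORD_ID": "2", "NAME_FULL": "John Doe"},
]
with open("example-records.json", "w") as output_file:
    for record in records:
        output_file.write(json.dumps(record) + "\n")
```

A file like this could then be fed to a subcommand such as json-to-stdout. How the input location is passed (for example, a command-line parameter or a SENZING_* environment variable) is an assumption here; run the subcommand with --help to see its actual options.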
At Senzing, we strive to create GitHub documentation in a "don't make me think" style. For the most part, instructions are copy and paste. Whenever thinking is needed, it's marked with a "thinking" icon :thinking:. Whenever customization is needed, it's marked with a "pencil" icon :pencil2:. If the instructions are not clear, please let us know by opening a new Documentation issue describing where we can improve. Now on with the show...
Run the Docker container. This command will show the help text. Example:

```console
docker run \
  --rm \
  senzing/stream-producer --help
```
For more examples of use, see Examples of Docker.
Deploy the Backing Services required by the Stream Loader.
Specify a directory to place artifacts in. Example:

```console
export SENZING_VOLUME=~/my-senzing
mkdir -p ${SENZING_VOLUME}
```
Download the docker-compose.yaml file. Example:

```console
curl -X GET \
  --output ${SENZING_VOLUME}/docker-compose.yaml \
  https://raw.githubusercontent.com/Senzing/stream-producer/main/docker-compose.yaml
```
Bring up the docker-compose stack. Example:

```console
docker-compose -f ${SENZING_VOLUME}/docker-compose.yaml up
```
:thinking: The following tasks need to be completed before proceeding. These are "one-time tasks" which may already have been completed.
Install Python prerequisites. Example:

```console
pip3 install -r https://raw.githubusercontent.com/Senzing/stream-producer/main/requirements.txt
```
Get a local copy of stream-producer.py. Example:
:pencil2: Specify where to download the file. Example:

```console
export SENZING_DOWNLOAD_FILE=~/stream-producer.py
```
Download the file. Example:

```console
curl -X GET \
  --output ${SENZING_DOWNLOAD_FILE} \
  https://raw.githubusercontent.com/Senzing/stream-producer/main/stream-producer.py
```
Make the file executable. Example:

```console
chmod +x ${SENZING_DOWNLOAD_FILE}
```
:thinking: Alternative: The entire git repository can be downloaded by following instructions at Clone repository
Run the command. Example:

```console
${SENZING_DOWNLOAD_FILE} --help
```
For more examples of use, see Examples of CLI.
Configuration values are specified by environment variable or command-line parameter.
stream-producer.py uses the AWS SDK for Python (Boto3) to access AWS services.
This library may be configured via environment variables or the ~/.aws/config file.
Example environment variables for configuration:
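For instance, Boto3 reads the standard AWS credential and region environment variables. The values below are placeholders; substitute your own credentials and region.

```console
export AWS_ACCESS_KEY_ID=your-access-key-id
export AWS_SECRET_ACCESS_KEY=your-secret-access-key
export AWS_DEFAULT_REGION=us-east-1
```

Values set in the environment take precedence over those in the ~/.aws/config and ~/.aws/credentials files.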