d3b-center / d3b-data-transfer-pipeline

💡 Evaluate current data transfer pipeline. Propose improvements here
Apache License 2.0

Signal transfer is complete #10

Open znatty22 opened 1 day ago

znatty22 commented 1 day ago

Once the transfer of files from the source into the D3b staging bucket is complete, something or someone needs to notify the system so that the next step in the data transfer pipeline can execute.

Implement a command which creates an event with transfer details and fires the event to the appropriate listener system (EventBridge, SNS, or SQS; do research and determine the solution in this PR).
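
A minimal sketch of what such a command might do internally, assuming EventBridge is the listener and `boto3` is the AWS client. The event source, detail-type, bus name, and the fields inside `Detail` are all hypothetical placeholders; the issue does not define them. The client is passed in so the logic can be exercised without AWS credentials.

```python
import json
from datetime import datetime, timezone

# Hypothetical names: the issue does not define the event source,
# detail-type, or event bus, so these are placeholders.
EVENT_SOURCE = "d3b.data-transfer"
DETAIL_TYPE = "Transfer Complete"
EVENT_BUS = "default"


def build_transfer_complete_entry(manifest_uri, manifest_type):
    """Build one PutEvents entry describing a finished transfer."""
    if not manifest_uri.startswith("s3://"):
        raise ValueError(f"expected an s3:// URI, got {manifest_uri!r}")
    detail = {
        "manifest_uri": manifest_uri,
        "manifest_type": manifest_type,
        "completed_at": datetime.now(timezone.utc).isoformat(),
    }
    return {
        "Source": EVENT_SOURCE,
        "DetailType": DETAIL_TYPE,
        "Detail": json.dumps(detail),  # PutEvents expects Detail as a JSON string
        "EventBusName": EVENT_BUS,
    }


def notify_transfer_complete(events_client, manifest_uri, manifest_type):
    """Fire the event and raise if EventBridge reports a failed entry."""
    entry = build_transfer_complete_entry(manifest_uri, manifest_type)
    response = events_client.put_events(Entries=[entry])
    if response.get("FailedEntryCount", 0) > 0:
        raise RuntimeError(f"event rejected: {response['Entries']}")
    return response
```

In a real CLI, `events_client` would likely be `boto3.client("events")`, and the two string arguments would come from the manifest path and the `--manifest-type` option.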

d3b notify transfer-complete s3://bucket/AD-2806/manifests/manifest.xlsx --manifest-type=bar

znatty22 commented 9 hours ago

Chose EventBridge as the event system to signal that the data transfer is complete.

SQS - Consumers have to poll the queue for events, which seems inefficient and unnecessary here. A data-transfer-complete event should be fired only when the transfer is complete, and the consumer should receive it via a push from the event source.

SNS - Consumers subscribe to a topic and receive events when they are pushed to the topic. This seems more viable, but subscribing to a topic seems unnecessary. We simply want to fire an event to signal that the transfer is complete and then trigger the rest of the pipeline. Additionally, SNS cannot push directly to a Step Functions target, should we want to start the step function directly.

EventBridge - Allows an AWS source or an external source to push an event to an event bus. Rules define which targets receive the event. This seems like a powerful general solution for event-driven systems.
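
To make the rule-to-target routing concrete, here is a rough sketch of how a rule could forward transfer-complete events to a Step Functions state machine using `boto3`. The source and detail-type values, the rule name, and the target `Id` are hypothetical and would need to match whatever the producer actually emits. The client is passed in so the sketch can be tested with a stub.

```python
import json

# Hypothetical values; they must match what the event producer sends.
EVENT_SOURCE = "d3b.data-transfer"
DETAIL_TYPE = "Transfer Complete"


def transfer_complete_pattern():
    """Event pattern that matches only transfer-complete events."""
    return {"source": [EVENT_SOURCE], "detail-type": [DETAIL_TYPE]}


def route_to_state_machine(events_client, rule_name, state_machine_arn, role_arn):
    """Create or update a rule and point it at a Step Functions state machine."""
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(transfer_complete_pattern()),
        State="ENABLED",
    )
    # RoleArn is the IAM role EventBridge assumes to start the execution.
    return events_client.put_targets(
        Rule=rule_name,
        Targets=[
            {
                "Id": "next-pipeline-step",  # hypothetical target id
                "Arn": state_machine_arn,
                "RoleArn": role_arn,
            }
        ],
    )
```

Because the routing lives in the rule rather than in the producer, more targets could be added later without changing the command that fires the event.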

https://docs.aws.amazon.com/decision-guides/latest/sns-or-sqs-or-eventbridge/sns-or-sqs-or-eventbridge.html
https://aws.amazon.com/blogs/compute/choosing-between-messaging-services-for-serverless-applications/