Open katjes733 opened 7 months ago
Unfortunately, there is no way to know when the last file of a batch has been uploaded: S3 emits one Object Created event per object. Queueing the events is not possible either, because there is no final signal, and periodic polling on the queue may be too risky.
Therefore, this ticket will not support multi-file uploads directly and will support archives instead. Multiple files can then be bundled into a single upload, and because the input files are compressed, upload performance should improve as well. We continue to support individual CSVs and, in addition, archives (tar.gz, gz and zip).
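To illustrate what handling the formats named above could look like, here is a minimal sketch using only the Python standard library. It is not the project's actual implementation: the function name `extract_csvs`, the decision to ignore non-CSV members, and the assumption that the whole object fits in memory are all hypothetical choices made for this example.

```python
import gzip
import io
import tarfile
import zipfile


def extract_csvs(key: str, data: bytes) -> dict[str, bytes]:
    """Return {file name: CSV bytes} for one uploaded object.

    Hypothetical helper: `key` is the object key, `data` its full content.
    """
    lower = key.lower()

    # Check ".tar.gz" before ".gz", since the former also ends with ".gz".
    if lower.endswith(".tar.gz"):
        csvs = {}
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
            for member in tar.getmembers():
                # Skip directories, links, and non-CSV files.
                if member.isfile() and member.name.lower().endswith(".csv"):
                    csvs[member.name] = tar.extractfile(member).read()
        return csvs

    if lower.endswith(".zip"):
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            return {
                name: zf.read(name)
                for name in zf.namelist()
                if name.lower().endswith(".csv")
            }

    if lower.endswith(".gz"):
        # A plain .gz holds a single compressed file; drop the suffix.
        return {key[:-3]: gzip.decompress(data)}

    if lower.endswith(".csv"):
        return {key: data}

    raise ValueError(f"Unsupported upload type: {key}")
```

With this shape, one upload still produces exactly one S3 event, so the state machine would be triggered once per archive regardless of how many CSVs it contains.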
As an end user, I would like to be able to upload multiple CSV files at once, with the sentiment analysis triggered automatically only once, after all files are uploaded, so that I don't have to combine multiple CSVs into one manually before uploading.
Per #5, only one file may be uploaded at a time. Otherwise the state machine is triggered once per file (the EventBridge rule reacts to the individual event for each file), which leads to inconsistent results.
We may need to queue the events first before triggering the state machine.