Open richard-jones opened 3 years ago
We should add this as a Kafka topic
I have implemented this as the first step of the pipeline processor, written over an abstract storage interface that can connect to the local disk or to Amazon S3.
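For anyone following along, here is a minimal sketch of what an abstract storage interface with a local-disk backend could look like. The names `StorageBackend`, `LocalDiskStorage`, `store`, `get` and `list_files` are hypothetical and chosen for illustration; they are not taken from the code on develop.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from typing import List


class StorageBackend(ABC):
    """Hypothetical interface: the pipeline talks only to these methods."""

    @abstractmethod
    def store(self, container_id: str, filename: str, data: bytes) -> None:
        """Persist `data` under `filename` inside the given container."""

    @abstractmethod
    def get(self, container_id: str, filename: str) -> bytes:
        """Return the bytes previously stored under `filename`."""

    @abstractmethod
    def list_files(self, container_id: str) -> List[str]:
        """Return the sorted filenames held in the container."""


class LocalDiskStorage(StorageBackend):
    """Illustrative backend: one directory per container on local disk."""

    def __init__(self, base_dir: str) -> None:
        self.base_dir = base_dir

    def _container_dir(self, container_id: str) -> str:
        return os.path.join(self.base_dir, container_id)

    def store(self, container_id: str, filename: str, data: bytes) -> None:
        directory = self._container_dir(container_id)
        os.makedirs(directory, exist_ok=True)
        with open(os.path.join(directory, filename), "wb") as handle:
            handle.write(data)

    def get(self, container_id: str, filename: str) -> bytes:
        with open(os.path.join(self._container_dir(container_id), filename), "rb") as handle:
            return handle.read()

    def list_files(self, container_id: str) -> List[str]:
        directory = self._container_dir(container_id)
        if not os.path.isdir(directory):
            return []
        return sorted(os.listdir(directory))


if __name__ == "__main__":
    # Usage: the caller depends on StorageBackend, not on the concrete class.
    with tempfile.TemporaryDirectory() as tmp:
        storage: StorageBackend = LocalDiskStorage(tmp)
        storage.store("notification-1", "content.zip", b"example bytes")
        print(storage.list_files("notification-1"))
```

An S3-backed class would implement the same three methods against a bucket, so the pipeline step itself would not need to change when the backend is swapped.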
I have pushed this to develop, but I have not yet wired it into the live processing pipeline, as we need to consider the storage implementation used for testing a bit more carefully.
I propose to add a more advanced local storage layer which persists the files to a ZIP archive instead of as individual files, which should make this scale better in local store mode. For real deployments the service provider will need to use S3 or provide a storage implementation which persists to their preferred object store. The choice of storage implementation can be set in configuration.
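As a rough illustration of the ZIP idea, the sketch below keeps one archive per container rather than one file per stored item. The one-archive-per-container layout is my assumption, and the class name `ZipLocalStorage` and its methods are hypothetical; the real layer may well be structured differently.

```python
import os
import tempfile
import zipfile
from typing import List


class ZipLocalStorage:
    """Hypothetical local backend: each container is a single ZIP archive."""

    def __init__(self, base_dir: str) -> None:
        self.base_dir = base_dir
        os.makedirs(base_dir, exist_ok=True)

    def _archive_path(self, container_id: str) -> str:
        return os.path.join(self.base_dir, container_id + ".zip")

    def store(self, container_id: str, filename: str, data: bytes) -> None:
        # Mode "a" creates the archive if it is missing, otherwise appends.
        # Note: storing the same filename twice adds a second entry rather
        # than replacing the first; a real implementation would need to
        # decide how to handle overwrites.
        with zipfile.ZipFile(self._archive_path(container_id), "a",
                             compression=zipfile.ZIP_DEFLATED) as archive:
            archive.writestr(filename, data)

    def get(self, container_id: str, filename: str) -> bytes:
        with zipfile.ZipFile(self._archive_path(container_id), "r") as archive:
            return archive.read(filename)

    def list_files(self, container_id: str) -> List[str]:
        path = self._archive_path(container_id)
        if not os.path.exists(path):
            return []
        with zipfile.ZipFile(path, "r") as archive:
            return sorted(set(archive.namelist()))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        storage = ZipLocalStorage(tmp)
        storage.store("notification-1", "metadata.xml", b"<record/>")
        storage.store("notification-1", "article.pdf", b"%PDF-")
        print(storage.list_files("notification-1"))
        print(os.listdir(tmp))  # a single archive for the whole container
```

The intended benefit is that the number of filesystem entries grows with the number of containers rather than with the number of files stored.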
Tagging @Steven-Eardley into this one, as I need to discuss how it fits into the orchestration and deployment.