Regarding EFS costs, it would be better to use EFS only for small-to-medium files that must be shared between instances and Docker containers. Each instance should have its own volume to write its results, and then upload those results to S3; this implies that, in the next step of the classification pipeline, each instance would need to download the data from S3 to its own volume before reading it. This would avoid the bottlenecks of writing to and reading from EFS, make it easier to scale to medium and large clusters, and ultimately save costs (S3 is cheaper than EFS).
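A minimal sketch of the upload side of that flow, assuming boto3 is available on the instances; the bucket name, key prefix, and paths below are hypothetical:

```python
import boto3

# Hypothetical names; replace with the project's actual bucket and key layout.
BUCKET = "classification-results"
PREFIX = "intermediate/run-001"

s3 = boto3.client("s3")

def upload_result(local_path, name):
    """Push one result file from the instance's own volume to S3."""
    key = f"{PREFIX}/{name}"
    s3.upload_file(local_path, BUCKET, key)
    return key

# Example: upload_result("/mnt/local/results/tile_0001.nc", "tile_0001.nc")
```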
Note that opendatacube does not support reading NetCDF directly from S3.
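Because of that limitation, the next pipeline step would first copy each NetCDF from S3 onto the instance's own volume and only then open it locally. A minimal sketch, assuming xarray with a NetCDF backend is installed and using the same hypothetical bucket as above:

```python
import boto3
import xarray as xr

BUCKET = "classification-results"  # hypothetical, as above

def fetch_and_open(key, local_path):
    """Download a NetCDF result from S3 to the local volume, then read it."""
    boto3.client("s3").download_file(BUCKET, key, local_path)
    # The read happens from the local copy, not directly from S3.
    return xr.open_dataset(local_path)

# Example:
# ds = fetch_and_open("intermediate/run-001/tile_0001.nc",
#                     "/mnt/local/inputs/tile_0001.nc")
```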
The classification pipeline currently writes its intermediate results to EFS.