Open bentsherman opened 5 months ago
Since the S3 filesystem will need to be rewritten, it will also be a good opportunity to improve the performance (i.e. throughput).
It looks like AWS has developed their own S3 filesystem: https://github.com/awslabs/aws-java-nio-spi-for-s3
So we might be able to just use it. We would likely still need to wrap it in our own "delegating" filesystem so that we can add custom behavior (see the S3Path class for details). I have done a similar thing in #4729 for the GCS filesystem.
Another ticket came up: SES v1 limits emails to 10 MB, whereas the SES v2 limit is 40 MB.
It would be great to switch over to SES v2 as early as feasible from the NF development side.
Hi @bentsherman - wondering if there are any updates here. Thanks in advance.
Nextflow currently uses the AWS Java SDK v1 which is reaching end of life.
Additionally, new features are only being added to SDK v2, which will make it difficult to adopt new AWS features in the future. We found a way to support SSO authentication with an adapter class, but other changes might not be so feasible.
The main components we use are AWS Batch, S3, and of course credentials. I don't believe the AWS Batch piece has changed much, but according to Paolo, the file transfer API is very different, and our S3 filesystem is easily the most complex piece of our AWS integration.
Another major change is that the v2 client can only work with a single region whereas the v1 client is cross-region. Supposedly this should not be challenging to implement anymore.
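For the single-region limitation, one possible approach is to keep one client per region and create each lazily on first use. Below is a minimal sketch of that idea, not the actual Nextflow implementation: `RegionalClientCache` and its methods are hypothetical names, and the client type is left generic so the example runs without the AWS SDK on the classpath.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Nextflow or the AWS SDK): keeps one
 * client instance per region and builds it lazily on first request.
 */
final class RegionalClientCache<C> {

    // Region name (e.g. "eu-west-1") -> client bound to that region
    private final Map<String, C> clients = new ConcurrentHashMap<>();

    // Builds a new client for a given region name
    private final Function<String, C> factory;

    RegionalClientCache(Function<String, C> factory) {
        this.factory = factory;
    }

    /** Returns the cached client for the region, creating it if absent. */
    C forRegion(String region) {
        return clients.computeIfAbsent(region, factory);
    }

    /** Number of distinct regions for which a client has been created. */
    int size() {
        return clients.size();
    }
}
```

With SDK v2, the factory would presumably look something like `region -> S3Client.builder().region(Region.of(region)).build()`. How the region for a given bucket gets resolved is a separate question that this sketch does not address.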