nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0

Improve stability of virtual threads at scale #4995

Open bentsherman opened 1 month ago

bentsherman commented 1 month ago

See https://nextflow.slack.com/archives/C02T98A23U7/p1715351870737659

When a workflow publishes many files using S3-to-S3 copy, virtual threads are needed to maximize request throughput, since each publish task is simply waiting on an HTTP request and doesn't need to occupy an OS thread the whole time.
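
For illustration, here is a minimal sketch of the idea, assuming Java 21 or later. It is not Nextflow's actual publish code: `fakeCopy` and `publishAll` are hypothetical names, and the `Thread.sleep` call stands in for a blocking S3-to-S3 copy request. Each task gets its own virtual thread, so a blocked task parks instead of holding an OS thread.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class VirtualThreadPublishSketch {

    // Hypothetical stand-in for one S3-to-S3 copy: it only blocks,
    // the way a real HTTP request would while waiting on the response.
    static String fakeCopy(int id) throws InterruptedException {
        Thread.sleep(Duration.ofMillis(100));
        return "copied-" + id;
    }

    // Submit one virtual thread per publish task and collect the results in order.
    static List<String> publishAll(int count) throws Exception {
        // The executor is AutoCloseable; close() waits for submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final int id = i;
                futures.add(executor.submit(() -> fakeCopy(id)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> results = publishAll(1000);
        System.out.println(results.size());
    }
}
```

With a fixed pool of platform threads, the same workload would be capped at the pool size in concurrent requests. Here the number of in-flight tasks is bounded only by what is submitted, which is the source of the throughput gain.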

However, enabling virtual threads currently creates two problems:

To this end, I propose the following improvements: