nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0
2.66k stars 618 forks

Support for confidential compute in GCP #5140

Open Puumanamana opened 1 month ago

Puumanamana commented 1 month ago

Hi,

Google Cloud recently released a feature to enable Confidential Computing. The goal is to "protect the confidentiality of data in the cloud by encrypting data-in-use while it’s being processed". It's a security practice they recommend and, in my case, it is enforced at the organization level. Since VMs scheduled through Nextflow don't enable this feature by default, my workflows fail. Currently, there is no way to set it up in nextflow.config.

@ejseqera proposed a nice workaround, which consists of using instance templates with confidential computing enabled. However, it requires setting up a template for every type of compute we use (since, IIUC, the cpu/memory/disk directives are not compatible with the instance template directive). I'm also not sure how that would interact with retry strategies that increase resources. Finally, it means maintaining all these instance templates as resource requirements evolve, which is not ideal.

Would it be possible to include that feature as a GCP-specific config option? Something like:

gcp {
    enableConfidentialComputing = true
}

siddharthab commented 1 month ago

This is currently not supported by Google Cloud Batch, which Nextflow relies on. Instance Templates are your only way for now.
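For reference, a minimal sketch of what the instance-template workaround could look like in nextflow.config. It assumes the `template://` form of the `machineType` directive described in the Nextflow Google Batch documentation. The template names (`confidential-small`, `confidential-large`) and the `big_mem` label are hypothetical, and the templates themselves would have to be created in GCP beforehand with confidential computing enabled. Selecting a template per attempt through a closure is an untested assumption about how dynamic directives combine with templates.

```groovy
// nextflow.config (sketch only; template names and label are hypothetical)
process {
    executor = 'google-batch'

    // Default: every task runs on a pre-built confidential-compute template.
    machineType = 'template://confidential-small'

    // Retry on failure, switching to a larger template on later attempts.
    // Assumption: a closure is evaluated per attempt here, as with other
    // dynamic directives.
    errorStrategy = 'retry'
    maxRetries = 2

    withLabel: 'big_mem' {
        machineType = { task.attempt == 1 ? 'template://confidential-small'
                                          : 'template://confidential-large' }
    }
}
```

Because the machine shape comes from the template, each resource tier needs its own template, which is the maintenance burden described in the original request.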

I submitted a similar feature request for Shielded VMs (issue tracker link).