DataBiosphere / dsub

Open-source command-line tool to run batch computing tasks and workflows on backend services such as Google Cloud.
Apache License 2.0

Specifying nvme interface for local_ssd #214

Closed gsneha26 closed 3 years ago

gsneha26 commented 4 years ago

Is there some way to specify the NVMe interface with --disk-type local_ssd?

mbookman commented 4 years ago

Hi @gsneha26 !

This is an interesting question. We have not seen many use cases for local SSDs with dsub and batch-style workflows. In general, PD standard has been the best choice, notably when cost is considered. We added the --disk-type flag because enough people asked about PD SSD; even then, a large PD standard is almost always the better choice.

Can you share more about the type of task you are looking to enable, and any validation you have done on a GCE VM showing that your workflow would actually benefit from local SSD (including the cost trade-off)?

I don't see any indication that the Pipelines API includes a flag for specifying an NVMe interface.

The right place to inquire further on that would be the gcp-life-sciences-discuss Google group.

Once they have plumbed through a flag for the disk interface (which is in the GCE API), we could plumb a flag through dsub.
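For context, the GCE API exposes the interface as a field on each attached disk. Below is a minimal sketch of that request fragment, built as a plain Python dict. The helper name `local_ssd_disk` and the zone value are illustrative assumptions; neither dsub nor the Pipelines API accepts this today.

```python
import json


def local_ssd_disk(zone: str, interface: str = "NVME") -> dict:
    """Build the attached-disk fragment of a GCE instances.insert request body.

    Hypothetical helper for illustration only; it is not part of dsub.
    """
    if interface not in ("NVME", "SCSI"):
        raise ValueError(f"unsupported local SSD interface: {interface}")
    return {
        "type": "SCRATCH",       # local SSDs are attached as scratch disks
        "autoDelete": True,      # the disk is deleted along with the VM
        "interface": interface,  # the field this issue asks about
        "initializeParams": {
            "diskType": f"zones/{zone}/diskTypes/local-ssd",
        },
    }


if __name__ == "__main__":
    # The zone is an arbitrary example value.
    print(json.dumps(local_ssd_disk("us-central1-a"), indent=2))
```

On a plain GCE VM, the same choice can be made with `gcloud compute instances create ... --local-ssd interface=NVME`, which may be a convenient way to run the validation mentioned above.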

-Matt