microsoft / ga4gh-tes

C# implementation of the GA4GH TES API; provides distributed batch task execution on Microsoft Azure
MIT License

Preview: GPU support #717

Open BMurri opened 2 months ago

BMurri commented 2 months ago

This is an initial implementation of GPU support in TES. It has some current limitations:

This follows a combination of NVIDIA, Docker, and Azure documentation for enabling containers to use GPUs. Specifically, this:

Things in the NVIDIA documentation that are NOT implemented (it's unclear what should or should not be implemented based on our use cases)

Note that the Azure documentation for the GPU-support VM extension on Linux points the reader to the following EULA: https://www.nvidia.com/en-us/data-center/products/nvidia-ai-enterprise/eula/

addresses microsoft/CromwellOnAzure#356

BMurri commented 2 months ago

This has been tested and works as described above.

TODO in order to move from Preview to fully supported option (some of these may be debated):

I think this can probably ship as a preview this week.

BMurri commented 5 days ago

Remaining work: Ensure that this functionality is either turned off or can be used safely when there is no public internet access (I don't know whether the VM extensions can be reached from a private virtual network with no public IP access).

Bonus work: if we recognize that the image already has the drivers, just pass them through (don't try to reinstall them).
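
A rough sketch of what that driver check could look like. It is written in Python for brevity and is not code from this PR. It assumes that a working `nvidia-smi` on the PATH is good enough evidence that the image already ships the drivers; the function names `nvidia_driver_present` and `should_install_gpu_drivers` are hypothetical.

```python
import shutil
import subprocess


def nvidia_driver_present(which=shutil.which, run=subprocess.run) -> bool:
    """Return True if `nvidia-smi` is on PATH and exits successfully.

    Assumption: a working `nvidia-smi` is treated as evidence that the
    image already has usable NVIDIA drivers. `which` and `run` are
    injectable so the check can be exercised without a GPU.
    """
    path = which("nvidia-smi")
    if path is None:
        return False
    try:
        result = run([path], capture_output=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        # Binary not executable, crashed, or timed out: treat as "no driver".
        return False
    return result.returncode == 0


def should_install_gpu_drivers(task_needs_gpu: bool, driver_present: bool) -> bool:
    """Hypothetical policy: install only when a GPU is requested and no driver is found."""
    return task_needs_gpu and not driver_present
```

With a check like this, the install step would be skipped whenever the image already provides the drivers, and the existing drivers would simply be passed through to the container.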