GoogleCloudPlatform / vertex-ai-samples

Notebooks, code samples, sample apps, and other resources that demonstrate how to use, develop and manage machine learning and generative AI workflows using Google Cloud Vertex AI.
https://cloud.google.com/vertex-ai
Apache License 2.0
119 stars, 28 forks

Deployed custom container to vertex but container is unable to access gpu #3250

Closed: pulkitmehtaworkmetacube closed this issue 1 month ago

pulkitmehtaworkmetacube commented 4 months ago

Expected Behavior

The container should be able to access the GPU device.

Actual Behavior

The container is not able to access the GPU device.

Steps to Reproduce the Problem

We deployed a custom container to Vertex AI. It uses the prebuilt PyTorch GPU container us-docker.pkg.dev/vertex-ai/training/pytorch-gpu.1-13.py310:latest as its base image, but when we deploy it to Vertex AI on an n1-highmem-8 machine with a Tesla T4 GPU, the container cannot access the GPU and the device is still reported as CPU. Please advise.
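For anyone trying to reproduce this, a small diagnostic run inside the deployed container can help narrow down where GPU access fails. The sketch below is not from the original report: the function name `gpu_report` and the report keys are hypothetical, and it assumes only that PyTorch may or may not be importable in the container.

```python
import os
import shutil


def gpu_report():
    """Collect basic facts about GPU visibility inside the running container.

    Hypothetical helper for illustration; not part of Vertex AI or PyTorch.
    """
    report = {
        # True if the NVIDIA driver utilities are on PATH in this container.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        # An empty string here hides all GPUs from CUDA applications.
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "torch_installed": False,
        "cuda_available": False,
        "device_count": 0,
        "device": "cpu",
    }
    try:
        import torch
    except ImportError:
        # PyTorch is missing, so only the environment-level checks apply.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # None means this PyTorch build was compiled without CUDA support.
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_available"] = bool(torch.cuda.is_available())
    report["device_count"] = torch.cuda.device_count()
    if report["cuda_available"]:
        report["device"] = "cuda"
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

If `torch_cuda_build` is `None`, a CPU-only PyTorch wheel was probably installed on top of the base image. If the build has CUDA support but `cuda_available` is false, the GPU is likely not attached to the node or not visible to the container.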

Specifications

Machine type: n1-highmem-8
Accelerator: Tesla T4 GPU
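One thing worth ruling out, although the thread does not confirm it as the cause, is whether the accelerator was actually requested at deploy time. The sketch below assumes the `google-cloud-aiplatform` Python SDK is used for deployment; `build_deploy_kwargs` and `deploy_with_gpu` are hypothetical helper names, and the project, location, and model resource name are placeholders supplied by the caller.

```python
def build_deploy_kwargs(
    machine_type="n1-highmem-8",
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,
):
    """Build keyword arguments for Model.deploy that request a GPU.

    Hypothetical helper; the defaults mirror the specifications in this issue.
    """
    if accelerator_count < 1:
        raise ValueError("accelerator_count must be at least 1 to attach a GPU")
    return {
        "machine_type": machine_type,
        "accelerator_type": accelerator_type,
        "accelerator_count": accelerator_count,
    }


def deploy_with_gpu(model_resource_name, project, location):
    """Deploy an already uploaded model to an endpoint with a T4 attached.

    Assumes google-cloud-aiplatform is installed and credentials are configured.
    """
    # Imported lazily so the kwargs helper can be used without the SDK.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    model = aiplatform.Model(model_resource_name)
    # Without accelerator_type and accelerator_count, the node has no GPU.
    return model.deploy(**build_deploy_kwargs())
```

If the deployment was created through the console or `gcloud` instead, the equivalent accelerator type and count settings would need to be checked there.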

gericdong commented 1 month ago

Thanks for reporting the issue. For general assistance, please contact support or use https://cloud.google.com/support/docs/issue-trackers.