huggingface / text-generation-inference

Large Language Model Text Generation Inference
http://hf.co/docs/text-generation-inference
Apache License 2.0

Docker container for version 2.3.0 CUDA detection broken #2542

Open JoeGonzalez0886 opened 1 week ago

JoeGonzalez0886 commented 1 week ago

System Info

Running this container across multiple services produces an issue with CUDA GPU detection: no GPUs are detected.

Reverting to the container tagged :2.2.0 fixes the issue.

I thought I would post this in case others are using 2.3.0 in production. We had an automated scaling process instantiate the new container with the :latest tag, and it brought down our production systems.

Please take a look at this issue, team.

Thank you.

Information

Tasks

Reproduction

  1. Pull the latest 2.3.0 Docker image.
  2. Run with any LLM.
  3. It will fail to find the GPU (a minimal invocation is sketched below).
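For reference, a run command along the lines of the standard TGI Docker invocation should reproduce it; the model ID, port, and volume path here are placeholders, not details from the original report:

```shell
# Explicitly use the 2.3.0 tag (placeholder model ID; adjust to your setup)
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v $PWD/data:/data \
    ghcr.io/huggingface/text-generation-inference:2.3.0 \
    --model-id meta-llama/Llama-3.1-8B-Instruct
# With 2.3.0 the container fails to find the GPU; pinning :2.2.0 works.
```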

Expected behavior

We would expect this version to automatically detect the local CUDA GPUs, as 2.2.0 does.
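A quick sanity check, independent of TGI itself, is to run nvidia-smi through the same image; this assumes the NVIDIA Container Toolkit is installed on the host and mounts nvidia-smi into the container:

```shell
# Does the 2.3.0 image see the GPU at all when run with --gpus all?
docker run --rm --gpus all \
    --entrypoint nvidia-smi \
    ghcr.io/huggingface/text-generation-inference:2.3.0
```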

antonpolishko commented 1 week ago

We ran into the same issue yesterday with our Docker launch scripts, which use the latest image tag. It looks like latest is pointing to the 2.3.0-rocm tag instead of 2.3.0.

Using a version-based tag addressed the issue.
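For anyone who wants to verify this locally, comparing image digests is one way to see where :latest points; this is a generic diagnostic sketch, not a command taken from the thread:

```shell
# Compare digests to see which build the :latest tag resolves to
docker pull ghcr.io/huggingface/text-generation-inference:latest
docker pull ghcr.io/huggingface/text-generation-inference:2.3.0-rocm
docker images --digests ghcr.io/huggingface/text-generation-inference
# If :latest and :2.3.0-rocm share a digest, pin an explicit version
# tag (e.g. :2.3.0 or :2.2.0) in launch/scaling scripts as a workaround.
```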