google / gemma_pytorch

The official PyTorch implementation of Google's Gemma models
https://ai.google.dev/gemma
Apache License 2.0

Error when running Gemma inference on GPU #47

Closed LarryHawkingYoung closed 5 hours ago

LarryHawkingYoung commented 6 months ago

When I run

docker run -t --rm \
    --gpus all \
    -v ${CKPT_PATH}:/tmp/ckpt \
    ${DOCKER_URI} \
    python scripts/run.py \
    --device=cuda \
    --ckpt=/tmp/ckpt \
    --variant="${VARIANT}" \
    --prompt="${PROMPT}"

It returns the error: docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

whereas if I run on CPU with the command:

docker run -t --rm \
    -v ${CKPT_PATH}:/tmp/ckpt \
    ${DOCKER_URI} \
    python scripts/run.py \
    --ckpt=/tmp/ckpt \
    --variant="${VARIANT}" \
    --prompt="${PROMPT}"

it works fine.
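For context, this particular Docker error is raised before the container even starts: it usually means the host has no runtime capable of exposing GPUs to containers, most often because the NVIDIA Container Toolkit is not installed or not registered with Docker. A minimal sketch of the usual fix, assuming a Debian/Ubuntu host where the NVIDIA driver and the NVIDIA apt repository are already set up:

# Install the NVIDIA Container Toolkit
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit

# Register the NVIDIA runtime with Docker and restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Sanity check: this should print the GPU table from inside a container
# (the CUDA image tag here is illustrative)
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

If the sanity check fails, the problem is host setup rather than anything in gemma_pytorch.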

pengchongjin commented 6 months ago

Which model variant and which GPU did you use?

One guess is that you may be running out of GPU memory if you are trying to run the unquantized 7B model on a 16GB GPU. You can try either the quantized 7B model or a 2B model instead; either should work.
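For reference, a minimal sketch of that workaround, reusing the docker command from the original report with the smaller variant; the exact VARIANT value is an assumption, and the repository README is the authority on how to select the quantized 7B checkpoint:

# Rerun on GPU with the 2B variant, which fits comfortably in 16GB
VARIANT=2b
docker run -t --rm \
    --gpus all \
    -v ${CKPT_PATH}:/tmp/ckpt \
    ${DOCKER_URI} \
    python scripts/run.py \
    --device=cuda \
    --ckpt=/tmp/ckpt \
    --variant="${VARIANT}" \
    --prompt="${PROMPT}"

Watching nvidia-smi on the host while the container runs will confirm whether GPU memory was the actual bottleneck.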

Gopi-Uppari commented 1 week ago

Hi @LarryHawkingYoung,

Could you please confirm whether the above comment resolved the issue for you? If it has, please feel free to close the issue.

Thank you.

Gopi-Uppari commented 5 hours ago

Hi @LarryHawkingYoung,

Closing this issue due to lack of recent activity. Please feel free to reopen it if this is still a valid request.

Thank you!