microsoft / DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Apache License 2.0
1.91k stars, 175 forks

Deepspeed mii library issues #530

Closed gayatripadmani closed 3 weeks ago

gayatripadmani commented 1 month ago

I tried the DeepSpeed-MII library to create a pipeline in Jupyter on a GPU with CUDA compute capability 8.0+, but it gives me this error:

OutOfMemoryError: CUDA out of memory. Tried to allocate 19.75 GiB. GPU 0 has a total capacity of 22.03 GiB of which 19.51 GiB is free. Including non-PyTorch memory, this process has 2.52 GiB memory in use. Of the allocated memory 1.35 GiB is allocated by PyTorch, and 5.34 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (CUDA semantics — PyTorch 2.4 documentation)

If anyone knows how to solve this error, please help me.
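The figures in the error message above can be checked with a little arithmetic. The following is a minimal sketch, not part of DeepSpeed-MII or PyTorch: `parse_oom` is a hypothetical helper, and its regular expressions assume the message wording shown in this issue (sizes in GiB or MiB only).

```python
import re

# Relevant sentences copied from the error message in this issue.
ERROR_TEXT = (
    "CUDA out of memory. Tried to allocate 19.75 GiB. "
    "GPU 0 has a total capacity of 22.03 GiB of which 19.51 GiB is free. "
    "Of the allocated memory 1.35 GiB is allocated by PyTorch, "
    "and 5.34 MiB is reserved by PyTorch but unallocated."
)


def parse_oom(text):
    """Hypothetical helper: pull a few sizes out of the message, in GiB."""

    def grab(pattern):
        value, unit = re.search(pattern, text).groups()
        # Convert MiB to GiB so every value uses the same unit.
        return float(value) / (1024 if unit == "MiB" else 1)

    return {
        "requested": grab(r"Tried to allocate ([\d.]+) (GiB|MiB)"),
        "free": grab(r"of which ([\d.]+) (GiB|MiB) is free"),
        "reserved_unallocated": grab(r"and ([\d.]+) (GiB|MiB) is reserved"),
    }


stats = parse_oom(ERROR_TEXT)
shortfall = stats["requested"] - stats["free"]
print(f"requested {stats['requested']:.2f} GiB, free {stats['free']:.2f} GiB")
print(f"shortfall: {shortfall:.2f} GiB")
print(f"reserved but unallocated: {stats['reserved_unallocated'] * 1024:.2f} MiB")
```

The single requested allocation is larger than the free memory by roughly a quarter of a GiB, while only a few MiB are reserved but unallocated. On those numbers, the `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` hint in the message seems unlikely to help on its own, since there is very little fragmented memory to reclaim.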

loadams commented 1 month ago

Hi @gayatripadmani - you're running out of memory on your device. Can you share which model you are using? Or can you try a smaller model, or more DeepSpeed optimizations (which ZeRO stage are you running with in your ds_config)?
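To illustrate the "smaller model" suggestion, here is a rough back-of-the-envelope sketch. It is not a DeepSpeed-MII API: `weight_gib` and `fits` are hypothetical helpers, the 7B and 13B parameter counts are illustrative only (the issue does not say which model was used), and the 2 GiB headroom for the KV cache, activations, and CUDA context is an arbitrary assumption.

```python
GIB = 1024 ** 3


def weight_gib(num_params, bytes_per_param):
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


def fits(num_params, bytes_per_param, free_gib, headroom_gib=2.0):
    """True if the weights plus an assumed headroom fit in free GPU memory.

    headroom_gib is a guess; real usage depends on batch size, sequence
    length, and the inference engine.
    """
    return weight_gib(num_params, bytes_per_param) + headroom_gib <= free_gib


FREE_GIB = 19.51  # free memory reported in the error message

# Illustrative model sizes and dtypes, not taken from the issue.
for name, params in [("7B", 7e9), ("13B", 13e9)]:
    for dtype, nbytes in [("fp16", 2), ("fp32", 4)]:
        size = weight_gib(params, nbytes)
        verdict = "fits" if fits(params, nbytes, FREE_GIB) else "does not fit"
        print(f"{name} in {dtype}: ~{size:.1f} GiB of weights, {verdict}")
```

Under these assumptions, a 7B-parameter model in fp16 would fit in the reported free memory, while a 13B-parameter model in fp16, or a 7B-parameter model in fp32, would not. This is only a weights-based estimate, so actual headroom requirements may be higher.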

loadams commented 3 weeks ago

Hi @gayatripadmani - I'm going to close this for being stale. Apologies for being slow to reply - but please comment if you need us to re-open this.