Open frankxyy opened 1 year ago
command 1:
python -m flexgen.flex_opt --model facebook/opt-30b --path _DUMMY_ --prompt-len 20 --gen-len 15 --percent 25 75 60 40 0 100 --gpu-batch-size 1 --num-gpu-batches 2 --cpu-cache-compute --debug fewer_batch
peak gpu mem: 6.0679 GB

command 2:
python -m flexgen.flex_opt --model facebook/opt-30b --path _DUMMY_ --prompt-len 20 --gen-len 15 --percent 30 70 60 40 0 100 --gpu-batch-size 1 --num-gpu-batches 2 --cpu-cache-compute --debug fewer_batch
gpu oom

The only difference between command 2 and command 1 is that the percentage of weights placed on the GPU increases from 25% to 30%.
The capacity of my GPU is 24 GB.
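
For reference, here is a rough back-of-the-envelope sketch of the nominal GPU weight footprint for the two settings. It assumes the six `--percent` values are, in order, weight on GPU, weight on CPU, cache on GPU, cache on CPU, activations on GPU, activations on CPU. It also assumes about 30e9 parameters stored in fp16 (2 bytes each). Both numbers are assumptions, and the helper names are hypothetical, not part of FlexGen. The result may not match what FlexGen actually allocates: the 6.0679 GB peak reported for command 1 is already well below this nominal figure.

```python
# Rough estimate only. PARAMS and BYTES_PER_PARAM are assumptions,
# not values read from FlexGen or from the model checkpoint.
PARAMS = 30e9          # assumed nominal parameter count for opt-30b
BYTES_PER_PARAM = 2    # assumed fp16 weights
GIB = 1024 ** 3


def parse_percent(values):
    """Split the six --percent values into named (gpu, cpu) pairs."""
    if len(values) != 6:
        raise ValueError("--percent expects exactly six numbers")
    names = ("weight", "cache", "activation")
    return {name: (values[2 * i], values[2 * i + 1]) for i, name in enumerate(names)}


def gpu_weight_gib(values):
    """Nominal size in GiB of the weight share requested on the GPU."""
    weight_gpu_percent, _ = parse_percent(values)["weight"]
    return PARAMS * BYTES_PER_PARAM * weight_gpu_percent / 100 / GIB


if __name__ == "__main__":
    commands = {
        "command 1": [25, 75, 60, 40, 0, 100],
        "command 2": [30, 70, 60, 40, 0, 100],
    }
    for label, percent in commands.items():
        print(f"{label}: ~{gpu_weight_gib(percent):.2f} GiB of weights on GPU (nominal)")
```

Under these assumptions, both settings nominally fit within 24 GB for weights alone, so the estimate by itself does not account for the OOM in command 2.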