Open chardog opened 1 month ago
While loading Mixtral I get "AssertionError: Insufficient space in device allocation".
The command I used: "python ericLLM.py --model ./models/mistralai_Mixtral-8x7B-Instruct-v0.1 --gpu_split 24,24,24,24,24 --max_prompts 8 --num_workers 1 --gpu_balance"
After removing --gpu_balance, it loads layers across 4 of the 5 24 GB GPUs, but then I get a different error:
"AttributeError: 'list' object has no attribute 'current_seq_len'"
It also doesn't appear to support quantization. Is that right?
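
For reference, here is a rough back-of-the-envelope check of whether the unquantized weights should fit in the split above. It is only a sketch under stated assumptions: the parameter count (about 46.7B total for Mixtral 8x7B) and 2 bytes per weight (fp16) are my own inputs, not values reported by ericLLM, and the helper name weights_gib is made up for illustration. It counts weights only and ignores the cache and activations, so it does not explain either error by itself.

```python
# Rough estimate of weight memory vs. the --gpu_split budget.
# Assumptions (not from ericLLM): ~46.7e9 total parameters, fp16 (2 bytes each).

def weights_gib(n_params: float, bytes_per_param: float) -> float:
    """Return the approximate size of the model weights in GiB."""
    return n_params * bytes_per_param / 2**30

MIXTRAL_PARAMS = 46.7e9               # assumed total parameter count
gpu_split_gib = [24, 24, 24, 24, 24]  # values passed to --gpu_split

need = weights_gib(MIXTRAL_PARAMS, 2)  # unquantized fp16 weights
total = sum(gpu_split_gib)             # nominal budget across all GPUs

print(f"weights ~{need:.1f} GiB, split total {total} GiB, "
      f"headroom ~{total - need:.1f} GiB")
```

Whatever headroom is left after the weights still has to cover the cache for --max_prompts 8 plus any per-device overhead, which this estimate leaves out.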