NamburiSrinath opened 2 months ago
Update --- The error comes from initializing torch.zeros(); the traceback is below.
Traceback (most recent call last):
File "/home/ubuntu/Compress_Align/wanda/main.py", line 113, in <module>
main()
File "/home/ubuntu/Compress_Align/wanda/main.py", line 73, in main
prune_sparsegpt(args, model, tokenizer, device, prune_n=prune_n, prune_m=prune_m)
File "/home/ubuntu/anaconda3/envs/compress_align/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/Compress_Align/wanda/lib/prune.py", line 230, in prune_sparsegpt
inps = torch.zeros(
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 GiB. GPU 1 has a total capacity of 21.99 GiB of which 16.77 GiB is free. Including non-PyTorch memory, this process has 5.21 GiB memory in use. Of the allocated memory 4.88 GiB is allocated by PyTorch, and 89.82 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
Upon debugging further, here are the values from print(args.nsamples, model.seqlen, model.config.hidden_size):
Mistral-7B: 128, 32768, 4096
Llama-2-7B: 128, 4096, 4096
So the problem is that Mistral's sequence length (32768) is far larger than Llama-2's (4096), which makes the calibration tensor too large to allocate.
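The 32.00 GiB in the traceback can be reproduced from these values. A quick sketch (the helper name is mine, not from the repo) that computes the size of the fp16 calibration buffer that torch.zeros would allocate:

```python
import torch

def calibration_buffer_gib(nsamples, seqlen, hidden_size, dtype=torch.float16):
    """GiB needed for a (nsamples, seqlen, hidden_size) buffer of the given dtype."""
    bytes_per_elem = torch.tensor([], dtype=dtype).element_size()  # 2 for fp16
    return nsamples * seqlen * hidden_size * bytes_per_elem / 2**30

# Mistral-7B: 128 samples x 32768 tokens x 4096 hidden dims
print(calibration_buffer_gib(128, 32768, 4096))  # -> 32.0 GiB (matches the OOM)

# Llama-2-7B: same samples, but seqlen 4096
print(calibration_buffer_gib(128, 4096, 4096))   # -> 4.0 GiB (fits on the GPU)
```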
Are there any suggestions to overcome this error?
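One possible workaround, assuming the pruning code reads the sequence length from the model.seqlen attribute (as the traceback suggests): cap it before calling prune_sparsegpt. This is a hedged sketch, not the repo's documented fix, and a shorter calibration length may affect pruning quality; the stand-in object below only illustrates the attribute change:

```python
from types import SimpleNamespace

# Stand-in for the HF model object used in wanda's main.py (hypothetical).
model = SimpleNamespace(seqlen=32768)

# Cap the calibration sequence length; 4096 matches Llama-2-7B, which
# prunes on the same GPU without running out of memory.
model.seqlen = min(model.seqlen, 4096)
print(model.seqlen)  # -> 4096
```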
P.S.: I think this issue is similar to #51, i.e. support for Mistral models.
Hi,
I am trying to prune Mistral-7B (https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2). While I was able to run the magnitude pruning commands successfully, I am facing issues with SparseGPT and Wanda.
Commands used:
python main.py --model 'mistralai/Mistral-7B-Instruct-v0.2' --prune_method sparsegpt --sparsity_ratio 0.1 --sparsity_type unstructured --save out/mistral_7b/unstructured/sparsegpt/0.1/ --save_model out/mistral_7b/unstructured/sparsegpt/0.1/
python main.py --model 'mistralai/Mistral-7B-Instruct-v0.2' --prune_method wanda --sparsity_ratio 0.1 --sparsity_type unstructured --save out/mistral_7b/unstructured/wanda/0.1/ --save_model out/mistral_7b/unstructured/wanda/0.1/
Any help here would be greatly appreciated :) Tagging authors: @liuzhuang13, @Eric-mingjie and @eltociear