kast424 opened 6 months ago
It seems you are using the Mistral model. For a model with a different definition file (https://github.com/huggingface/transformers/blob/v4.40.0/src/transformers/models/mistral/modeling_mistral.py), the codebase needs to be adapted.
I could get magnitude pruning to work on the same Mistral model.
Same here. So what should we change to accommodate models with different structures?
Hi @Shinning-Zhou and @kast424,
Did you figure out how to work around this issue? I am facing a similar error when working with llama-2-7b-chat.
(#67)
I removed the attention_mask variable because LLaMA doesn't have it (it comes back as None).
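For anyone hitting the same thing: the workaround above boils down to never calling `.to()` on a value that can be None. A minimal sketch of a None-safe helper (hypothetical, not part of the repo's actual code):

```python
def maybe_to(tensor, dev):
    # attention_mask (and sometimes position_ids) can come back as None
    # for Mistral/LLaMA, so guard before calling .to(dev).
    return tensor.to(dev) if tensor is not None else None
```

Wrapping each tensor move with something like this avoids the AttributeError without touching the rest of the pruning loop.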
Thanks, that resolved the issue :)
Hi @Shinning-Zhou, @kast424
I am trying to run this repo on Mistral but am facing a different error. Is there a fix for this yet, i.e. can we prune Mistral with SparseGPT and Wanda?
Here's the issue in more detail - #68
I am trying to prune with

python main.py \
    --model mistralai/Mistral-7B-Instruct-v0.2 \
    --prune_method wanda \
    --sparsity_ratio 0.5 \
    --sparsity_type unstructured \
    --save out/mistral_7b/unstructured/wanda/

and the output is as below.

torch 2.3.0
transformers 4.41.0.dev0
accelerate 0.31.0.dev0
# of gpus: 2
loading llm model mistralai/Mistral-7B-Instruct-v0.2
Loading checkpoint shards: ... (progress-bar output trimmed)
use device cuda:0
pruning starts
loading calibdation data
dataset loading complete
Traceback (most recent call last):
  File "/mnt/parscratch/users/acq22stk/teamproject/wanda/main.py", line 110, in <module>
    main()
  File "/mnt/parscratch/users/acq22stk/teamproject/wanda/main.py", line 69, in main
    prune_wanda(args, model, tokenizer, device, prune_n=prune_n, prune_m=prune_m)
  File "/mnt/parscratch/users/acq22stk/teamproject/wanda/lib/prune.py", line 144, in prune_wanda
    inps, outs, attention_mask, position_ids = inps.to(dev), outs.to(dev), attention_mask.to(dev), position_ids.to(dev)
AttributeError: 'NoneType' object has no attribute 'to'
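The traceback shows attention_mask arriving as None at prune.py line 144. A hedged sketch of what "removing the attention_mask variable" amounts to: only include it in the layer-forward kwargs when it is not None (function and argument names here are illustrative, not the repo's actual code):

```python
def layer_kwargs(attention_mask, position_ids):
    # Build kwargs for a decoder-layer forward pass, omitting
    # attention_mask entirely when the model returned None for it.
    kwargs = {"position_ids": position_ids}
    if attention_mask is not None:
        kwargs["attention_mask"] = attention_mask
    return kwargs
```

Calling the layer as `layer(hidden_states, **layer_kwargs(attention_mask, position_ids))` then works whether or not the model produced a mask.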