ModelTC / llmc

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
https://arxiv.org/abs/2405.06001
Apache License 2.0

Fail to run AWQ on Qwen2-7B #158

[Open] Muuut opened this issue 4 hours ago

Muuut commented 4 hours ago

My config is as follows:

base:
    seed: &seed 42
model:
    type: Qwen2
    path: /home/LLMCompression/model/Qwen2-7B # model path
    tokenizer_mode: slow
    torch_dtype: auto
calib:
    name: pileval
    download: False
    path: /home/LLMCompression/dataset/calib_datasets/pileval # calib data path
    n_samples: 128
    bs: -1
    seq_len: 512
    preproc: pileval_awq
    seed: *seed
eval:
    eval_pos: [pretrain, transformed, fake_quant]
    name: wikitext2
    download: False
    path: /home/LLMCompression/dataset/eval_datasets/wikitext2 # eval data path
    seq_len: 2048
    # For 7B / 13B model eval, bs can be set to "1", and inference_per_block can be set to "False".
    # For 70B model eval, bs can be set to "20", and inference_per_block can be set to "True".
    bs: 1
    inference_per_block: False
    # Check token consistency between the original and fake-quantized model outputs.
    eval_token_consist: True
quant:
    method: Awq
    weight:
        bit: 8
        symmetric: True
        granularity: per_channel
        group_size: -1
    act:
        bit: 8
        symmetric: True
        granularity: per_token
    special:
        trans: True
        # The options for "trans_version" include "v1" and "v2".
        trans_version: v2
        weight_clip: True
        clip_sym: True
save:
    save_trans: False
    save_fake: False
    save_path: /home/LLMCompression/model/save/Qwen2-7B-AWQ-w8a8
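
For context, a config like this is run through torchrun against llmc's entry point. Only the torchrun launch and the entry-point path are confirmed by the traceback below; the `--config` flag and the config filename here are assumptions for illustration:

torchrun --nnodes 1 --nproc_per_node 1 \
    /home/LLMCompression/llmc-main/llmc/__main__.py \
    --config qwen2_awq_w8a8.yml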

I get this error:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/LLMCompression/llmc-main/llmc/__main__.py", line 271, in <module>
[rank0]:     main(config)
[rank0]:   File "/home/LLMCompression/llmc-main/llmc/__main__.py", line 27, in main
[rank0]:     model = MODEL_REGISTRY[config.model.type](
[rank0]:   File "/home/LLMCompression/llmc-main/llmc/models/qwen2.py", line 9, in __init__
[rank0]:     super().__init__(model_path, torch_dtype, device_map, use_cache)
[rank0]:   File "/home/LLMCompression/llmc-main/llmc/models/base_model.py", line 30, in __init__
[rank0]:     self.find_embed_layers()
[rank0]:   File "/home/LLMCompression/llmc-main/llmc/models/qwen2.py", line 16, in find_embed_layers
[rank0]:     self.rotary_emb = self.model.model.rotary_emb
[rank0]:   File "/root/miniconda3/envs/torch231/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1709, in __getattr__
[rank0]:     raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
[rank0]: AttributeError: 'Qwen2Model' object has no attribute 'rotary_emb'
E1025 11:15:26.387000 140316599792832 torch/distributed/elastic/multiprocessing/api.py:826] failed (exitcode: 1) local_rank: 0 (pid: 2398) of binary: /root/miniconda3/envs/torch231/bin/python
Traceback (most recent call last):
  File "/root/miniconda3/envs/torch231/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/root/miniconda3/envs/torch231/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
  File "/root/miniconda3/envs/torch231/lib/python3.10/site-packages/torch/distributed/run.py", line 879, in main
    run(args)
  File "/root/miniconda3/envs/torch231/lib/python3.10/site-packages/torch/distributed/run.py", line 870, in run
    elastic_launch(
  File "/root/miniconda3/envs/torch231/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/miniconda3/envs/torch231/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 263, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
/home/LLMCompression/llmc-main/llmc/__main__.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-10-25_11:15:26
  host      : 11ea6d23ac9f
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 2398)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

There is also #139. I have updated the repository, but I still get this error.
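
The traceback shows `find_embed_layers` in llmc's Qwen2 wrapper reading a model-level `rotary_emb`, an attribute that newer transformers releases attach to `Qwen2Model` but older ones do not (the exact version cutoff is not stated in the issue). A minimal check of what the installed environment actually exposes, with the model path taken from the config above:

import transformers
from transformers import AutoModelForCausalLM

print(transformers.__version__)
model = AutoModelForCausalLM.from_pretrained(
    '/home/LLMCompression/model/Qwen2-7B', torch_dtype='auto')
# False here reproduces the AttributeError from the traceback;
# True means this transformers version gives Qwen2Model a shared
# model-level rotary embedding, as llmc's qwen2.py expects.
print(hasattr(model.model, 'rotary_emb'))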

Harahan commented 4 hours ago

PR #139 was for LLaMA 3.2 and introduced some bugs for Qwen2. As a quick workaround, you can use the version from before that PR. We will fix this later.
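
Until then, a minimal sketch of a guarded lookup in llmc/models/qwen2.py is shown below. This is an illustration, not the official fix, and it assumes older transformers versions still keep a per-layer `rotary_emb` on each attention module:

def find_embed_layers(self):
    # ... rest of the method unchanged ...
    if hasattr(self.model.model, 'rotary_emb'):
        # newer transformers: shared, model-level rotary embedding
        self.rotary_emb = self.model.model.rotary_emb
    else:
        # older transformers: borrow the first layer's rotary embedding
        # (assumption: each Qwen2 attention layer carries its own)
        self.rotary_emb = self.model.model.layers[0].self_attn.rotary_emb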