intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

inference error: mistral and codellama have issue 'object has no attribute '_has_non_default_generation_parameters' #11415

Open raj-ritu17 opened 2 weeks ago

raj-ritu17 commented 2 weeks ago

GPU: 2 Arc cards

Running the following example, inference-ipex-llm, for Mistral and CodeLlama (it works for Llama2):

My guessed rank = 1
My guessed rank = 0
2024-06-24 11:32:19,965 - INFO - intel_extension_for_pytorch auto imported
2024-06-24 11:32:19,965 - INFO - intel_extension_for_pytorch auto imported
/xxxx/xxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/xxx/xxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00,  1.12it/s]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00,  1.10it/s]
2024-06-24 11:32:22,839 - INFO - Converting the current model to sym_int4 format......
2024-06-24 11:32:22,907 - INFO - Converting the current model to sym_int4 format......
2024:06:24-11:32:29:(120347) |CCL_WARN| did not find MPI-launcher specific variables, switch to ATL/OFI, to force enable ATL/MPI set CCL_ATL_TRANSPORT=mpi
2024:06:24-11:32:29:(120347) |CCL_WARN| could not get local_idx/count from environment variables, trying to get them from ATL
2024:06:24-11:32:29:(120347) |CCL_WARN| sockets exchange mode is set. It may cause potential problem of 'Too many open file descriptors'
2024:06:24-11:32:29:(120346) |CCL_WARN| did not find MPI-launcher specific variables, switch to ATL/OFI, to force enable ATL/MPI set CCL_ATL_TRANSPORT=mpi
2024:06:24-11:32:29:(120346) |CCL_WARN| could not get local_idx/count from environment variables, trying to get them from ATL
2024:06:24-11:32:29:(120346) |CCL_WARN| sockets exchange mode is set. It may cause potential problem of 'Too many open file descriptors'
Traceback (most recent call last):
  File "/home/rajritu/ritu/ipex-llm/python/llm/example/GPU/Pipeline-Parallel-Inference/generate.py", line 68, in <module>
    output = model.generate(input_ids,
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/xxx/xxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/xxx/xxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/lookup.py", line 88, in generate
    return original_generate(self,
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/xxx/xxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/xxxx/xxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/speculative.py", line 109, in generate
    return original_generate(self,
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/xxxx/xxxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/xxxx/xxxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/pipeline_parallel.py", line 163, in generate
    and self.config._has_non_default_generation_parameters()
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/xxxx/xxxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/transformers/configuration_utils.py", line 265, in __getattribute__
    return super().__getattribute__(key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'MistralConfig' object has no attribute '_has_non_default_generation_parameters'
Traceback (most recent call last):
  File "/xxxx/ipex-llm/python/llm/example/GPU/Pipeline-Parallel-Inference/generate.py", line 68, in <module>
    output = model.generate(input_ids,
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rajritu/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/lookup.py", line 88, in generate
    return original_generate(self,
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/speculative.py", line 109, in generate
    return original_generate(self,
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/pipeline_parallel.py", line 163, in generate
    and self.config._has_non_default_generation_parameters()
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/transformers/configuration_utils.py", line 265, in __getattribute__
    return super().__getattribute__(key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'MistralConfig' object has no attribute '_has_non_default_generation_parameters'
[2024-06-24 11:32:32,587] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 120346) of binary: /home/rajritu/miniforge3/envs/envIPEX_LLM_INF/bin/python3.11
Traceback (most recent call last):
  File "/miniforge3/envs/envIPEX_LLM_INF/bin/torchrun", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
generate.py FAILED
------------------------------------------------------------
Failures:
[1]:
  time      : 2024-06-24_11:32:32
  host      : imu-nex-sprx3-ws
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 120347)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-06-24_11:32:32
  host      : imu-nex-sprx3-ws
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 120346)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
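For context, the failing call at `pipeline_parallel.py:163` invokes the private transformers helper `_has_non_default_generation_parameters` on the model config, but that helper is not present on every transformers release, which is what raises the `AttributeError`. A hedged sketch of a version-tolerant fallback (the wrapper name is hypothetical, not ipex-llm's actual fix):

```python
# Hypothetical sketch: wrap the private transformers helper so that
# releases lacking `_has_non_default_generation_parameters` fall back
# to assuming default generation parameters instead of crashing.

def has_non_default_generation_parameters(config) -> bool:
    """Return the helper's result when available, else assume defaults."""
    helper = getattr(config, "_has_non_default_generation_parameters", None)
    if callable(helper):
        return helper()
    return False
```

With a guard like this, configs from transformers versions that lack the helper (as in the traceback above) would simply report `False` rather than abort the whole pipeline-parallel run.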

Same for CodeLlama:

......
.....
AttributeError: 'LlamaConfig' object has no attribute '_has_non_default_generation_parameters'
Traceback (most recent call last):
.....
.....
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
generate.py FAILED
------------------------------------------------------------
Failures:
[1]:
  time      : 2024-06-24_11:34:04
  host      : imu-nex-sprx3-ws
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 120992)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-06-24_11:34:04
  host      : imu-nex-sprx3-ws
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 120991)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
sgwhat commented 2 weeks ago

You may switch to transformers==4.37.0 and try again.
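Concretely (assuming a pip-managed conda environment like the one in the logs), pinning the suggested version would look like:

```shell
# Downgrade transformers to the release suggested above; the private
# helper used by ipex-llm's pipeline-parallel path is version-sensitive.
pip install transformers==4.37.0
```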