Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.
inference error: mistral and codellama fail with "object has no attribute '_has_non_default_generation_parameters'" #11415
Running the following ipex-llm example, inference-ipex-llm (GPU/Pipeline-Parallel-Inference), for Mistral and CodeLlama; the same setup works for Llama 2:
My guessed rank = 1
My guessed rank = 0
2024-06-24 11:32:19,965 - INFO - intel_extension_for_pytorch auto imported
2024-06-24 11:32:19,965 - INFO - intel_extension_for_pytorch auto imported
/xxxx/xxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/xxx/xxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00, 1.12it/s]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00, 1.10it/s]
2024-06-24 11:32:22,839 - INFO - Converting the current model to sym_int4 format......
2024-06-24 11:32:22,907 - INFO - Converting the current model to sym_int4 format......
2024:06:24-11:32:29:(120347) |CCL_WARN| did not find MPI-launcher specific variables, switch to ATL/OFI, to force enable ATL/MPI set CCL_ATL_TRANSPORT=mpi
2024:06:24-11:32:29:(120347) |CCL_WARN| could not get local_idx/count from environment variables, trying to get them from ATL
2024:06:24-11:32:29:(120347) |CCL_WARN| sockets exchange mode is set. It may cause potential problem of 'Too many open file descriptors'
2024:06:24-11:32:29:(120346) |CCL_WARN| did not find MPI-launcher specific variables, switch to ATL/OFI, to force enable ATL/MPI set CCL_ATL_TRANSPORT=mpi
2024:06:24-11:32:29:(120346) |CCL_WARN| could not get local_idx/count from environment variables, trying to get them from ATL
2024:06:24-11:32:29:(120346) |CCL_WARN| sockets exchange mode is set. It may cause potential problem of 'Too many open file descriptors'
Traceback (most recent call last):
File "/home/rajritu/ritu/ipex-llm/python/llm/example/GPU/Pipeline-Parallel-Inference/generate.py", line 68, in <module>
output = model.generate(input_ids,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/xxx/xxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/xxx/xxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/lookup.py", line 88, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "/xxx/xxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/xxxx/xxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/speculative.py", line 109, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "/xxxx/xxxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/xxxx/xxxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/pipeline_parallel.py", line 163, in generate
and self.config._has_non_default_generation_parameters()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/xxxx/xxxxx/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/transformers/configuration_utils.py", line 265, in __getattribute__
return super().__getattribute__(key)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'MistralConfig' object has no attribute '_has_non_default_generation_parameters'
Traceback (most recent call last):
File "/xxxx/ipex-llm/python/llm/example/GPU/Pipeline-Parallel-Inference/generate.py", line 68, in <module>
output = model.generate(input_ids,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rajritu/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/lookup.py", line 88, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/speculative.py", line 109, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/ipex_llm/transformers/pipeline_parallel.py", line 163, in generate
and self.config._has_non_default_generation_parameters()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/transformers/configuration_utils.py", line 265, in __getattribute__
return super().__getattribute__(key)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'MistralConfig' object has no attribute '_has_non_default_generation_parameters'
[2024-06-24 11:32:32,587] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 120346) of binary: /home/rajritu/miniforge3/envs/envIPEX_LLM_INF/bin/python3.11
Traceback (most recent call last):
File "/miniforge3/envs/envIPEX_LLM_INF/bin/torchrun", line 8, in <module>
sys.exit(main())
^^^^^^
File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/distributed/run.py", line 806, in main
run(args)
File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/distributed/run.py", line 797, in run
elastic_launch(
File "miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniforge3/envs/envIPEX_LLM_INF/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
generate.py FAILED
------------------------------------------------------------
Failures:
[1]:
time : 2024-06-24_11:32:32
host : imu-nex-sprx3-ws
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 120347)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2024-06-24_11:32:32
host : imu-nex-sprx3-ws
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 120346)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
GPU: 2 Intel Arc cards
The same error occurs when running the example with CodeLlama.
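For context, the failing call in the traceback is ipex_llm/transformers/pipeline_parallel.py invoking the private transformers helper PretrainedConfig._has_non_default_generation_parameters(), which exists only in some transformers releases. A mismatch between the transformers version the example expects and the one installed in the environment can therefore produce exactly this AttributeError. Below is a minimal diagnostic-plus-shim sketch, not an official fix: it reports the installed transformers version and, if the helper is missing, patches in a conservative no-op fallback so generate() can proceed. The fallback's behaviour (reporting no overridden generation parameters) is an assumption and should be verified against the transformers release pinned by the example's README.

```python
# Hedged diagnostic + workaround sketch, not an official ipex-llm fix.
# The traceback shows pipeline_parallel.py calling the private helper
# PretrainedConfig._has_non_default_generation_parameters(), which is
# only present in some transformers releases.
import transformers
from transformers.configuration_utils import PretrainedConfig

print("transformers version:", transformers.__version__)
print("helper present:",
      hasattr(PretrainedConfig, "_has_non_default_generation_parameters"))

if not hasattr(PretrainedConfig, "_has_non_default_generation_parameters"):
    # ASSUMPTION: returning False ("no non-default generation parameters")
    # mirrors the helper's default case and lets generate() continue.
    def _has_non_default_generation_parameters(self) -> bool:
        return False

    PretrainedConfig._has_non_default_generation_parameters = (
        _has_non_default_generation_parameters
    )
```

If this patch is placed before the model.generate() call in generate.py (it must run in each rank launched by torchrun), the AttributeError above should no longer trigger; if the helper is already present and the error persists, the mismatch more likely lies between the installed ipex-llm build and the transformers version.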