huggingface / transformers

ValueError: Architecture deepseek2 not supported #34335

Open czq99972 opened 1 month ago

czq99972 commented 1 month ago

System Info

Transformers currently doesn't support GGUF quantized model files that use the deepseek2 architecture. Can you please advise when this support might be added? @SunMarc @MekkCyber

Who can help?

@SunMarc @MekkCyber

Reproduction

  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1006, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/configuration_utils.py", line 570, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/configuration_utils.py", line 661, in _get_config_dict
    config_dict = load_gguf_checkpoint(resolved_config_file, return_tensors=False)["config"]
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/modeling_gguf_pytorch_utils.py", line 103, in load_gguf_checkpoint
    raise ValueError(f"Architecture {architecture} not supported")
ValueError: Architecture deepseek2 not supported
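
For anyone hitting the same traceback, here is a minimal sketch of a way to detect this failure up front instead of crashing deep inside model loading. It relies only on the error message format shown above (`Architecture {architecture} not supported`). The helper name `unsupported_gguf_architecture` is hypothetical and not part of transformers, and the loader is passed in as an argument so the sketch does not depend on any particular transformers version.

```python
import re

# Message format taken from the traceback above:
#   raise ValueError(f"Architecture {architecture} not supported")
_UNSUPPORTED_ARCH = re.compile(r"Architecture (\S+) not supported")


def unsupported_gguf_architecture(load, *args, **kwargs):
    """Call `load(*args, **kwargs)` and report an unsupported GGUF architecture.

    Hypothetical helper (not a transformers API). Returns the architecture
    name if the loader raises the "Architecture ... not supported" ValueError,
    or None if loading succeeds. Any other exception propagates unchanged.
    """
    try:
        load(*args, **kwargs)
    except ValueError as err:
        match = _UNSUPPORTED_ARCH.fullmatch(str(err))
        if match is None:
            raise  # a different ValueError: do not swallow it
        return match.group(1)
    return None
```

With transformers installed, the loader could be `AutoConfig.from_pretrained`, called as `unsupported_gguf_architecture(AutoConfig.from_pretrained, repo_id, gguf_file=filename)`, where `repo_id` and `filename` are placeholders for your own GGUF repository and file. For the setup in this issue the helper would be expected to return `"deepseek2"`.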

Expected behavior

1

VladOS95-cyber commented 1 month ago

Hey @czq99972, @SunMarc, @MekkCyber! I can take this as soon as I finish the current implementation for the Mamba architecture, which shouldn't take long. I think I will be able to start working on deepseek2 this week. Link to the main issue thread: https://github.com/huggingface/transformers/issues/33260

SunMarc commented 1 month ago

Hey! DeepSeek-V2 GGUF files can be supported once the architecture itself is integrated into transformers: https://github.com/huggingface/transformers/pull/31976

wavy-jung commented 1 week ago

Any update on deepseek v2 support? @VladOS95-cyber

VladOS95-cyber commented 1 week ago

Hey @wavy-jung, I see that the DeepSeek-V2 architecture is not supported yet; PR #31976 is still in progress.