salesforce / CodeT5

Home of CodeT5: Open Code LLMs for Code Understanding and Generation
https://arxiv.org/abs/2305.07922
BSD 3-Clause "New" or "Revised" License

Unrecognized configuration class from AutoModelForCausalLM.from_pretrained #100

Open aseok opened 1 year ago

aseok commented 1 year ago

I tried the ggml conversion instructions for codet5p-220m-py, codet5p-770m-py, and instructcodet5p-16b, and am facing the following errors:

For the 220m and 770m models:

File ".../.local/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 470, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.t5.configuration_t5.T5Config'> for this kind of AutoModel: AutoModelForCausalLM.

For the 16b model:

ValueError: Unrecognized configuration class <class 'transformers_modules.Salesforce.instructcodet5p-16b.70bb08afa3d6f081b347e67752ca8e031a35ac4a.configuration_codet5p.CodeT5pConfig'> for this kind of AutoModel: AutoModelForCausalLM.

Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, CodeGenConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, ElectraConfig, ErnieConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, LlamaConfig, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MvpConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, TransfoXLConfig, TrOCRConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig.

yuewang-cuhk commented 1 year ago

Hi there, since our CodeT5+ is a family of encoder-decoder LLMs, the correct auto class to use is AutoModelForSeq2SeqLM instead of AutoModelForCausalLM.
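A minimal sketch of loading a CodeT5+ checkpoint with the correct auto class (the checkpoint name and prompt here are illustrative; substitute your own):

```python
# Load a CodeT5+ checkpoint with AutoModelForSeq2SeqLM, which accepts T5Config,
# rather than AutoModelForCausalLM, which raises the ValueError above.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "Salesforce/codet5p-220m"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)  # not AutoModelForCausalLM

inputs = tokenizer("def print_hello_world():", return_tensors="pt")
outputs = model.generate(inputs.input_ids, max_length=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For instructcodet5p-16b, whose config class (CodeT5pConfig) ships inside the model repo itself, you likely also need to pass trust_remote_code=True to from_pretrained, as its Hugging Face model card does.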