ndif-team / nnsight

The nnsight package enables interpreting and manipulating the internals of deep learned models.
https://nnsight.net/
MIT License

Unrecognized configuration class when using nnsight to load new LLMs #211

Open ruizheliUOA opened 2 weeks ago

ruizheliUOA commented 2 weeks ago

I am trying to use nnsight to load new LLMs, such as Qwen/Qwen2-VL-7B-Instruct.

qwen2_vl_model = LanguageModel("Qwen/Qwen2-VL-7B-Instruct", device_map="auto", dispatch=True)

nnsight reported the following ValueError: Unrecognized configuration class <class 'transformers.models.qwen2_vl.configuration_qwen2_vl.Qwen2VLConfig'> for this kind of AutoModel: AutoModelForCausalLM. Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, ElectraConfig, ErnieConfig, FalconConfig, FalconMambaConfig, FuyuConfig, GemmaConfig, Gemma2Config, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GraniteConfig, JambaConfig, JetMoeConfig, LlamaConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, OlmoConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig.

Is there a way to load such new LLMs and use nnsight to edit and analyse them?

JadenFiotto-Kaufman commented 2 weeks ago

@ruizheliUOA Try installing transformers directly from github?

ruizheliUOA commented 2 weeks ago

I installed transformers that way, but the same error occurs.

pip install git+https://github.com/huggingface/transformers

JadenFiotto-Kaufman commented 2 weeks ago

@ruizheliUOA Also try this:

qwen2_vl_model = LanguageModel("Qwen/Qwen2-VL-7B-Instruct", device_map="auto", dispatch=True, automodel="AutoModelForSeq2SeqLM")

The base LanguageModel uses AutoModelForCausalLM, but on the HF page https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct you can click "use in transformers" in the top right to see which auto class the model expects. I've never used a model like this personally, so I'm unsure whether the LanguageModel format (tokenization etc.) would work out of the box. If this doesn't work easily, I would either:

1.) Just load the model outside of nnsight using AutoModelForSeq2SeqLM and wrap it in NNsight:

from nnsight import NNsight
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("Qwen/Qwen2-VL-7B-Instruct", device_map="auto")
model = NNsight(model)

Of course you'll have to do any tokenization/preprocessing yourself and pass it to the model.

Or 2.)

Make your own subclass of NNsight that fits whatever type of model this is, and if it seems general enough, I'll add it to nnsight :)

ruizheliUOA commented 2 weeks ago

@JadenFiotto-Kaufman Many thanks for your help!

qwen2_vl_model = LanguageModel("Qwen/Qwen2-VL-7B-Instruct", device_map="auto", dispatch=True, automodel="AutoModelForSeq2SeqLM") works

Looking forward to the second approach being added to nnsight.