X-PLUG / mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Apache License 2.0

Getting TypeError: LlamaDecoderLayer.__init__() takes 2 positional arguments but 3 were given #42

Closed MistWolf27379 closed 7 months ago

MistWolf27379 commented 7 months ago

When I run the inference code:

```python
from docowl_infer import DocOwlInfer

model_path = './mPLUG/DocOwl1.5-chat'
docowl = DocOwlInfer(ckpt_path=model_path, anchors='grid_9', add_global_img=True)
print('load model from ', model_path)
```

I am getting the following traceback:

```
TypeError                                 Traceback (most recent call last)
Cell In[3], line 1
----> 1 docowl = DocOwlInfer(ckpt_path=model_path, anchors='grid_9', add_global_img=True)

File c:\Users\internanirudh\Desktop\DocOwl\mPLUG-DocOwl-main\DocOwl1.5\docowl_infer.py:21, in DocOwlInfer.__init__(self, ckpt_path, anchors, add_global_img, load_8bit, load_4bit)
     19 ic(model_name)
     20 print("DocOwl Infer ")
---> 21 self.tokenizer, self.model, _, _ = load_pretrained_model(ckpt_path, None, model_name, load_8bit=load_8bit, load_4bit=load_4bit, device="cuda")
     22 self.doc_image_processor = DocProcessor(image_size=448, anchors=anchors, add_global_img=add_global_img, add_textual_crop_indicator=True)
     23 self.streamer = TextStreamer(self.tokenizer, skip_prompt=True, skip_special_tokens=True)

File c:\Users\internanirudh\Desktop\DocOwl\mPLUG-DocOwl-main\DocOwl1.5\mplug_docowl\model\builder.py:54, in load_pretrained_model(model_path, model_base, model_name, load_8bit, load_4bit, device_map, device)
     52     tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
     53     print("MPLUGDocOwlLlamaForCausalLM")
---> 54     model = MPLUGDocOwlLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
     55 else:
     56     # Load language model
     57     if model_base is not None:
     58         # PEFT model

File c:\Users\internanirudh\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\modeling_utils.py:3405, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
   3402 with ContextManagers(init_contexts):
   3403     # Let's make sure we don't run the __init__ function of buffer modules
   3404     print("ContexManager")
-> 3405     model = cls(config, *model_args, **model_kwargs)
   3407 # make sure we use the model's config since the __init__ call might have copied it
   3408 config = model.config

File c:\Users\internanirudh\Desktop\DocOwl\mPLUG-DocOwl-main\DocOwl1.5\mplug_docowl\model\modeling_mplug_docowl.py:209, in MPLUGDocOwlLlamaForCausalLM.__init__(self, config)
    207 def __init__(self, config):
    208     super(LlamaForCausalLM, self).__init__(config)
--> 209     self.model = MPLUGDocOwlLlamaModel(config)
    211     self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False)
    213     # Initialize weights and apply final processing

File c:\Users\internanirudh\Desktop\DocOwl\mPLUG-DocOwl-main\DocOwl1.5\mplug_docowl\model\modeling_mplug_docowl.py:201, in MPLUGDocOwlLlamaModel.__init__(self, config)
    200 def __init__(self, config: MPLUGDocOwlConfig):
--> 201     super(MPLUGDocOwlLlamaModel, self).__init__(config)

File c:\Users\internanirudh\Desktop\DocOwl\mPLUG-DocOwl-main\DocOwl1.5\mplug_docowl\model\modeling_mplug_docowl.py:33, in MPLUGDocOwlMetaModel.__init__(self, config)
     32 def __init__(self, config):
---> 33     super(MPLUGDocOwlMetaModel, self).__init__(config)
     34     self.vision_model = MplugOwlVisionModel(
     35         MplugOwlVisionConfig(config.visual_config["visual_model"])
     36     )
     38     self.vision2text = MplugDocOwlHReducerModel(
     39         MplugDocOwlHReducerConfig(config.visual_config["visual_hreducer"]), config.hidden_size
     40     )

File c:\Users\internanirudh\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\llama\modeling_llama.py:926, in LlamaModel.__init__(self, config)
    923 self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx)
    924 print("LlamaDecoderLayer Start")
    925 self.layers = nn.ModuleList(
--> 926     [LlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
    927 )
    928 print("LlamaDecoderLayer Ran")
    929 self.norm = LlamaRMSNorm(config.hidden_size, eps=config.rms_norm_eps)

File c:\Users\internanirudh\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\llama\modeling_llama.py:926, in <listcomp>(.0)
--> 926     [LlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]

TypeError: LlamaDecoderLayer.__init__() takes 2 positional arguments but 3 were given
```
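For reference, the mismatch is easy to reproduce in isolation. The class below is purely illustrative (not DocOwl's actual code); it only mimics a decoder layer written against the older one-argument signature being called the newer way:

```python
# Hypothetical stand-in for a decoder layer whose __init__ accepts
# only (self, config), as in older transformers releases.
class OldStyleDecoderLayer:
    def __init__(self, config):
        self.config = config

config = {"num_hidden_layers": 2}

# Newer transformers builds layers as LlamaDecoderLayer(config, layer_idx);
# calling the old-style class that way raises the exact error above:
# "__init__() takes 2 positional arguments but 3 were given".
layers = [OldStyleDecoderLayer(config, layer_idx)
          for layer_idx in range(config["num_hidden_layers"])]
```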

Can y'all give me a solution to this problem?

HAWLYQ commented 7 months ago

Hi @MistWolf27379, try `transformers==4.31.0`.
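For context (my reading of the traceback, not an official note from the maintainers): recent transformers releases construct Llama layers as `LlamaDecoderLayer(config, layer_idx)`, while DocOwl 1.5's decoder layer code is written against the older one-argument signature from 4.31, hence the TypeError. Below is a minimal fail-fast sketch, assuming 4.31.0 is the intended version (the pin comes from this thread, not from the repo's requirements):

```python
# Fail fast with a readable message instead of the opaque TypeError when
# an incompatible transformers version is installed.
import transformers
from packaging import version  # packaging is already a transformers dependency

if version.parse(transformers.__version__) != version.parse("4.31.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} detected; DocOwl 1.5 "
        "inference expects transformers==4.31.0 (newer releases pass "
        "layer_idx to LlamaDecoderLayer, which this code base does not accept)."
    )
```

Installing the pinned version is just `pip install transformers==4.31.0`.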

MistWolf27379 commented 7 months ago

It worked, thanks!