mustafaaljadery / gemma-2B-10M

Gemma 2B with 10M context length using Infini-attention.

TypeError: GemmaModel.forward() got an unexpected keyword argument 'cache_position' #5

Open · DewEfresh opened this issue 6 months ago

DewEfresh commented 6 months ago

I made a Colab (https://colab.research.google.com/drive/1Z3NdoT0WS8KXnSUS3_xxT39NBZD6eGcN?usp=sharing) to test this and ran into an issue: GemmaModel.forward() got an unexpected keyword argument 'cache_position'. I had to change some of main.py to get the model to load correctly. The model loads into system RAM rather than onto the GPU; I don't know whether that is the cause of the GemmaModel.forward() error.

I have a few other questions. Is the context length set in the generate function? Does memory usage balloon as the context and hidden state grow? In config.json, "torch_dtype" is "float32"; is there a reason for this? In Google's gemma-2b, "torch_dtype" is "bfloat16".
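For reference, a couple of generic PyTorch checks (nothing repo-specific; this assumes `model` is the object already loaded in the notebook) show where the weights ended up and in which dtype:

```python
# Assumes `model` is the GemmaForCausalLM instance already loaded in the notebook.
# The first line prints "cpu" if the weights never reached the GPU; the second
# prints torch.float32 unless torch_dtype was overridden at load time.
print(next(model.parameters()).device)
print(next(model.parameters()).dtype)
```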


```
TypeError                                 Traceback (most recent call last)
in ()
      2
      3 with torch.no_grad():
----> 4     generated_text = generate(
      5         model, tokenizer, prompt_text, max_length=512, temperature=0.8
      6     )

5 frames

in generate(model, tokenizer, prompt_text, max_length, temperature)
     15     while generated_sequence.size(1) < original_length + max_length:
     16         input_segment = generated_sequence[:, -2048:]
---> 17         outputs = model(input_ids=input_segment.to(model.device), memory=memory, norm_term=norm_term)
     18         memory, norm_term = outputs.memory, outputs.norm_term
     19         next_token_logits = outputs.logits[:, -1, :]

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)
   1509             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510         else:
-> 1511             return self._call_impl(*args, **kwargs)
   1512
   1513     def _call_impl(self, *args, **kwargs):

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1518                 or _global_backward_pre_hooks or _global_backward_hooks
   1519                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520             return forward_call(*args, **kwargs)
   1521
   1522         try:

/content/gemma-2B-10M/src/gemma.py in forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, labels, use_cache, output_attentions, output_hidden_states, return_dict, cache_position, memory, norm_term, no_memory_update)
    947         )
    948
--> 949         outputs = self.model(
    950             input_ids=input_ids,
    951             attention_mask=attention_mask,

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)
   1509             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510         else:
-> 1511             return self._call_impl(*args, **kwargs)
   1512
   1513     def _call_impl(self, *args, **kwargs):

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1518                 or _global_backward_pre_hooks or _global_backward_hooks
   1519                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520             return forward_call(*args, **kwargs)
   1521
   1522         try:

TypeError: GemmaModel.forward() got an unexpected keyword argument 'cache_position'
```
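For anyone hitting the same thing, here is one workaround sketch. It is an assumption on my part, not the maintainers' fix: the error comes from the inner `self.model(...)` call shown in the traceback, so you can wrap that inner forward so it simply drops the `cache_position` keyword that newer transformers code paths pass along.

```python
import functools

def drop_cache_position(forward_fn):
    # Consume the `cache_position` keyword before it reaches a forward()
    # that does not declare it, and pass everything else through unchanged.
    @functools.wraps(forward_fn)
    def wrapper(*args, cache_position=None, **kwargs):
        return forward_fn(*args, **kwargs)
    return wrapper

# Assumes `model` is the GemmaForCausalLM loaded from this repo and that
# `model.model` is the inner module called as `self.model(...)` in
# src/gemma.py (see the traceback above); both names are assumptions.
model.model.forward = drop_cache_position(model.model.forward)
```

Pinning transformers to the release the repo was written against is probably the cleaner route, but I don't know which exact version that is.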
katsu-chan commented 6 months ago

Could you share changes to main.py, please?

katsu-chan commented 6 months ago

As for loading into CPU RAM instead of GPU RAM, it's probably because the installed PyTorch version is incorrect (e.g., a CPU-only build).
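A quick, generic check (plain PyTorch, nothing from this repo) shows whether the installed build can see a GPU at all:

```python
import torch

# If this prints False, the installed torch build has no CUDA support
# (or no GPU is visible), so the model can only load into system RAM.
print(torch.__version__)
print(torch.cuda.is_available())
```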

DewEfresh commented 6 months ago

> Could you share changes to main.py, please?

```python
model_path = "./models/models--mustafaaljadery--gemma-2B-10M"

# tokenizer = AutoTokenizer.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir="./models")
model = GemmaForCausalLM.from_pretrained(
    # model_path,
    model_name, cache_dir="./models",
    torch_dtype=torch.bfloat16
)
```
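Note that this only changes the dtype; it doesn't move the model onto the GPU. A generic PyTorch addition (my own sketch, not from the repo) would be:

```python
import torch

# Move the already-loaded model to the GPU if one is available, else stay on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
```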

Aniforka commented 6 months ago

> Could you share changes to main.py, please?

```python
model_path = "./models/models--mustafaaljadery--gemma-2B-10M"
# tokenizer = AutoTokenizer.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir="./models")
model = GemmaForCausalLM.from_pretrained(
    # model_path,
    model_name, cache_dir="./models",
    torch_dtype=torch.bfloat16
)
```

Does this work for you? I keep hitting error after error.

DewEfresh commented 6 months ago

I can't get past GemmaModel.forward() got an unexpected keyword argument 'cache_position'.

Aniforka commented 6 months ago

> I can't get past GemmaModel.forward() got an unexpected keyword argument 'cache_position'.

I "solved" this problem, but it turned out not to be the end look at my issue of Some errors

drdsgvo commented 6 months ago

Same problem.

huliangbing commented 6 months ago

Me too.

world4jason commented 6 months ago

After solving lots of "unexpected keyword argument" errors, I got RuntimeError: The size of tensor a (8) must match the size of tensor b (9) at non-singleton dimension 3. So tired of code full of bugs.