TheDuckingDuck closed this issue 1 month ago
Hey! This is just the tokenization config; `hf-internal-testing` is meant for testing. I think a checkpoint was converted by a community member here: https://huggingface.co/Himetsu/pixtral-12b
That does get me a bit further, but the whole thing still crashes before generating, with:
query_states = query_states.view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
RuntimeError: shape '[1, 2242, 32, 160]' is invalid for input of size 9183232
Which is not an error I feel like I am capable of debugging.
Full log:
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 6/6 [00:56<00:00,  9.50s/it]
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
Traceback (most recent call last):
File "F:\XComp\CogVLM2\basic_demo\hf.py", line 36, in <module>
generate_ids = model.generate(**inputs, max_new_tokens=500)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\generation\utils.py", line 2050, in generate
result = self._sample(
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\generation\utils.py", line 3000, in _sample
outputs = self(**model_inputs, return_dict=True)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\hooks.py", line 169, in new_forward
output = module._old_forward(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\llava\modeling_llava.py", line 519, in forward
outputs = self.language_model(
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\hooks.py", line 169, in new_forward
output = module._old_forward(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\mistral\modeling_mistral.py", line 1033, in forward
outputs = self.model(
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\hooks.py", line 169, in new_forward
output = module._old_forward(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\mistral\modeling_mistral.py", line 810, in forward
layer_outputs = decoder_layer(
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\hooks.py", line 169, in new_forward
output = module._old_forward(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\mistral\modeling_mistral.py", line 550, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\hooks.py", line 169, in new_forward
output = module._old_forward(*args, **kwargs)
File "C:\Users\Bunner\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\mistral\modeling_mistral.py", line 448, in forward
query_states = query_states.view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
RuntimeError: shape '[1, 2242, 32, 160]' is invalid for input of size 9183232
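As a side note, the numbers in the error are internally consistent with a head-dimension mismatch. A quick sanity check (plain arithmetic on the figures from the error message, not taken from the actual script):

```python
# Figures copied from the RuntimeError above.
bsz, q_len, num_heads, head_dim = 1, 2242, 32, 160
numel = 9_183_232  # actual number of elements in query_states

# .view() requires the target shape to cover exactly `numel` elements.
expected = bsz * q_len * num_heads * head_dim
print(expected)  # -> 11479040, which != 9183232, hence the crash

# Back-solving for the head dimension the tensor was actually built with:
print(numel // (bsz * q_len * num_heads))  # -> 128
```

The back-solved value of 128 suggests the projection weights in the converted checkpoint assume a different head dimension than the config implies (5120 / 32 = 160), so a config/checkpoint mismatch would be one plausible cause; this is a guess from the arithmetic, not a confirmed diagnosis.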
Hi @IdiotSandwichTheThird, could you share your environment information and an example code snippet? I'm unable to reproduce this error using the Mistral community checkpoint and the code example in the issue description.
System Info
Transformers version https://github.com/huggingface/transformers/commit/8bd2b1e8c23234cd607ca8d63f53c1edfea27462
Who can help?
@ArthurZucker @amyeroberts
Information
Tasks
An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
Reproduction
Running this code
Expected behavior
Generation should complete rather than crashing with the shape-mismatch `RuntimeError` shown above.