Open avishaiElmakies opened 3 months ago
I really don't mind having better support for this 🤗 as long as we follow the way it is done for gemma or llama.
OK, I might try the OPT one first. I should also say that I can only help with PyTorch, as I know nothing about JAX and Keras. I will look at gemma or llama for some inspiration.
If it works well, I might try my hand at other models.
Feature request
The forward pass of some models does not accept position_ids. For example, OPTModel does not take position_ids, while GPTJModel does. Most newer models do accept position_ids.
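A quick way to check which models are affected is to inspect the `forward` signature. The snippet below is only a minimal sketch: `accepts_position_ids` and the two toy classes are hypothetical stand-ins so that it runs without transformers installed. With transformers available, the same helper could presumably be pointed at `OPTModel.forward` and `GPTJModel.forward` instead.

```python
import inspect


def accepts_position_ids(forward_fn) -> bool:
    """Return True if the callable declares a `position_ids` parameter."""
    return "position_ids" in inspect.signature(forward_fn).parameters


# Hypothetical stand-ins for real model classes (not part of transformers).
class ToyModelWithoutPositionIds:
    def forward(self, input_ids=None, attention_mask=None):
        return input_ids


class ToyModelWithPositionIds:
    def forward(self, input_ids=None, attention_mask=None, position_ids=None):
        return input_ids


for cls in (ToyModelWithoutPositionIds, ToyModelWithPositionIds):
    print(cls.__name__, accepts_position_ids(cls.forward))
```

Running the same check over every model class would give a list of the models that would need changes.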
Motivation
There are two main reasons we would like all LM models to accept position_ids.
https://github.com/huggingface/transformers/blob/v4.44.1/src/transformers/modeling_flash_attention_utils.py#L270
Your contribution
I may be able to fix this and help with a PR, but I would love to have a more experienced person guide me.