Open gcelano opened 1 month ago
Hi,
Thank you very much for opening the issue. AutoModelForConditionalGeneration does indeed seem to be outdated, good catch!
The fundamental problem you are having is that GreTa (or T5 in general) is an encoder-decoder model, where the encoder and decoder each expect different input_ids. You can see the inputs for the encoder here, and for the decoder here.
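To make the relationship between the two inputs concrete, here is a minimal sketch in plain Python of how T5-style models usually derive decoder_input_ids from the target sequence: the targets are shifted one position to the right and a start token is prepended. It assumes the standard T5 convention that the pad token (id 0) doubles as the decoder start token and that -100 marks ignored label positions. The token ids are made up for illustration and the helper name shift_right is hypothetical, not the library's API.

```python
PAD_ID = 0           # assumption: pad token id, also used as the decoder start token in T5
IGNORE_INDEX = -100  # label value that the loss function skips


def shift_right(labels, start_id=PAD_ID, pad_id=PAD_ID):
    """Build decoder inputs from target ids by shifting them one step right."""
    # Prepend the start token and drop the last target token.
    shifted = [start_id] + list(labels[:-1])
    # Ignored label positions cannot be fed to the embedding layer, so use pad instead.
    return [pad_id if token == IGNORE_INDEX else token for token in shifted]


# Hypothetical target ids ending in the EOS token (id 1 in T5 vocabularies).
labels = [37, 4, 912, 1]
decoder_input_ids = shift_right(labels)
print(decoder_input_ids)  # [0, 37, 4, 912]
```

At each step the decoder sees the tokens up to position i and is trained to predict labels[i]. During generation no targets exist, so generate() starts from the start token and extends the sequence on its own.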
What a useful example looks like depends, of course, on the use case. For example, to do inference for machine translation, you could do something like this:
from transformers import AutoTokenizer, T5ForConditionalGeneration
tokenizer = AutoTokenizer.from_pretrained("bowphs/ancient-t5-translation")
model = T5ForConditionalGeneration.from_pretrained("bowphs/ancient-t5-translation")
input_ids = tokenizer("translate english to greek: the man took the bowl with the intention of drinking wine.", return_tensors="pt").input_ids
outputs = model.generate(input_ids, num_beams=3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
This example accesses a fine-tuned variant of the model. I have a fine-tuning script for lemmatization here. In general, GreTa should work with any code that also works for the original T5 model, so the "canonical" HuggingFace notebooks should also be good pointers. If you have any further questions or have a different use case in mind, don't hesitate to follow up.
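For training or scoring with padded batches, the following is a minimal sketch of how the masks and targets are typically passed to a T5 model. It is not taken from the GreTa repository. To run without downloading anything, it builds a tiny randomly initialised T5, so the config sizes and all token ids are made up for illustration; with the real model you would load the checkpoint and let the tokenizer produce input_ids and attention_mask.

```python
import torch
from transformers import T5Config, T5ForConditionalGeneration

torch.manual_seed(0)

# Assumption: a toy configuration, far smaller than any real checkpoint.
config = T5Config(vocab_size=32, d_model=16, d_kv=4, d_ff=32,
                  num_layers=1, num_heads=2, decoder_start_token_id=0)
model = T5ForConditionalGeneration(config).eval()

# Encoder side: two sequences, the second one padded with id 0.
input_ids = torch.tensor([[5, 6, 7, 1], [8, 9, 1, 0]])
attention_mask = torch.tensor([[1, 1, 1, 1], [1, 1, 1, 0]])  # 0 marks padding

# Decoder side: target ids, with padding replaced by -100 so the loss skips it.
labels = torch.tensor([[10, 11, 1], [12, 1, -100]])

with torch.no_grad():
    # Passing labels lets the model build decoder_input_ids internally.
    out = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)

    # Equivalent explicit form: labels shifted right behind the start token (id 0).
    decoder_input_ids = torch.tensor([[0, 10, 11], [0, 12, 1]])
    out_explicit = model(input_ids=input_ids, attention_mask=attention_mask,
                         decoder_input_ids=decoder_input_ids)

print(out.logits.shape)  # (batch, target_length, vocab_size)
print(torch.allclose(out.logits, out_explicit.logits, atol=1e-6))
```

In most fine-tuning setups, passing labels is enough and decoder_input_ids never has to be constructed by hand.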
I'm trying to use GreTa following the commands at https://huggingface.co/bowphs/GreTa, but it does not work.
AutoModelForConditionalGeneration seems to have been substituted with T5ForConditionalGeneration (using transformers 4.42.4), but, most importantly, it is not clear how to use the model. This returns the error:
The following keyword arguments are not supported by this model: ['token_type_ids']
If I adapt the following example (https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Model.forward), it works, but it is not clear what decoder_input_ids should correspond to or how to pass the masks. Could you please provide an example of how to use the model?