Talib6509 opened 1 year ago
I'm also experiencing this issue. Can you please provide guidance on how we can determine the maximum input length of text to pass into the model? @PrithivirajDamodaran
Thank you.
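(For reference, one rough way to check this with a Hugging Face `transformers` tokenizer is sketched below; the `t5-base` checkpoint is only a placeholder, not necessarily the one this project actually loads.)

```python
# Sketch: inspect the maximum sequence length a tokenizer declares and
# count the tokens in an input before sending it to the model.
# "t5-base" is a placeholder checkpoint; substitute the real one.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")
print(tokenizer.model_max_length)  # declared max length in tokens (512 for t5-base)

text = "Some long input sentence ..."
n_tokens = len(tokenizer.encode(text))
print(n_tokens)  # compare this against model_max_length before calling the model
```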
Have you tried normalizing your input text, e.g. with `input.capitalize()`?
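A minimal sketch of what that normalization could look like (the sample string is made up):

```python
# Hypothetical all-caps input; .capitalize() lowercases everything and
# uppercases only the first character.
text = "THIS SENTENCE IS IN ALL CAPS AND CONFUSES THE TOKENIZER."
normalized = text.capitalize()
print(normalized)  # "This sentence is in all caps and confuses the tokenizer."
```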
The SentencePiece tokenizer chunks rare words into many small pieces, especially if they are uppercase when they are normally written in lowercase.
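A small sketch illustrating the effect, assuming a generic SentencePiece tokenizer (`t5-base` is used only as a stand-in checkpoint, not necessarily the model discussed here):

```python
# Compare how a SentencePiece tokenizer splits the same word
# in its usual lowercase form vs. all caps.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")

for text in ["unbelievable", "UNBELIEVABLE"]:
    pieces = tokenizer.tokenize(text)
    print(f"{text!r} -> {len(pieces)} pieces: {pieces}")

# Typically the all-caps variant is split into many more, shorter pieces,
# which uses up the model's token budget and degrades output quality.
```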
The model is having trouble with long sentences, especially when the words in the sentence are in upper case. It outputs only part of the sentence, and the remaining, neglected part of the sentence is shown as an error.