Closed: hieudx149 closed this issue 2 years ago
Hi,
I assume that you're using our pretrained language model ViT5-base for summarization. Pretrained language models are only trained on large-scale corpora to learn general-purpose representations; they need to be finetuned on downstream tasks such as summarization to produce useful outputs.
The script you are using is just an example. Fortunately, we have also published the finetuned checkpoints for news summarization here.
Hi @heraclex12, @justinphan3110. The script you published on Hugging Face confused me: I thought the model was fine-tuned for the summarization task, but it was not. Anyway, thank you so much for your quick response.
Hi @heraclex12, as far as I understand, T5 is trained with the span-corruption objective, so I tried to create a sample similar to the training data, but the result is still just a random sequence. Is there any way (or task) to try ViT5-base without the fine-tuning step, to see how effective T5 is? (This problem only occurs with the ViT5-base version; the large version seems to work well.)
[Screenshots: the span-corruption objective, my test code for ViT5-base, its result, and the same test on the large version]
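For anyone checking whether their probe input matches the pretraining format: a T5-style span-corruption pair replaces sampled spans in the input with sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, ...) and asks the model to predict each dropped span after its sentinel. A minimal sketch of how such a pair is formed (the masked span positions here are chosen by hand for illustration, not sampled as in the actual pretraining):

```python
# Sketch of a T5-style span-corruption example.
# Masked spans become sentinel tokens in the input; the target lists
# each sentinel followed by the span it replaced, plus a final sentinel.

def make_span_corruption_pair(tokens, spans):
    """tokens: list of words; spans: list of (start, end) index pairs to mask."""
    inp, tgt = [], []
    prev = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inp.extend(tokens[prev:start])   # keep the unmasked words
        inp.append(sentinel)             # replace the span with a sentinel
        tgt.append(sentinel)             # target: sentinel + dropped words
        tgt.extend(tokens[start:end])
        prev = end
    inp.extend(tokens[prev:])
    tgt.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return " ".join(inp), " ".join(tgt)

words = "Thank you for inviting me to your party last week".split()
inp, tgt = make_span_corruption_pair(words, [(1, 2), (5, 7)])
print(inp)  # Thank <extra_id_0> for inviting me <extra_id_1> party last week
print(tgt)  # <extra_id_0> you <extra_id_1> to your <extra_id_2>
```

Feeding an input in this shape to a pretrained (not finetuned) checkpoint should make it emit sentinel tokens followed by short span guesses, which is the only behavior the pretraining objective teaches.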
Hi there, I have an issue with your ViT5-base model on Hugging Face: when I run the ViT5-base script for the summarization task, the output is just a random sequence. I think you may have uploaded the wrong model checkpoint for ViT5-base.