vietai / ViT5

MIT License

Model Checkpoint viT5-base #1

Closed hieudx149 closed 2 years ago

hieudx149 commented 2 years ago

Hi there, I have an issue with your ViT5-base model on Hugging Face. When I run the ViT5-base script for the summarization task, the output is just a random sequence. I think you uploaded the wrong model checkpoint for ViT5-base.

heraclex12 commented 2 years ago

Hi,

I assume that you're using our pretrained language model ViT5-base for summarization. Pretrained language models are only trained on large-scale corpora to learn general-purpose representations. They need to be finetuned on downstream tasks such as summarization before they produce useful outputs.

The script you used is just an example. Fortunately, we have also published the finetuned checkpoints for news summarization here.
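For anyone landing here with the same confusion, a minimal sketch of running a *finetuned* ViT5 summarization checkpoint with Hugging Face `transformers` might look like the following. The checkpoint name and the `vietnews: ` task prefix are assumptions for illustration, not confirmed by the maintainers; check the published model card for the exact identifier and input format.

```python
# Sketch: summarizing with a finetuned ViT5 checkpoint via Hugging Face transformers.
# The checkpoint name and "vietnews: " prefix below are ASSUMPTIONS for illustration.

CHECKPOINT = "VietAI/vit5-base-vietnews-summarization"  # assumed checkpoint name


def prepare_input(article: str) -> str:
    """Prefix the article with a task marker, T5-style (prefix is an assumption)."""
    return "vietnews: " + article.strip() + " </s>"


def summarize(article: str, max_length: int = 256) -> str:
    """Load the finetuned checkpoint and generate a summary (downloads weights)."""
    # Imported lazily so the sketch can be read/tested without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForSeq2SeqLM.from_pretrained(CHECKPOINT)
    inputs = tokenizer(prepare_input(article), return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_length=max_length, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The point of the thread in code form: `generate` on the *pretrained* `VietAI/vit5-base` checkpoint will emit span-corruption-style filler, while a finetuned checkpoint like the one above should emit an actual summary.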

hieudx149 commented 2 years ago

Hi @heraclex12, @justinphan3110. The script you published on Hugging Face confused me: I thought the model was fine-tuned for the summarization task, but it was not. Anyway, thank you so much for your quick response.

hieudx149 commented 2 years ago

Hi @heraclex12, as far as I understand, T5 is trained with the span-corruption objective, so I tried to create a sample similar to the training data, but the result is still just a random sequence. Is there any way (or task) to try ViT5-base without the fine-tuning step, to see how effective T5 is? (This problem only occurs with the ViT5-base version; the large version seems to work well.)

Span-corruption objective: [image]

My test code for ViT5-base: [image]

Result on the large version: [image]
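For context on the question above: span corruption replaces contiguous token spans in the input with sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, ...) and trains the model to emit the masked spans, each preceded by its sentinel. A minimal, checkpoint-independent sketch of that input/target formatting (the function and example below are illustrative, not code from this repo):

```python
# Sketch of T5-style span-corruption formatting: replace chosen spans with
# sentinel tokens in the input, and build the matching target sequence.
from typing import List, Tuple


def span_corrupt(tokens: List[str], spans: List[Tuple[int, int]]):
    """spans are non-overlapping, sorted (start, end) half-open index pairs."""
    inputs, targets = [], []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[cursor:start])   # keep uncorrupted tokens
        inputs.append(sentinel)               # mask the span with one sentinel
        targets.append(sentinel)              # target: sentinel + original span
        targets.extend(tokens[start:end])
        cursor = end
    inputs.extend(tokens[cursor:])
    targets.append(f"<extra_id_{len(spans)}>")  # final sentinel closes the target
    return " ".join(inputs), " ".join(targets)


example = "Thank you for inviting me to your party last week".split()
inp, tgt = span_corrupt(example, [(2, 4), (6, 7)])
# inp: "Thank you <extra_id_0> me to <extra_id_1> party last week"
# tgt: "<extra_id_0> for inviting <extra_id_1> your <extra_id_2>"
```

A pretrained (un-finetuned) T5-family checkpoint is only expected to behave sensibly on inputs shaped like `inp` above, which is why feeding it a plain article yields seemingly random output.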