j-min / VL-T5

PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
https://arxiv.org/abs/2102.02779
MIT License

Does the generator-based model work better than the classifier-based model? #4

Closed hackerchenzhuo closed 3 years ago

hackerchenzhuo commented 3 years ago

Hi! First of all, thanks for such great work :)

I just read your reply in #1. It is interesting:

> you won't need to modify the vocabulary since T5's tokenizer is based on sentencepiece trained on a large corpus.

So I have a question: does the generator-based model work better than the classifier-based model (e.g., on the VQA task)? What do you think about this?

j-min commented 3 years ago

I won't say the generator-based method is always better than the classifier-based method. The performance can differ depending on hyperparameters, such as how to tokenize the text and how to create labels. However, I would say our generative method can be easier to use for many tasks. In our paper, we empirically show that the generative method provides at least a good baseline, sometimes outperforming existing discriminative methods on many tasks.
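To make the distinction concrete, here is a minimal sketch of the generative framing of VQA using a plain HuggingFace T5. This is the text side only, not the repo's VL-T5 model class (which also feeds visual features into the encoder), and the "vqa:" prefix and the example question/answer are illustrative assumptions:

```python
# Minimal sketch: VQA as conditional text generation (illustrative, not VL-T5 itself).
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# The task prefix and the question/answer below are made-up examples.
inputs = tokenizer("vqa: what color is the cat?", return_tensors="pt")

# Training: the answer is just a target text sequence, so no fixed answer
# vocabulary or classifier head is needed.
labels = tokenizer("black", return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss

# Inference: decode the answer as free-form text.
output_ids = model.generate(**inputs, max_length=10)
answer = tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A classifier-based VQA model, by contrast, maps pooled features onto a fixed answer set (commonly a few thousand frequent answers), so anything outside that set is unreachable; the generative interface above is what lets the same model be reused across tasks.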

hackerchenzhuo commented 3 years ago

Thanks. BTW, it seems that you use the VQA score as the metric for VQA performance. How do you apply such a classification-style metric to a generator-based method? Via exact match (EM)?

Would you mind sharing a few more details about this process?

Best wishes :)

j-min commented 3 years ago

For a fair comparison with previous methods, I use exact match. For simplicity, I did not use any post-processing or constrained search. You could also use other metrics from text generation tasks, such as n-gram-based or learned metrics.
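For concreteness, here is a hedged sketch of exact-match scoring under the standard VQA accuracy. The official evaluation script also normalizes answers and averages over annotator subsets; the simplified min(matches/3, 1) form below is the commonly quoted approximation, and the helper is not code from this repo:

```python
def vqa_score(generated, human_answers):
    """Simplified VQA accuracy for one question: min(#matching human answers / 3, 1)."""
    pred = generated.strip().lower()
    matches = sum(pred == ans.strip().lower() for ans in human_answers)
    return min(matches / 3.0, 1.0)

# VQA v2 provides 10 human answers per question; the generated string is
# compared against them by exact match, with no post-processing.
answers = ["black"] * 7 + ["dark"] * 3
print(vqa_score("black", answers))  # 1.0 (7 matches)
print(vqa_score("dark", answers))   # 1.0 (3 matches)
print(vqa_score("gray", answers))   # 0.0 (no match)
```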

hackerchenzhuo commented 3 years ago

Thanks. It is interesting that exact match can achieve such accuracy; that is great.

alice-cool commented 2 years ago

Did you run the code? I ran into an error about "head_mask". Do you know how to fix it?