avinashsai closed this issue 1 year ago
Hello,
Congratulations on the amazing work. I have a few questions about zero-shot evaluation in Table-1.
Thank you.
Hi Avinash,
Hi Tsu-Jui,
Thanks for your reply.
If that is the case, I would suggest using the fused features, which take both vision and language perception into account.
You mean video outputs and fused features?
Yes