Closed by ydwl-lynn 3 years ago
Hi, I'd welcome any discussion. You're right: the SVO outcome can only partially adjust the decoder rather than control it completely. I think there are two reasons: 1) the decoder still relies heavily on linguistic priors, e.g., word co-occurrence, to optimize the gram-level cross-entropy loss; 2) SVO prediction is still far from satisfactory.
Thank you very much for your reply. I still have some details that I would like to discuss with you. If it is convenient, please send your WeChat to my email (1007369109@qq.com). Thank you very much!
Hello, I reproduced the results of your paper, and I found that for some videos the output SVO is correct but the caption is wrong. What is the reason? If it is convenient, I would like to discuss this further with you. Thank you very much.