Hello, thank you for the excellent paper and code. I am very interested in your idea. May I ask whether the xpool implementation takes the video_pooled_features from the following code (the model defined in ClipTransformer) and then computes the similarity from them?
Hello! I understand you're inquiring about a situation where training on the MSVD dataset results in the process being automatically killed by the system at a specific iteration. @fzb408
Looking forward to your answer! Thank you!