Open carlamao opened 3 years ago
@carlamao I am trying to do the same for COOT video captioning, but I don't know the steps. Did you succeed in extracting the features? If so, could you share the steps you followed?
I used the features provided with ActivityNet Entities and manually modified the annotation files, because I couldn't find a way to compute those features myself.
Hi, I am trying to follow COOT's implementation using a different dataset, ActivityNet-Entities. They used your model to extract the features. What steps should I take to do so?