Open YoussefZiad opened 2 years ago
Hi, it depends on which model you want to use (the one trained on ActivityNet or the one trained on YouCook2).
For YouCook2 see: https://github.com/gingsi/coot-videotext/issues/17
For ActivityNet, we used the features provided by the authors of the CMHSE paper (https://github.com/Sha-Lab/CMHSE), so you would have to look into their paper or code to find out how to extract the features. Kindly post here if you find the solution.
Best
@gingsi I also want to try coot with my own dataset. Has anyone succeeded with that?
I would also like to use the model trained on YouCook2.
I have now added the feature extraction code; see the README chapter "Running your own video dataset on the trained models". With it, you can create the HowTo100M features from mpg files.
Hello, I am trying to use the model to generate captions for external .mp4 videos and I was wondering if you could give me any pointers about how one would go about it and which functions are relevant. Thank you in advance!