DesaleF closed 3 years ago
We use ground-truth timestamps.
Okay, thank you! One more question: is there any way to generate the timestamps rather than using the ground truth? Do you have any recommendation for generating a coherent paragraph using timestamps proposed by the network?
You would need to use someone else's pretrained model for something like "dense event prediction" to get timestamps and then build the meta_all.json for your dataset with these new timestamps. Let us know if you find a good way to predict these timestamps.
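For anyone trying this, here is a rough sketch of the "rebuild the metadata with new timestamps" step. It is only an illustration under assumptions: the layout of meta_all.json is not described in this thread, so the `segments`, `start_sec` and `stop_sec` field names below are hypothetical, as is the `replace_timestamps` helper and the format of the predictions. Check the real file in your dataset and adjust the keys before using anything like this.

```python
import json


def replace_timestamps(meta, predictions, start_key="start_sec", stop_key="stop_sec"):
    """Return a copy of `meta` whose segments come from predicted timestamps.

    ASSUMED layout (hypothetical, not confirmed for meta_all.json):
      meta:        {video_id: {"segments": [...], ...other fields...}}
      predictions: {video_id: [(start, stop), ...]} in seconds
    Videos without predictions are copied through unchanged.
    """
    updated = {}
    for video_id, entry in meta.items():
        new_entry = dict(entry)  # shallow copy so the input is not modified
        if video_id in predictions:
            # Sort segments by start time and drop empty or inverted spans.
            spans = sorted(predictions[video_id])
            new_entry["segments"] = [
                {start_key: float(start), stop_key: float(stop)}
                for start, stop in spans
                if stop > start
            ]
        updated[video_id] = new_entry
    return updated


if __name__ == "__main__":
    # Toy data standing in for the real metadata and a proposal model's output.
    meta = {"v1": {"split": "val", "segments": [{"start_sec": 0.0, "stop_sec": 5.0}]}}
    predictions = {"v1": [(10, 20), (0, 4.5)]}
    print(json.dumps(replace_timestamps(meta, predictions), indent=2))
```

Note that each segment is rebuilt from scratch here, so any per-segment text in the original entry is dropped. That matches the use case above, where captions are generated for the proposed segments rather than copied from the ground truth.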
Thank you very much for your suggestion. I will look into that and post here if I succeed in predicting the timestamps.
When generating captions for testing or validation, did you use the ground-truth timestamps to generate each sentence, or can COOT generate the paragraph caption without them?