In the YouCook2 annotations, I found that some segment values exceed 500.
For example: {"segment": [512, 586], "id": 11, "sentence": "grill and rotate the skewers"}
However, the video features contain only 500 frames per video.
How should I handle these training samples?
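To make the mismatch concrete, here is a minimal sketch that lists the annotations whose segment extends past the feature length. The helper name `find_out_of_range`, the `num_feature_frames` parameter, and the second sample entry are hypothetical and only for illustration. It also assumes the segment values and the feature frames are counted in the same unit, which is exactly what I am unsure about.

```python
def find_out_of_range(annotations, num_feature_frames=500):
    """Return the annotations whose segment goes past the feature length.

    Assumption: segment values are directly comparable to feature frame
    counts. If they are in another unit (e.g. seconds), this check does
    not apply as written.
    """
    return [a for a in annotations if max(a["segment"]) > num_feature_frames]


annotations = [
    # Entry taken from the dataset annotation quoted above.
    {"segment": [512, 586], "id": 11, "sentence": "grill and rotate the skewers"},
    # Hypothetical in-range entry, added only for contrast.
    {"segment": [90, 120], "id": 3, "sentence": "hypothetical in-range step"},
]

flagged = find_out_of_range(annotations)
for a in flagged:
    print(a["id"], a["segment"], a["sentence"])
```

Running this over the full annotation file would show how many training samples are affected.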