gerardDonahue / GTCC_CVPR2024

Code for paper titled, "Learning to Predict Task Progress by Self-Supervised Video Alignment" by Gerard Donahue and Ehsan Elhamifar, published at CVPR 2024.

Is the action time label necessary? #1

Closed: AnnCumtb closed this issue 4 months ago

AnnCumtb commented 4 months ago

Thanks for your great work! I'd like to ask: could your method be run unsupervised, i.e., without the time labels ("hdl_start_times" and "hdl_end_times" are the corresponding start and end frames of the actions in "hdl_actions" for each handle in "handles")? Labels of that kind are costly to obtain. Thanks again for your excellent work!

gerardDonahue commented 4 months ago

Thanks for your question!

Short answer: Our work is self-supervised (it uses only the procedural task as supervision). Without the labels, our evaluation code won't work, because all of the evaluation metrics in the code require them.

If you are interested in training the model without these labels, I would go directly into the JSONDataset class in models/json_dataset.py and remove all parts that extract the time-based labels. You can remove the times-dict annotations from the dataloader (just use empty dicts for items of the dataset). There may be some small bugs if you remove these, but you can most likely comment out the code that uses the labels during training, as GTCC is a self-supervised approach that does not require time-based labels for training.
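As a rough illustration of the suggestion above, here is a minimal sketch of stripping the time-based fields from an annotation dict before it reaches the training code. The key names come from the question in this thread; the flat layout of the dict and the helper name `strip_time_labels` are assumptions for illustration only and are not taken from the repository's `JSONDataset` implementation.

```python
import copy

# Key names quoted in the question above. Whether the repository stores them
# at the top level of each annotation is an assumption of this sketch.
TIME_LABEL_KEYS = ("hdl_start_times", "hdl_end_times")


def strip_time_labels(annotation: dict, keys=TIME_LABEL_KEYS) -> dict:
    """Return a copy of `annotation` without the time-based label fields.

    The input dict is left untouched, so the labelled version can still be
    used for evaluation.
    """
    stripped = copy.deepcopy(annotation)
    for key in keys:
        # pop with a default so datasets that never had labels also pass through
        stripped.pop(key, None)
    return stripped


if __name__ == "__main__":
    # Hypothetical annotation, made up for this example.
    example = {
        "handles": ["video_001"],
        "hdl_actions": [["pour", "stir"]],
        "hdl_start_times": [[0, 120]],
        "hdl_end_times": [[119, 300]],
    }
    print(strip_time_labels(example))
```

If the training loop still expects a times dict per item, passing an empty dict in its place (as suggested above) may be the simpler route than removing the keys upstream.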

AnnCumtb commented 4 months ago

Thanks for your reply. I understand that the time-based labels are only used for evaluation. I'll try to train on my own dataset without the labels. Thanks a lot.