ChenYi99 / EgoPlan

BSD 3-Clause "New" or "Revised" License

Train, Valid and Test dataset may have quite different characteristics #7

Closed by yusuke-intern 1 month ago

yusuke-intern commented 1 month ago

image

I ran a meta-analysis of each dataset split and found some interesting results.

(This assumes the narration text correctly describes the action in the video.) Please note that I may have made mistakes.

yusuke-intern commented 1 month ago

image

ChenYi99 commented 1 month ago

We acknowledge that our candidate options do include actions that have already occurred in the video. However, the action narrations provided in task_progress_metadata are intended as a reference only. During model inference, using information from the ground-truth action narrations is not allowed: the model must rely solely on visual observations to infer task progress. Using the ground-truth task_progress_metadata to eliminate options is therefore not appropriate.
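
For readers following the thread, the elimination shortcut under discussion can be sketched roughly as below. This is a hypothetical illustration only, not code from the EgoPlan repository: the function name `eliminate_seen_options`, the `options` and `past_narrations` fields, and the sample data are all assumptions made up for this example. It only shows why the shortcut narrows the choices when ground-truth narrations are available, which is exactly the information that is off-limits at inference time.

```python
def eliminate_seen_options(options, past_narrations):
    """Drop candidate actions whose text matches an already-narrated action.

    `options` maps an option letter to its action text; `past_narrations`
    is a list of ground-truth narrations. Both layouts are assumed here.
    """
    seen = {n.strip().lower() for n in past_narrations}
    return {k: v for k, v in options.items() if v.strip().lower() not in seen}


# Made-up sample, not taken from the benchmark.
sample = {
    "options": {
        "A": "open the fridge",
        "B": "pick up the knife",
        "C": "cut the onion",
        "D": "wash the pan",
    },
    "past_narrations": ["Open the fridge", "pick up the knife"],
}

remaining = eliminate_seen_options(sample["options"], sample["past_narrations"])
print(remaining)  # {'C': 'cut the onion', 'D': 'wash the pan'}
```

In this toy case the shortcut halves the candidate set without looking at a single frame, so any accuracy obtained this way would not reflect visual understanding of task progress.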