sanketsans / ganov2

Winner of CVPR23 EGO4D STA challenge
https://sanketsans.github.io/guided-attention-egocentric.html
Apache License 2.0

What is cfg.EGO4D_STA.OBJECT_DETECTIONS? #1

Open KeenyJin opened 8 months ago

KeenyJin commented 8 months ago

Isn't it the "object_detections.json" file that we can get from "ego4d_data/v2/sta_models/", like the SlowFast example at https://github.com/EGO4D/forecasting/blob/main/SHORT_TERM_ANTICIPATION.md#producing-object-detections-optional?

I'm getting errors :(

FileExistsError: [Errno 17] File exists: '../sta/object_detections/object_detections.json'

I guess cfg.EGO4D_STA.OBJECT_DETECTIONS must be a directory, but I have no clue.

sanketsans commented 8 months ago

Hi @KeenyJin, the JSON file provided in the original repo only contains the object detections for the last observed frame. Since our model requires the detections in the observed frame(s) as well, we extracted the object detections for all videos. I am afraid I did not provide the object detection annotations.

You can access the object detections for the entire "V2" STA split here: https://drive.google.com/drive/folders/1CHsFZ-6jl9ypT_yvK7ukwHJ9_g5_tmOE?usp=sharing

It contains the detections per video file in LMDB format, which loads somewhat faster than JSON files. cfg.EGO4D_STA.OBJECT_DETECTIONS is the path to the LMDB directory. Let me know if you face other problems. :)
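As a rough illustration of the point above (the setting should name a directory, not the object_detections.json file), here is a minimal pre-flight check. The helper name `check_detections_dir` is hypothetical and not part of the repository; it only validates the path and makes no assumption about how the LMDB data is keyed or serialized.

```python
import os


def check_detections_dir(path):
    """Return `path` if it is an existing directory, else raise a clear error.

    Hypothetical helper: it only inspects the filesystem and does not open
    the LMDB database itself.
    """
    if os.path.isfile(path):
        # This is the situation from the first comment: a JSON file was
        # configured where a directory is expected.
        raise NotADirectoryError(
            f"{path!r} is a file; the detections path should be a directory"
        )
    if not os.path.isdir(path):
        raise FileNotFoundError(f"{path!r} does not exist")
    return path
```

Calling it on the configured value before building the dataset would turn the FileExistsError reported above into a more descriptive message.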

KeenyJin commented 7 months ago

@sanketsans I appreciate your help! Thanks to the object detection annotations you provided, I can now see training running. I have two more questions, though.

  1. Do the annotations contain object detections for all frames, or only for the frames used in training (16 per annotation)?
  2. I think __len__(self) in class Ego4dShortTermAnticipationStill in stillfast/datasets/ego4d_sta_still.py must be fixed:
def __len__(self):
    """ Get the number of samples. """
    return len(self._annotations['annotations'][:50])
The [:50] slice gives only 32 steps per epoch in my setting, so training ends almost instantly with 0 accuracy. After I removed it, there are 53364 steps per epoch (training + validation) and one epoch takes 3~4 hours. Or am I missing something?

Thank you :)

sanketsans commented 7 months ago

Hi,

  1. I think the object detections are for all the frames.
  2. The __len__ function should just return the total number of instances for the given set, i.e., the training, validation, or test set. It should just be
def __len__(self):
    """ Get the number of samples. """
    return len(self._annotations['annotations'])
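To make the difference concrete, here is a small self-contained sketch comparing the two versions of `__len__`. The classes `ToyDataset` and `TruncatedToyDataset` and the 1000-item annotation list are hypothetical stand-ins, not the repository's Ego4dShortTermAnticipationStill class or its data.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset that stores an annotations dict."""

    def __init__(self, annotations):
        self._annotations = annotations

    def __len__(self):
        """ Get the number of samples. """
        return len(self._annotations['annotations'])


class TruncatedToyDataset(ToyDataset):
    """Same data, but with the [:50] slice left in __len__."""

    def __len__(self):
        # Slicing caps the reported length at 50, whatever the real size is.
        return len(self._annotations['annotations'][:50])


# Made-up annotations, only for illustration.
annotations = {'annotations': list(range(1000))}
full = ToyDataset(annotations)
truncated = TruncatedToyDataset(annotations)
print(len(full), len(truncated))  # 1000 50
```

Because a data loader sizes each epoch from `len(dataset)`, the truncated version exposes at most 50 samples per epoch regardless of how many annotations exist.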

ubless607 commented 5 months ago

@sanketsans Is this correct?