[Docs] How to fine-tune BMN when using ActivityNet data prep method 2

valentin-fngr commented 4 months ago

The doc issue

Hi, I am trying to fine-tune BMN on my custom dataset. I know this has already come up in other issues, but none of them helped me solve my problem.

Data Prep

I have prepared my custom dataset following the ActivityNet data preparation method 2.

At the end, I obtain this exact structure:

(if Option 2 used)
│   │   ├── anet_train_video.txt
│   │   ├── anet_val_video.txt
│   │   ├── anet_train_clip.txt
│   │   ├── anet_val_clip.txt
│   │   ├── activity_net.v1-3.min.json
│   │   ├── mmaction_feat
│   │   │   ├── v___c8enCfzqw.csv
│   │   │   ├── v___dXUJsj3yo.csv
│   │   │   ├── ..
│   │   ├── rawframes
│   │   │   ├── v___c8enCfzqw
│   │   │   │   ├── img_00000.jpg
│   │   │   │   ├── flow_x_00000.jpg
│   │   │   │   ├── flow_y_00000.jpg
│   │   │   │   ├── ..
│   │   │   ├── ..
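
A quick consistency check on this layout can be sketched as follows. It is only an illustration: it assumes that every folder under rawframes should have a CSV with the same name under mmaction_feat, as the tree above suggests, and the helper name find_missing_features is hypothetical, not part of MMAction2.

```python
from pathlib import Path


def find_missing_features(data_root):
    """Return the names of rawframes folders that have no matching feature CSV.

    Assumes the layout shown above: <data_root>/rawframes/<video_name>/ and
    <data_root>/mmaction_feat/<video_name>.csv (hypothetical helper).
    """
    root = Path(data_root)
    # One sub-folder per video under rawframes/
    frame_dirs = {p.name for p in (root / "rawframes").iterdir() if p.is_dir()}
    # One CSV per video under mmaction_feat/
    feature_files = {p.stem for p in (root / "mmaction_feat").glob("*.csv")}
    return sorted(frame_dirs - feature_files)
```

Calling find_missing_features on the dataset root should return an empty list if feature extraction covered every video.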

For fine-tuning BMN:

I need to modify the config file. I use RawframeDataset instead of ActivityNetDataset. Am I correct? (If not, please explain.)

When using RawframeDataset, what should my pipeline look like? The current pipeline looks like this:

train_pipeline = [
    dict(type='LoadLocalizationFeature'),
    dict(type='GenerateLocalizationLabels'),
    dict(
        type='PackLocalizationInputs',
        keys=('gt_bbox', ),
        meta_keys=('video_name', ))
]

This cannot work with RawframeDataset. How can I replicate that pipeline with RawframeDataset?
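
For context, LoadLocalizationFeature reads precomputed per-video feature files rather than raw frames, so the pipeline above is normally paired with ActivityNetDataset pointing at a folder of feature CSVs. The fragment below is a minimal sketch of that pairing in the MMAction2 1.x config style, not a verified fine-tuning recipe. Every path in it is a hypothetical placeholder, and it assumes an ActivityNet-style annotation JSON exists for the custom data.

```python
# Sketch of a feature-based BMN config fragment (MMAction2 1.x style).
# All paths are hypothetical placeholders; adjust them to your own layout.
dataset_type = 'ActivityNetDataset'
data_root = 'data/ActivityNet/mmaction_feat/'  # folder of per-video feature CSVs
ann_file_train = 'data/ActivityNet/anet_anno_train.json'  # assumed annotation file

train_pipeline = [
    dict(type='LoadLocalizationFeature'),
    dict(type='GenerateLocalizationLabels'),
    dict(
        type='PackLocalizationInputs',
        keys=('gt_bbox', ),
        meta_keys=('video_name', ))
]

train_dataloader = dict(
    batch_size=8,
    num_workers=8,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        ann_file=ann_file_train,
        data_prefix=dict(video=data_root),  # where the feature files live
        pipeline=train_pipeline))

# Start from pretrained weights when fine-tuning (placeholder path).
load_from = 'path/to/pretrained_bmn_checkpoint.pth'
```

With this arrangement the rawframes folder is only needed for feature extraction; training itself reads the CSVs.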

Please provide any information that can help me successfully fine-tune the model using data preparation method 2 for ActivityNet. This is very confusing, and I would love to propose an overall tutorial once I succeed.

Best,

Valentin

Suggest a potential alternative/fix

No response

Perceval-Wilhelm commented 3 months ago

Hello @valentin-fngr, can I ask how you created your own custom dataset with the same structure as ActivityNet? I am working on a Temporal Action Localization project, but I cannot get my own custom data into the same structure as ActivityNet. Thank you so much!

PopGreen69 commented 2 months ago

Hi @valentin-fngr, can you share the details of your data preparation? I ran into an issue when extracting features from my own dataset.

valentin-fngr commented 1 month ago

@sirrtt @PopGreen69 Hi both. I gave up on that because it was way too complex to set up. Instead, I went with a classic TSN recognition network and a sliding-window pipeline. You can check demo/long_video_demo.py, which demonstrates how to detect actions in a long video. The setup there is much easier.
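
The sliding-window idea can be sketched roughly as below. This is a generic illustration and not the code in demo/long_video_demo.py: the function names are hypothetical, and the per-window labels are assumed to come from whatever recognizer (for example TSN) is run on each window.

```python
def sliding_windows(num_frames, window, stride):
    """Return (start, end) frame ranges, end exclusive, covering the video."""
    if num_frames <= 0:
        return []
    if num_frames < window:
        # Video shorter than one window: use the whole video as one window.
        return [(0, num_frames)]
    return [(start, start + window)
            for start in range(0, num_frames - window + 1, stride)]


def merge_predictions(windows, labels):
    """Merge consecutive windows that share a label into (start, end, label)."""
    segments = []
    for (start, end), label in zip(windows, labels):
        last = segments[-1] if segments else None
        if last and last[2] == label and start <= last[1]:
            # Same label and touching/overlapping: extend the previous segment.
            segments[-1] = (last[0], max(end, last[1]), label)
        else:
            segments.append((start, end, label))
    return segments


# Usage: labels would come from running the recognizer on each window.
windows = sliding_windows(num_frames=10, window=4, stride=2)
labels = ['a', 'a', 'b', 'b']  # stand-in predictions for illustration
segments = merge_predictions(windows, labels)
```

Because the windows overlap, neighbouring segments with different labels can overlap too. A smaller stride gives finer boundaries at the cost of more recognizer calls.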

open-mmlab / mmaction2