HYPJUDY / Decouple-SSAD

Decoupling Localization and Classification in Single Shot Temporal Action Detection
https://arxiv.org/abs/1904.07442
MIT License
96 stars, 19 forks

Using different dataset #6

Closed by sanjitjain2 5 years ago

sanjitjain2 commented 5 years ago

How do I use this on my own dataset?

HYPJUDY commented 5 years ago

You can preprocess the data yourself (see "Preprocess Data by Yourself") and then train and test the models following the process in the run-code section.

sanjitjain2 commented 5 years ago

Can you explain more about training the Base Feature Network? My dataset is different, so I cannot use the pretrained weights as mentioned. Thanks in advance.

HYPJUDY commented 5 years ago

Since the input features have already passed through a large network such as ResNet that was pretrained on other datasets (e.g., UCF101 or Kinetics), they already carry a lot of information. The Base Feature Network is mainly used to shorten the temporal length of the feature map and enlarge the receptive field, which does not require a very complex design. I adopt a Conv-pool-Conv-pool design similar to Single Shot Temporal Action Detection (SSAD). You can refer to Section 4.4, "Architectures of SSAD network", for more details.
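To make the effect of a Conv-pool-Conv-pool stack concrete, here is a minimal sketch that traces how the temporal length shrinks and the receptive field grows layer by layer. The kernel sizes, strides, "same" padding, and the 512-step input window are illustrative assumptions, not values confirmed by this repository; substitute the settings from your own configuration.

```python
# Hypothetical layer settings: (name, kernel_size, stride).
# "Same" padding is assumed, so only the stride changes the temporal length.
LAYERS = [
    ("conv1", 9, 1),
    ("pool1", 4, 4),
    ("conv2", 9, 1),
    ("pool2", 4, 4),
]


def trace(input_length, layers):
    """Return (name, temporal_length, receptive_field) after each layer."""
    length = input_length
    receptive_field = 1  # one input step sees only itself
    jump = 1             # distance in input steps between adjacent outputs
    rows = []
    for name, kernel, stride in layers:
        # A kernel of size k widens the receptive field by (k - 1) jumps.
        receptive_field += (kernel - 1) * jump
        jump *= stride
        # Ceiling division: output length under "same" padding.
        length = -(-length // stride)
        rows.append((name, length, receptive_field))
    return rows


if __name__ == "__main__":
    for name, length, rf in trace(512, LAYERS):
        print(f"{name}: length={length}, receptive_field={rf}")
```

Under these assumed settings a 512-step feature sequence is reduced to 32 steps, and each output step covers 56 input steps. This is why a shallow stack can be enough for a new dataset: the heavy lifting is done by the pretrained feature extractor, and this part only needs to be trained from scratch together with the rest of the detector.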