I am trying to train an action detection model with TSN on a custom dataset.
The original TSN paper says that, along with RGB frames, RGB differences and optical flow are fed into the model. It also divides the whole video into segments, samples a snippet from each one, and bases the video-level prediction on the combined snippet results. I couldn't find that kind of implementation here or in the mmaction repository. Could you help me understand how to implement the action detection task on a custom dataset the way the paper describes? Thanks.
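To make clear what I mean by snippet-based prediction, here is a minimal sketch of my understanding of the paper. It assumes the video is split into equal segments, one snippet is drawn per segment (random during training, centered during testing), and the per-snippet class scores are averaged. The names `sample_snippet_starts` and `segmental_consensus` are my own and hypothetical; they are not taken from this repository or from mmaction.

```python
import random


def sample_snippet_starts(num_frames, num_segments=3, snippet_len=1,
                          train=True, rng=None):
    """Return one snippet start index per segment.

    Hypothetical helper: splits the valid start positions into
    `num_segments` equal parts and picks one start from each part.
    """
    rng = rng or random
    span = num_frames - snippet_len + 1  # number of valid start positions
    if span <= 0:
        raise ValueError("video is shorter than one snippet")
    starts = []
    for k in range(num_segments):
        lo = span * k // num_segments
        # Keep at least one candidate per segment for very short videos.
        hi = max(lo + 1, span * (k + 1) // num_segments)
        if train:
            starts.append(rng.randrange(lo, hi))  # random position in segment
        else:
            starts.append((lo + hi - 1) // 2)     # center of segment
    return starts


def segmental_consensus(snippet_scores):
    """Average per-snippet class scores into one video-level score list."""
    n = len(snippet_scores)
    return [sum(column) / n for column in zip(*snippet_scores)]


if __name__ == "__main__":
    print(sample_snippet_starts(30, num_segments=3, train=False))
    print(segmental_consensus([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]))
```

Averaging is only one possible consensus function; I use it here because it is the simplest one to write down.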
When I train, I can use either RGB or flow, but I cannot train on both together. The accuracy is stuck between 60% and 70%. My config file is attached.
config.txt
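As far as I understand the paper, the RGB and flow streams are trained as separate networks and their class scores are combined only at test time. If that is right, this is a minimal sketch of the late fusion I would try. The function names are hypothetical, and the weights are placeholders that I assume would need tuning on a validation set.

```python
def fuse_scores(rgb_scores, flow_scores, rgb_weight=1.0, flow_weight=1.5):
    """Weighted sum of per-class scores from two separately trained streams.

    Hypothetical helper; the default weights are placeholders, not values
    taken from this repository.
    """
    if len(rgb_scores) != len(flow_scores):
        raise ValueError("both streams must score the same set of classes")
    return [rgb_weight * r + flow_weight * f
            for r, f in zip(rgb_scores, flow_scores)]


def predict_class(scores):
    """Return the index of the highest-scoring class."""
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    rgb = [0.6, 0.4]    # video-level scores from the RGB model
    flow = [0.2, 0.8]   # video-level scores from the flow model
    fused = fuse_scores(rgb, flow)
    print(fused, predict_class(fused))
```

Is this the intended way to combine the two modalities, or is there a supported way to train them jointly?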