sming256 / OpenTAD

OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.
Apache License 2.0

Extending to new datasets #29

Open reesekneeland opened 3 months ago

reesekneeland commented 3 months ago

Hello!

Thanks for putting together this toolbox. I have a private dataset of continuous videos (up to about 30 minutes long) covering about half a dozen different skills that I would like to recognize. I have annotated timestamps for the activities and a reasonable amount of training data, and I would like to train some of these architectures on my dataset for my task.

As I ramp up on this codebase, it seems some elements I might need are missing, most notably the feature extraction code. Can you give me a brief overview of the steps I will need to take to get these models training on my private dataset? Is this possible with the existing code in this repository?

sming256 commented 3 months ago

Thanks for your interest in this codebase.

If you want to try out the methods on your own dataset, you basically need to extract features, then train a model and tune the hyper-parameters. Since I am busy with other things these weeks, I will try to upload the feature extraction code by the end of this month. In the meantime, you can refer to this code for feature extraction.

I will also create a step-by-step guide for training and evaluating on your own dataset, but it may not be uploaded until next month. Sorry for the delay.
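The feature-extraction step mentioned above usually means encoding short clips of each video with a pretrained backbone in a sliding-window fashion. Here is a minimal, hedged sketch of that pattern; `DummyEncoder` is a stand-in for a real pretrained model (e.g. an I3D network with loaded weights), and the window parameters are illustrative only:

```python
import torch
import torch.nn as nn

class DummyEncoder(nn.Module):
    """Placeholder for a pretrained video backbone (e.g. I3D).
    A real extractor would load pretrained weights instead."""
    def __init__(self, feat_dim=2048):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d(1)          # global spatio-temporal pooling
        self.proj = nn.Conv3d(3, feat_dim, kernel_size=1)

    def forward(self, clip):                          # clip: (B, 3, T, H, W)
        pooled = self.pool(clip)                      # (B, 3, 1, 1, 1)
        return self.proj(pooled).flatten(1)           # (B, feat_dim)

def extract_features(frames, encoder, clip_len=16, stride=4):
    """Slide a window over decoded frames and encode each clip.

    frames: (3, T, H, W) float tensor of one video's frames.
    Returns: (num_clips, feat_dim) feature tensor, saved per video.
    """
    feats = []
    t = frames.shape[1]
    with torch.no_grad():
        for start in range(0, max(t - clip_len, 0) + 1, stride):
            clip = frames[:, start:start + clip_len].unsqueeze(0)
            feats.append(encoder(clip))
    return torch.cat(feats, dim=0)

frames = torch.randn(3, 64, 112, 112)   # stand-in for a decoded video
encoder = DummyEncoder()
feats = extract_features(frames, encoder)
print(feats.shape)  # torch.Size([13, 2048])
```

The resulting per-video feature arrays are what the feature-based configs in this repo consume in place of raw frames.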

pbhfcycssjlmm commented 2 months ago

To verify the accuracy of the annotation files in a self-made dataset, run TemporalMaxer first.

In the Thumos14-I3D-Pad mode, factors such as duration, segment, fps, stride, and offset interact with each other. During random cropping, incorrect annotations can produce an empty gt_segment. The other models do not report an error for this; only TemporalMaxer is sensitive enough to catch it.

The general path from annotation errors to training errors: the prepare_targets function in opentad/models/dense_heads/temporalmaxer_head.py calls self.assigner.assign, which reaches the _assign function in opentad/models/losses/assigner/anchor_free_simota_assigner.py, specifically dynamic_k_matching. If wrong annotations leave no gt_segment in the cropped result, the del statement there throws an error.
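Rather than waiting for TemporalMaxer to crash, you can sanity-check the annotation file up front. A minimal sketch, assuming an ActivityNet-style JSON layout (the exact schema in your dataset may differ); the `ann` dict below is a hypothetical example:

```python
def validate_annotations(ann, min_len=0.0):
    """Flag segments that are reversed, out of range, or too short.
    Assumed layout: {"database": {video_id: {"duration": float,
        "annotations": [{"segment": [start, end], "label": str}]}}}
    Returns a list of (video_id, index, start, end, duration) problems.
    """
    problems = []
    for vid, info in ann["database"].items():
        duration = info["duration"]
        for i, a in enumerate(info["annotations"]):
            s, e = a["segment"]
            if not (0 <= s < e <= duration) or (e - s) <= min_len:
                problems.append((vid, i, s, e, duration))
    return problems

# Hypothetical annotation dict: the second segment is reversed (5.0 > 4.0).
ann = {"database": {"v1": {"duration": 10.0,
                           "annotations": [{"segment": [1.0, 3.0], "label": "a"},
                                           {"segment": [5.0, 4.0], "label": "b"}]}}}
print(validate_annotations(ann))  # [('v1', 1, 5.0, 4.0, 10.0)]
```

Any segment flagged here is a candidate for producing an empty gt_segment after cropping.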

Additionally, for the benefit of future readers, the code locations relevant to the cropping are as follows:

In opentad/datasets/thumos.py, ThumosPaddingDataset executes the pipeline; the configuration file configs/_base_/datasets/thumos-14/features_i3d_pad.py specifies RandomTrunc there, and that transform's code can be found in opentad/datasets/transforms/loading.py.
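To see why cropping plus a bad annotation yields an empty gt_segment, here is a simplified illustration of what a RandomTrunc-style transform does (not OpenTAD's actual implementation; function name and parameters are made up for this sketch):

```python
def truncate_segments(gt_segments, win_start, win_end, min_overlap=0.0):
    """Clip ground-truth segments to a crop window [win_start, win_end).
    Segments falling entirely outside the window are dropped, which is
    how a wrong annotation can leave an empty gt_segment list and
    trip up assigners that assume at least one ground truth exists.
    """
    kept = []
    for s, e in gt_segments:
        cs, ce = max(s, win_start), min(e, win_end)
        if ce - cs > min_overlap:
            # shift the surviving segment into window coordinates
            kept.append((cs - win_start, ce - win_start))
    return kept

# A segment inside the window survives; one outside is dropped.
print(truncate_segments([(2.0, 5.0), (20.0, 25.0)], 0.0, 10.0))  # [(2.0, 5.0)]
# With a wrong annotation, everything can land outside the crop:
print(truncate_segments([(50.0, 60.0)], 0.0, 10.0))  # []
```

An empty return value here corresponds to the empty gt_segment case that makes TemporalMaxer's assigner fail.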



256-7421142 commented 2 weeks ago

> Thanks for your interest in this codebase.
>
> If you want to try out the methods on your own dataset, you basically need to extract features, then train a model and tune the hyper-parameters. Since I am busy with other things these weeks, I will try to upload the feature extraction code by the end of this month. In the meantime, you can refer to this code for feature extraction.
>
> I will also create a step-by-step guide for training and evaluating on your own dataset, but it may not be uploaded until next month. Sorry for the delay.

Are these uploaded to the code repository now?

256-7421142 commented 2 weeks ago

> Thanks for your interest in this codebase.
>
> If you want to try out the methods on your own dataset, you basically need to extract features, then train a model and tune the hyper-parameters. Since I am busy with other things these weeks, I will try to upload the feature extraction code by the end of this month. In the meantime, you can refer to this code for feature extraction.
>
> I will also create a step-by-step guide for training and evaluating on your own dataset, but it may not be uploaded until next month. Sorry for the delay.

Hello, I have extracted the features for my own dataset according to this code, but how do I use them during training?