sming256 / OpenTAD

OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.
Apache License 2.0

Can you provide an inference example for a video? #30

Open zgsxwsdxg opened 1 month ago

zgsxwsdxg commented 1 month ago

Hello! First of all, thank you for your constructive work on video action detection. I have trained an AdaTAD model on my own data with your framework, but I can't deploy it. I want to run inference on a single machine with a single GPU. Can you provide some ideas or a code example for running inference on a single video?

Thank you very much! Looking forward to hearing from you!

sming256 commented 1 month ago

Thanks for your interest in our codebase.

If you want to run inference on the test set, you can use the following command (with 1 GPU):
torchrun --nnodes=1 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 tools/test.py configs/adatad/xxxx.py --checkpoint exps/xxxx.pth

If you want to infer only one video, a temporary solution is to build a dataset with only one video. I will also create an API for this purpose in the next version of OpenTAD. However, it may be released next month.
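As a rough sketch of the one-video workaround: you can write a single-entry annotation file and point the test annotation path in your config at it. The layout below is only an assumption modeled on ActivityNet-style annotations; please copy the exact keys and values from the annotation file you trained with.

```python
import json

# Single-entry annotation in an ActivityNet-style layout (placeholder).
# Copy the exact keys and values from the annotation file used during
# training; the video name and duration below are only examples.
annotation = {
    "database": {
        "my_video": {
            "subset": "test",
            "duration": 120.0,   # seconds, placeholder
            "annotations": [],   # can stay empty for inference-only use
        }
    }
}

with open("my_single_video.json", "w") as f:
    json.dump(annotation, f, indent=2)
```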

zgsxwsdxg commented 1 month ago

First of all, thank you for your patience and prompt reply. I looked at the tools/test.py code: it uses torch.distributed to load the data and the model, run inference, and then collect and merge the results. That is hard to fit into my actual deployment scenario, which is as follows: I want to deploy multiple AdaTAD models across my GPUs, and when each model is loaded I want to be able to specify its gpu_id. For single-model inference there should be no need for torch.distributed; I just need to read one video file, preprocess it, run the model, and get the result back.
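Concretely, what I have in mind is something like the sketch below. The model-building step itself is left out because that is OpenTAD-internal (whatever tools/test.py constructs from the config); I only want to confirm that plain model.to(f"cuda:{gpu_id}") plus torch.no_grad() is enough, without torch.distributed:

```python
import torch

def load_detector(model: torch.nn.Module, checkpoint_path: str, gpu_id: int = 0):
    """Pin one detector to a specific GPU and load its weights.
    `model` is whatever tools/test.py builds from the config file; it is
    passed in here because the exact builder call is OpenTAD-internal."""
    device = torch.device(f"cuda:{gpu_id}")
    state = torch.load(checkpoint_path, map_location="cpu")
    model.load_state_dict(state.get("state_dict", state))
    return model.to(device).eval(), device

@torch.no_grad()
def run_one_clip(model: torch.nn.Module, clip: torch.Tensor, device: torch.device):
    """Run one preprocessed clip of shape (C, T, H, W) through the detector.
    The real input format (plain tensor vs. dict with video metadata) has
    to match whatever the OpenTAD detector's forward actually expects."""
    return model(clip.unsqueeze(0).to(device))  # add a batch dimension
```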

I need to deploy this as a service, and the torch.distributed setup makes that difficult; I don't know how to modify test.py to meet my needs. Can you provide some ideas? For example, after loading a video, what preprocessing do I need to do, how do I run the model to get the result, and what post-processing is needed at the end (see my rough guess after this paragraph)? If you have time, I would appreciate your guidance. I'm stuck at this point and under some time pressure.
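To make the question concrete, my rough guess at the single-video pipeline is below: decode and uniformly sample frames, resize and normalize them the way the training config does, stack them into a clip tensor, run the detector, then threshold and NMS the predicted segments. It assumes decord for decoding, and the frame count, resolution, normalization values, and the plain 1D NMS are all placeholders that I would replace with whatever the AdaTAD config and its post-processing actually use:

```python
import numpy as np
import torch

from decord import VideoReader  # one possible frame decoder; any reader works

def load_clip(video_path: str, num_frames: int = 768, size: int = 160) -> torch.Tensor:
    """Decode a video, uniformly sample frames, resize, and normalize.
    num_frames, size, and the mean/std are placeholders; the real values
    must come from the training config."""
    vr = VideoReader(video_path)
    idx = np.linspace(0, len(vr) - 1, num_frames).astype(int).tolist()
    frames = vr.get_batch(idx).asnumpy()                 # (T, H, W, C), uint8
    clip = torch.from_numpy(frames).permute(3, 0, 1, 2).float() / 255.0  # (C, T, H, W)
    clip = torch.nn.functional.interpolate(
        clip, size=(size, size), mode="bilinear", align_corners=False)
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1, 1)  # placeholder stats
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1, 1)
    return (clip - mean) / std

def nms_1d(segments: torch.Tensor, scores: torch.Tensor, iou_thr: float = 0.5):
    """Plain 1D NMS over (start, end) segments; a stand-in for whatever
    post-processing (e.g. Soft-NMS) the OpenTAD config specifies."""
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0]
        keep.append(i.item())
        if order.numel() == 1:
            break
        rest = order[1:]
        inter = (torch.minimum(segments[rest, 1], segments[i, 1])
                 - torch.maximum(segments[rest, 0], segments[i, 0])).clamp(min=0)
        union = ((segments[rest, 1] - segments[rest, 0])
                 + (segments[i, 1] - segments[i, 0]) - inter)
        order = rest[inter / union <= iou_thr]
    return keep
```

Is this the right overall shape, and where should I look in the config to get the real preprocessing and post-processing parameters?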

Thank you again and look forward to your reply!