gjhhust / YOLOFT

A code base for the official XS-VID dataset baseline method YOLOFT
GNU Affero General Public License v3.0

Using the provided model file to test the picture has an error. #5

Closed: assertdebug closed this issue 4 months ago

assertdebug commented 4 months ago

[image] [image]

assertdebug commented 4 months ago

Sorry, I forgot to specify the input size.

assertdebug commented 4 months ago

I used the provided model file and cannot get the correct result.

assertdebug commented 4 months ago

I used the provided model file and cannot get the correct result:

[image]

gjhhust commented 4 months ago

Running inference directly with the yolo command is not currently supported; please use a script to get the results. YOLOFT is designed for video inference, so you need to feed the model through build_stream_dataloader for video detection. For reference, see https://github.com/gjhhust/YOLOFT/blob/main/tools/test_flowft_show.py
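As a rough illustration of what feeding a model frame by frame looks like, the sketch below walks through each video in temporal order and flags the first frame of every video. It does not use the repository's build_stream_dataloader, whose signature is not shown in this thread; `stream_frames`, its arguments, and the file names are hypothetical stand-ins.

```python
from typing import Dict, Iterator, List, Tuple


def stream_frames(
    videos: Dict[str, List[str]]
) -> Iterator[Tuple[str, int, str, bool]]:
    """Yield (video_id, frame_index, frame, is_first_frame) in temporal order.

    Hypothetical helper: `videos` maps a video id to its ordered frame paths.
    """
    for video_id, frames in videos.items():
        for index, frame in enumerate(frames):
            # is_first_frame lets a consumer reset any per-video state.
            yield video_id, index, frame, index == 0


# Made-up frame paths, used only to show the iteration order.
videos = {"vid_a": ["a0.jpg", "a1.jpg"], "vid_b": ["b0.jpg"]}
stream = list(stream_frames(videos))
for video_id, index, frame, is_first in stream:
    print(video_id, index, frame, is_first)
```

A consumer of such a stream would typically clear any temporal state whenever `is_first_frame` is true, so that frames from different videos are not mixed.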

Yu-zhengbo commented 3 months ago

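Hello, I looked at your code, and build_stream_dataloader also seems to return images one frame at a time. Your paper describes the training stage as taking multi-frame input, so I would like to ask: have I misunderstood something?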

gjhhust commented 2 months ago

I have written prediction scripts for images and videos. For more information, please read readme.md.

Also, sorry, I only just noticed your question. The paper describes streaming input, i.e., only one image is input at each point in time (consistent with how images are acquired in the real world).
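To make the idea of streaming input concrete, here is a minimal sketch of one common way a model can receive a single frame per call while still using information from earlier frames: it keeps an internal memory that is updated at every step. This is not YOLOFT's actual mechanism; the `StreamingDetector` class, its `momentum` parameter, and the exponential moving average are illustrative assumptions standing in for whatever temporal fusion the real model performs.

```python
from typing import List, Optional


class StreamingDetector:
    """Toy stand-in for a video model: one frame per call, memory across calls."""

    def __init__(self, momentum: float = 0.5) -> None:
        self.momentum = momentum
        self.memory: Optional[List[float]] = None

    def reset(self) -> None:
        """Forget past frames, e.g. at the start of a new video."""
        self.memory = None

    def step(self, frame_features: List[float]) -> List[float]:
        """Blend the current frame's features into the running memory."""
        if self.memory is None:
            # First frame of a video: there is no history to blend with.
            self.memory = list(frame_features)
        else:
            m = self.momentum
            self.memory = [
                m * old + (1 - m) * new
                for old, new in zip(self.memory, frame_features)
            ]
        return self.memory


detector = StreamingDetector(momentum=0.5)
frames = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # made-up per-frame features
outputs = [detector.step(frame) for frame in frames]
print(outputs)
```

Each call to `step` sees only the current frame, yet the result depends on the whole history through `memory`, which is how a one-frame-at-a-time input can still carry multi-frame information.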