alankbi / detecto

Build fully-functioning computer vision models with PyTorch
https://detecto.readthedocs.io/
MIT License
612 stars 107 forks

Works perfectly on image input but inaccurate on video input #112

Open ghost opened 2 years ago

ghost commented 2 years ago

Describe the bug: The model is very accurate on input images but much less accurate on video input.

Code and Data: (two images attached in the original issue)

Code is identical to the Detecto Colab demo

Environment:
- OS: Windows 7, but I'm running on Colab
- torch.__version__ = 1.11.0+cu113
- torchvision.__version__ = 0.12.0+cu113

Additional context: It's odd, because I've tried several Coca-Cola images besides the ones I uploaded and the detections were perfect. The model just doesn't detect it when the input is a video file.

Is there something wrong with the video input method?

alankbi commented 2 years ago

There shouldn't be much of a difference between image and video inference, since the underlying model is the same in both cases. Maybe try diversifying your set of training images to include more far-away or varied shots? From the image above, it looks like your training images are cleaner (less background, no shadows, etc.) than the video footage, which could explain the issue you're having.
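
One possible way to act on this suggestion is to pull frames from the problem video itself and label them as additional training images, so the training set looks more like the footage. The following is only a minimal sketch under stated assumptions: it assumes OpenCV (`cv2`) is installed, and the function names `evenly_spaced_indices` and `save_sample_frames`, the output file naming, and the default of 50 frames are all hypothetical, not part of Detecto.

```python
from pathlib import Path


def evenly_spaced_indices(total_frames: int, count: int) -> list[int]:
    """Return up to `count` frame indices spread evenly across the video."""
    if total_frames <= 0 or count <= 0:
        return []
    count = min(count, total_frames)  # never ask for more frames than exist
    step = total_frames / count
    return [int(i * step) for i in range(count)]


def save_sample_frames(video_path: str, out_dir: str, count: int = 50) -> int:
    """Save sampled frames as JPEGs for labeling; return how many were written.

    Hypothetical helper: the file naming and the default count are assumptions.
    """
    import cv2  # imported lazily so the index helper works without OpenCV

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    video = cv2.VideoCapture(video_path)
    total = int(video.get(cv2.CAP_PROP_FRAME_COUNT))

    saved = 0
    for index in evenly_spaced_indices(total, count):
        # Seek to the chosen frame, then decode it.
        video.set(cv2.CAP_PROP_POS_FRAMES, index)
        ok, frame = video.read()
        if not ok:
            continue  # skip frames that fail to decode
        cv2.imwrite(str(Path(out_dir) / f"frame_{index:06d}.jpg"), frame)
        saved += 1

    video.release()
    return saved
```

For example, `save_sample_frames("input.mp4", "extra_training_images")` (both paths are placeholders) would write up to 50 frames that can then be labeled and added to the existing training set before retraining.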