microsoft / CameraTraps

PyTorch Wildlife: a Collaborative Deep Learning Framework for Conservation.
https://cameratraps.readthedocs.io/en/latest/
MIT License

Training megadetector on my own images #290

Closed barlavi1 closed 2 years ago

barlavi1 commented 2 years ago

Hello, thank you so much for this great program!

I have thousands of images from camera traps, and while MegaDetector works wonderfully on some landscapes, it fails on others (it detects trees and stones as animals).

What is a useful way to train MegaDetector on my own set of images?

Cheers, Bar

barlavi1 commented 2 years ago

For example (both images are without animals):

[Two attached images: detections_animal_PICT0042, detections_animal_PICT0048]

agentmorris commented 2 years ago

Thanks for your interest in MegaDetector!

There isn't really a way to fine-tune MegaDetector on your data. In theory, you could draw bounding boxes on lots of animals on your own images, and use the checkpoint that we provide to resume training, but this would be a lot of work, and I don't have high expectations for the results. AFAIK no one has done this.

If you want to train a model to handle some of those false positives, I would recommend a slightly different approach, where you crop all the high-confidence detections from MegaDetector into separate (small) images, label them as animal/false-trigger, and train a new image classifier with just those two classes. I think that would work pretty well, especially if you are training a model to handle just a specific set of cameras.
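As a rough illustration of the cropping step, here is a minimal sketch that turns MegaDetector results into pixel crop boxes. It assumes the results are a dict in the MegaDetector batch output layout (an `images` list whose entries have `file` and `detections`, with each `bbox` stored as normalized `[x_min, y_min, width, height]`) and that category `"1"` means animal. The function name, the `image_sizes` lookup, and the 0.8 threshold are hypothetical choices made for this example, not part of any official tool.

```python
def high_confidence_crop_boxes(results, image_sizes, conf_threshold=0.8,
                               animal_category="1"):
    """Return (file, (left, upper, right, lower)) pixel boxes for confident detections.

    results: dict in the assumed MegaDetector output layout.
    image_sizes: hypothetical lookup {file: (width_px, height_px)}.
    """
    boxes = []
    for image in results["images"]:
        width, height = image_sizes[image["file"]]
        # "detections" may be missing or None for images that failed to load.
        for det in image.get("detections") or []:
            if det["category"] != animal_category or det["conf"] < conf_threshold:
                continue
            x, y, w, h = det["bbox"]  # normalized to the range 0..1
            left = max(0, round(x * width))
            upper = max(0, round(y * height))
            right = min(width, round((x + w) * width))
            lower = min(height, round((y + h) * height))
            if right > left and lower > upper:  # skip degenerate boxes
                boxes.append((image["file"], (left, upper, right, lower)))
    return boxes
```

Each box is in (left, upper, right, lower) order, which is the order an image library such as Pillow expects for cropping. The saved crops would then be sorted by hand into animal and false-trigger folders before training the two-class classifier.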

All of that said... in our experience, this isn't usually worth it. There are always some false detections, and if they are very infrequent relative to the number of empty images you have, it may not be worth all the hassle of a new machine learning project. So before you start on any training, I would take a look at the total number of images you have, the % of blanks, and the % of MD false positives, and decide whether it's really worth the effort of training a new model.
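To make that check concrete, here is a small sketch of the arithmetic. The function name and the example counts are made up for illustration; the counts would come from reviewing a sample of your own results.

```python
def false_positive_summary(total_images, blank_images, false_positive_images):
    """Summarize how common blanks and false positives are.

    false_positive_images: blank images on which the detector reported an animal.
    All three counts are hypothetical inputs you would tally yourself.
    """
    return {
        "pct_blank": 100.0 * blank_images / total_images,
        "pct_false_positive": 100.0 * false_positive_images / total_images,
        "pct_blanks_with_false_positive": 100.0 * false_positive_images / blank_images,
    }


# Illustrative numbers only: 10,000 images, 7,000 blanks, 350 false positives.
summary = false_positive_summary(10_000, 7_000, 350)
```

If only a few percent of blanks come back as false positives, reviewing them by hand may take less time than labeling crops and training a classifier.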

I would also take a look at a process we call "repeat detection elimination". It's a semi-automated process that makes it very quick to get rid of all of those rocks/sticks/etc. in a MegaDetector results set. It's a bit of an art form and takes some practice, so I only recommend climbing that learning curve when you have lots of detections like this, but it can be very efficient once you get the hang of it.
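The core idea can be sketched as follows: a box that shows up in nearly the same place on many images from the same camera is probably a rock or stick rather than an animal. This is only an illustration of that idea, not the actual repeat detection elimination code. It assumes the MegaDetector output layout with normalized `[x_min, y_min, width, height]` boxes, treats each folder as one camera, and uses hypothetical function names and thresholds. The real process also includes a manual review step, which this sketch leaves out.

```python
import os
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two [x_min, y_min, width, height] boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def find_repeat_candidates(results, iou_threshold=0.9, min_repeats=3,
                           conf_threshold=0.2):
    """Return (file, bbox) pairs whose box recurs on several images in one folder."""
    by_folder = defaultdict(list)
    for image in results["images"]:
        folder = os.path.dirname(image["file"])  # assumption: one folder per camera
        for det in image.get("detections") or []:
            if det["conf"] >= conf_threshold:
                by_folder[folder].append((image["file"], det["bbox"]))

    candidates = []
    for dets in by_folder.values():
        for file_a, box_a in dets:
            # Distinct images (including file_a itself) with a near-identical box.
            matching_files = {file_b for file_b, box_b in dets
                              if iou(box_a, box_b) >= iou_threshold}
            if len(matching_files) >= min_repeats:
                candidates.append((file_a, box_a))
    return candidates
```

The pairwise comparison is quadratic in the number of detections per folder, which is fine for a sketch but would need a smarter grouping for very large folders. The returned candidates are meant to be reviewed by a person before anything is discarded, since an animal that sits still across a burst of images would also match.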

Hope that helps!

-Dan