orcasound / aifororcas-livesystem

Real-time AI-assisted killer whale notification system (model and moderator portal)
http://orcahello.ai4orcas.net/
MIT License

Retrain model(s) and compare performance #105

Open scottveirs opened 2 years ago

scottveirs commented 2 years ago

As of the 2024 Hackathon, we have about 4,000 false-positive candidates that have been moderated by me, Dave (@dbainj1), or Val. Each candidate is 60 seconds long and contains a number of model predictions (each with a start time and confidence level).

Ideally, we would iteratively retrain the model using these false positives. We have aspired to retrain annually during the Microsoft hackathons, but may eventually do so more frequently, for example monthly or whenever we reach a threshold of new false positives. Someday we may partition the false positives by location to fine-tune models for hydrophones in specific geographic locations.
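The retraining schedule and per-location partitioning described above could be sketched roughly as follows. This is only an illustration: the `Candidate` fields, the `threshold` of 500, and the 30-day `max_interval` are assumptions for the example, not values taken from the live system.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Candidate:
    """One moderated 60-second candidate (field names are hypothetical)."""
    candidate_id: str
    location: str          # hydrophone location the audio came from
    moderated_at: datetime


def should_retrain(false_positives, last_retrain, now,
                   threshold=500, max_interval=timedelta(days=30)):
    """Return True if enough new false positives have accumulated since the
    last retraining run, or if max_interval has elapsed (both values assumed)."""
    recent = [c for c in false_positives if c.moderated_at > last_retrain]
    return len(recent) >= threshold or (now - last_retrain) >= max_interval


def partition_by_location(candidates):
    """Group candidates by hydrophone location for per-site fine-tuning."""
    groups = defaultdict(list)
    for candidate in candidates:
        groups[candidate.location].append(candidate)
    return dict(groups)
```

A scheduled job could call `should_retrain` with the moderated false positives and, when it returns True, pass each group from `partition_by_location` to whatever training script is used.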

Here are some resources and steps that could be used to implement a retraining solution:

api/detections/false-positives
api/detections/confirmed

The Swagger API documentation allows you to explore the data schema and JSON results of a query:

[Screenshot 2024-09-17 at 3:27:22 PM]

The same workflow could also query the API for new candidates that moderators have confirmed are true positives.
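As a rough sketch of that workflow, the two endpoints listed above could be queried and their results tagged with a label for training. The base URL is left as a parameter because it is not given here, any query parameters are hypothetical and should be checked against the Swagger documentation, and the code assumes each endpoint returns a JSON list of records, which may not match the real response shape.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Endpoint paths as listed in this issue; the labels are our own naming.
ENDPOINTS = {
    "false_positive": "api/detections/false-positives",
    "confirmed": "api/detections/confirmed",
}


def build_url(base_url, endpoint, params=None):
    """Join the base URL and endpoint, appending optional query parameters."""
    url = urljoin(base_url.rstrip("/") + "/", endpoint)
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_json(url):
    """Fetch a URL and parse the response body as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def collect_labeled(base_url, params=None, fetch=fetch_json):
    """Return (label, record) pairs from both endpoints.

    Assumes each endpoint returns a JSON list; adjust if the real API
    wraps results in an object or paginates them.
    """
    labeled = []
    for label, endpoint in ENDPOINTS.items():
        payload = fetch(build_url(base_url, endpoint, params))
        for record in payload:
            labeled.append((label, record))
    return labeled
```

The `fetch` argument is injectable so the function can be exercised with canned data before pointing it at the live API.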

scottveirs commented 2 months ago

See this May 2024 discussion post by Bret regarding model performance/comparison options...