dmitryduev opened 3 years ago
These thoughts are fine. The biggest question, though, is one of time and effort. How inefficient is the current system? The suggested system will need effort on multiple fronts. How about discussing a possible timeline at the next ML meeting?
DeepStreaks has been a very successful system and was a big leap forward back in 2019. However, I now consider it largely obsolete and inadequate in terms of code/infrastructure quality, the DL models themselves, and the overall setup.
Below are my thoughts on a possible DeepStreaks v2.
I see two alternatives:
Similar to how drb scoring is implemented for transients, it would make sense to run DeepStreaks on the IPAC side and add the classifications to the packets. The stream would then be consumed, filtered using the classifications (i.e., ditching >99.5% of the stream, although we could then also plug the stream into Kowalski and save it all, similar to the regular alert stream), and human-vetted on Fritz, with the potential to leverage the unparalleled capabilities (e.g., for follow-up) that it has to offer. For a reference end-to-end implementation, see this PR on Kowalski.

In both cases, adequate infrastructure will be desirable: MLOps, CI/CD with GH Actions, code review, and all things DevOps.
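To make the consume-and-filter step described above a bit more concrete, here is a minimal sketch. Everything specific in it is an assumption for illustration only: the packet layout (a dict with a `classifications` mapping), the `deepstreaks_score` key, and the threshold value are hypothetical and do not reflect the actual packet schema or the Kowalski implementation.

```python
from typing import Iterable, Iterator

# Hypothetical packet layout: each packet is a dict carrying a
# "classifications" mapping added upstream. The key name and the
# threshold are illustrative assumptions, not the real schema.
SCORE_KEY = "deepstreaks_score"
THRESHOLD = 0.5


def filter_packets(
    packets: Iterable[dict], threshold: float = THRESHOLD
) -> Iterator[dict]:
    """Yield only packets whose classifier score meets the threshold."""
    for packet in packets:
        score = packet.get("classifications", {}).get(SCORE_KEY)
        # Packets without a score are dropped rather than passed on.
        if score is not None and score >= threshold:
            yield packet


# Toy stream standing in for the consumed packets.
stream = [
    {"id": "a", "classifications": {SCORE_KEY: 0.01}},
    {"id": "b", "classifications": {SCORE_KEY: 0.97}},
    {"id": "c", "classifications": {}},
]
kept = [p["id"] for p in filter_packets(stream)]
```

Only the packets that survive this cut would be forwarded for human vetting. Whether unscored packets should be dropped or kept for inspection is a design choice that this sketch settles arbitrarily.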