Thoughts on DeepStreaks v2

DeepStreaks has been a very successful system and was a big leap forward back in 2019. However, I now consider it largely obsolete and inadequate in terms of code/infrastructure quality, the DL models themselves, and the overall setup.

Below are my thoughts on a possible DeepStreaks v2.

I see two alternatives:

A Tails-like system.
- It would utilize a similar architecture and operate on (tessellated) full-frame image triplets (SCI, REF, DIFF).
- It would detect real streaks and output PSF-fit-like parameters directly.
- This would require some major effort and would be quite computationally expensive (as is Tails). Data would need to be collected and labeled similar to the way it was done for Tails.
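
To illustrate the tessellation step mentioned above, here is a minimal sketch of how a full frame could be cut into fixed-size tiles, with the same boxes applied to the SCI, REF, and DIFF images of a triplet. The function names, the tile size, and the overlap handling are assumptions made for illustration; they are not taken from Tails or DeepStreaks.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (y0, x0, y1, x1), end-exclusive


def tile_origins(length: int, tile: int, overlap: int = 0) -> List[int]:
    """Start offsets of tiles covering [0, length) along one axis.

    Hypothetical helper: the last tile is shifted back so that it ends
    exactly at the frame edge instead of running past it.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if tile > length:
        raise ValueError("tile must not be larger than the frame")
    step = tile - overlap
    origins = list(range(0, length - tile + 1, step))
    if origins[-1] != length - tile:
        origins.append(length - tile)
    return origins


def tessellate(height: int, width: int, tile: int, overlap: int = 0) -> List[Box]:
    """Tile boxes covering a (height, width) frame in row-major order."""
    return [
        (y, x, y + tile, x + tile)
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


# Illustrative frame and tile sizes only.
boxes = tessellate(600, 512, 256)
```

Each box would then be used to slice all three images of a triplet identically, so that the model sees co-aligned SCI, REF, and DIFF tiles.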

If the current streak detection algorithm is kept intact (or replaced with a superior one, but kept external), then I suggest treating the detections as a "streak alert stream", similar to how transients are handled.
- Each "packet" would contain the fitted streaked PSF params, metadata, and the cutouts (SCI, REF, and DIFF, rather than just the DIFF as is the case at the moment).
- Similar to how the drb scoring is implemented for transients, it would make sense to run DeepStreaks on the IPAC side and add the classifications to the packets. The stream would then be consumed, filtered using the classifications (i.e. ditching >99.5% of the stream, although we could also plug the stream into Kowalski and save it all, as is done for the regular alert stream), and human-vetted on Fritz, with the potential to leverage the unparalleled capabilities (e.g. for follow-up) that it has to offer. For a reference end-to-end implementation, see this PR on Kowalski.
- The models themselves will most likely be simplified considerably, given that much more information will be available to the classifiers (see e.g. https://github.com/ZwickyTransientFacility/scope/pull/6), so the overall system will be significantly less computationally expensive than the existing solution.
- Overall, this option seems to require less effort, with the caveat that the system performance will be limited by the performance of the streak detection algorithm.
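
As a rough sketch of the "streak alert stream" idea, the packet contents and the classification-based filtering could look something like the following. Everything here is hypothetical: the `StreakPacket` class, its field names, the `deepstreaks_score` key, and the threshold value are placeholders for illustration, not an existing schema or the Kowalski/Fritz API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator


@dataclass
class StreakPacket:
    """Hypothetical streak alert packet; all field names are assumptions."""

    candid: str                    # unique id of the streak candidate
    fit_params: Dict[str, float]   # fitted streaked-PSF parameters
    metadata: Dict[str, object]    # exposure/field/etc. metadata
    cutouts: Dict[str, bytes]      # "SCI", "REF", and "DIFF" cutouts
    classifications: Dict[str, float] = field(default_factory=dict)


def filter_stream(
    packets: Iterable[StreakPacket],
    score_key: str = "deepstreaks_score",  # assumed classifier output name
    threshold: float = 0.5,                # assumed cut; to be tuned on real data
) -> Iterator[StreakPacket]:
    """Yield only packets whose score passes the threshold.

    Packets without a score are treated as score 0.0 and dropped.
    """
    for packet in packets:
        if packet.classifications.get(score_key, 0.0) >= threshold:
            yield packet


def make_packet(candid: str, score: float) -> StreakPacket:
    """Build a toy packet with placeholder cutouts, for demonstration only."""
    return StreakPacket(
        candid=candid,
        fit_params={"length": 10.0, "angle": 45.0},
        metadata={},
        cutouts={"SCI": b"", "REF": b"", "DIFF": b""},
        classifications={"deepstreaks_score": score},
    )


stream = [make_packet("a", 0.01), make_packet("b", 0.97), make_packet("c", 0.40)]
kept = [p.candid for p in filter_stream(stream)]
```

In this toy run only packet `"b"` survives the filter; in practice the surviving packets would be the ones passed on for human vetting.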

In both cases, adequate infrastructure will be desirable: MLOps, CI/CD with GH Actions, code review, and all things DevOps.
