lsst-uk / lasair-project-management

Event handling site for LSST:UK
Apache License 2.0

Streaming query including current candidates from the Avro packet #273

Open genghisken opened 2 years ago

genghisken commented 2 years ago

Would it be possible to alter the streaming query so that it puts the pertinent candidates directly onto the Kafka cache alongside the results of the SQL query (since they already exist in the Avro packet)?

This would be very useful for Annotators that are required to operate on the lightcurve information (e.g. only the last 30 days). At the moment we make a Streaming Query, pull out the object IDs and then use the API to pull down all the lightcurve info for each object. This works, but it's slow. Doing it via Kafka means we only need to hit the queue a single time with no API calls until we come to write the annotations back. (We can deal with duplicate candidates in the queue easily enough.)
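As a rough illustration of what an Annotator could do with such a message, here is a minimal sketch that keeps only the candidates from the last 30 days. The message layout is an assumption: the `candidates` key, and the `candid` and `jd` fields inside each entry, are hypothetical stand-ins for whatever the stream would actually carry.

```python
def recent_candidates(message, window_days=30.0):
    """Return the candidates within window_days of the newest one.

    Assumes (hypothetically) that the Kafka message is a dict with a
    "candidates" list, each entry carrying a Julian date under "jd".
    """
    candidates = message.get("candidates", [])
    if not candidates:
        return []
    latest_jd = max(c["jd"] for c in candidates)
    return [c for c in candidates if latest_jd - c["jd"] <= window_days]


# Made-up example message: one old detection and two recent ones.
message = {
    "objectId": "ZTF_example",
    "candidates": [
        {"candid": 1, "jd": 2460000.5},
        {"candid": 2, "jd": 2460040.5},
        {"candid": 3, "jd": 2460045.5},
    ],
}
recent = recent_candidates(message)
print([c["candid"] for c in recent])
```

With this shape the Annotator needs nothing beyond the queue itself until it writes the annotation back.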

RoyWilliams commented 2 years ago

That would not be difficult. The menu that goes with a filter (0: inactive, 1: streaming (email), 2: streaming (kafka)) would get another type, 3: streaming+lightcurve (kafka).
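A minimal sketch of how the extra menu value might be handled, purely for illustration. The enum, the function name `build_kafka_payload`, and the `candidates` key are all hypothetical and are not taken from the Lasair code base; only the four numeric values come from the menu described above.

```python
from enum import IntEnum


class FilterMode(IntEnum):
    """Filter menu values; 3 is the proposed addition."""
    INACTIVE = 0
    STREAMING_EMAIL = 1
    STREAMING_KAFKA = 2
    STREAMING_KAFKA_LIGHTCURVE = 3


def build_kafka_payload(mode, query_row, alert_candidates):
    """Build the message for one query hit (hypothetical helper).

    The SQL result row is always included; the candidates from the
    alert packet are attached only in the lightcurve mode.
    """
    payload = dict(query_row)
    if mode == FilterMode.STREAMING_KAFKA_LIGHTCURVE:
        payload["candidates"] = list(alert_candidates)
    return payload


# Made-up inputs for illustration.
row = {"objectId": "ZTF_example", "gmag": 18.7}
cands = [{"candid": 1, "jd": 2460000.5}]
plain = build_kafka_payload(FilterMode.STREAMING_KAFKA, row, cands)
with_lc = build_kafka_payload(FilterMode.STREAMING_KAFKA_LIGHTCURVE, row, cands)
print(sorted(plain), sorted(with_lc))
```

Under this assumption, mode 2 output stays unchanged and mode 3 only adds a key, so existing consumers would not need to change.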

genghisken commented 2 years ago

Yes - I really like this. It means that the query stays compatible with normal queries, but the streaming query has the extra ability to add lightcurves. It would definitely speed up Michael's queries greatly, provided he is aware that he's not getting the whole lightcurve, just the candidates that came with the alert.

The only thing I don't quite understand is what happens when the same object gets two alerts within (e.g.) 7 days. On day 1, one month's worth of candidates is included for an object. On day 7 most of the same candidates are included again, but the window has moved forward by 7 days. Presumably the user just gets duplicate candidates and needs to be careful to merge them...
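For what it's worth, the merge on the consumer side could look something like the sketch below. It assumes each candidate carries a unique identifier under `candid` and a Julian date under `jd`; entries without such an identifier would need a different key. The data is made up to mirror the day 1 / day 7 case.

```python
def merge_candidates(seen, incoming):
    """Merge two candidate lists, keeping one entry per candid.

    Assumes every candidate dict has a unique "candid" and a "jd".
    The first copy seen is kept; the result is sorted by time.
    """
    by_id = {c["candid"]: c for c in seen}
    for c in incoming:
        by_id.setdefault(c["candid"], c)
    return sorted(by_id.values(), key=lambda c: c["jd"])


# Day 1 alert: 30 nightly candidates. Day 7 alert: the same window
# shifted forward by 6 nights, so 24 candidates appear in both.
day1 = [{"candid": n, "jd": 2460000.5 + n} for n in range(1, 31)]
day7 = [{"candid": n, "jd": 2460000.5 + n} for n in range(7, 37)]

lightcurve = merge_candidates(day1, day7)
print(len(day1) + len(day7), "received,", len(lightcurve), "unique")
```

Keying on the identifier rather than comparing whole records means the merge still works if the two alerts carry slightly different metadata for the same detection.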