mwvgroup / pittgoogle-user-demos

GNU General Public License v3.0

Add example that pre-filters from Pub/Sub metadata #14

Open wmwv opened 1 year ago

wmwv commented 1 year ago

Show an example of a user classifier that runs based on pre-filtered input:

  1. [ ] Pre-filtering based on Pub/Sub metadata
  2. [ ] Pre-filtering based on a classification from some "standard" classifier that we're running.

troyraen commented 1 year ago

Currently those two examples would actually look the same. Our broker pipeline does not filter, it publishes every alert. Classifications are added to the alert packet and the Pub/Sub metadata (the latter specifically to make filtering via metadata doable).

We should keep in mind that filtering based on Pub/Sub metadata (i.e., creating a Pub/Sub subscription with a filter natively attached) does not really save money*. The only filtering option that would save the user money is for our broker to filter internally and publish topic(s) with only a subset of alerts.

*All throughput charges still apply for every message sent to the topic, regardless of whether it passes through the subscription's filter. Egress charges to the subscription do not apply to any message that does not pass the filter.
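
For reference, a subscription filter of this kind is just a string expression over the message attributes. Below is a minimal sketch of building one. The attribute key `classifier_label` and the value `"SNIa"` are hypothetical placeholders, not names the broker is known to publish, and `equals_any` is an illustrative helper rather than part of any library.

```python
def equals_any(key: str, values: list[str]) -> str:
    """Build a Pub/Sub filter clause: attribute `key` equals any of `values`."""
    if not values:
        raise ValueError("values must contain at least one entry")
    # Attribute values are compared as strings; alternatives are joined with OR.
    return " OR ".join(f'attributes.{key} = "{value}"' for value in values)


# Hypothetical attribute key and value, for illustration only.
filter_expr = equals_any("classifier_label", ["SNIa"])
print(filter_expr)  # attributes.classifier_label = "SNIa"
```

The resulting string would be supplied as the subscription's filter when the subscription is created.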

wmwv commented 1 year ago

  1. Pre-filtering based on RA, Dec. Don't care about classification.
  2. Pre-filtering based on SuperNNova result.

I appreciate that these look very similar. But they are different user stories.

And, in principle, you could be looking at different steps in the pipeline.

wmwv commented 1 year ago

The cost savings come not from the topic but from running your classifier on only a subset of alerts.
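
A rough sketch of that idea: check the message attributes first and call the (possibly expensive) classifier only for alerts that pass. The attribute key `classifier_label`, the value `"SNIa"`, and the helper names are all hypothetical, and plain dicts stand in for real Pub/Sub messages so the example runs on its own.

```python
from typing import Callable, Mapping, Optional


def classify_if_selected(
    attributes: Mapping[str, str],
    payload: bytes,
    classify: Callable[[bytes], str],
    key: str = "classifier_label",            # hypothetical attribute name
    wanted: frozenset = frozenset({"SNIa"}),  # hypothetical attribute values
) -> Optional[str]:
    """Run `classify` only when the message attribute `key` is in `wanted`."""
    if attributes.get(key) not in wanted:
        return None  # skipped: the classifier is never invoked
    return classify(payload)


calls = []


def toy_classifier(payload: bytes) -> str:
    """Stand-in for an expensive user classifier; records each invocation."""
    calls.append(payload)
    return "interesting"


results = [
    classify_if_selected({"classifier_label": "SNIa"}, b"alert-1", toy_classifier),
    classify_if_selected({"classifier_label": "AGN"}, b"alert-2", toy_classifier),
    classify_if_selected({}, b"alert-3", toy_classifier),
]
print(results, len(calls))  # ['interesting', None, None] 1
```

Only one of the three toy alerts reaches the classifier, which is where the compute savings would come from.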

wmwv commented 1 year ago

> The only filtering option that would save the user money is for our broker to filter internally and publish topic(s) with only a subset of alerts.

And yes, this is definitely something we should consider doing based on user demand and feedback.

wmwv commented 1 year ago

E.g., bright transients is certainly a category people might want, and would be much lower volume. Think about, e.g., I have a 4-m telescope, what can I take spectra of? Probably you want < 20 mag.

troyraen commented 1 year ago

> E.g., bright transients is certainly a category people might want, and would be much lower volume. Think about, e.g., I have a 4-m telescope, what can I take spectra of? Probably you want < 20 mag.

That's a good example, and it brings up an issue we haven't quite tackled yet: Pub/Sub metadata values are always strings. Thus you can do comparisons like "equal", "not equal", and "in list", but you cannot do ">" or "<". A solution we've discussed before is to bin the magnitudes before attaching them as metadata. Do 0.5 mag bins seem reasonable? Then you could compare against a list like [18.0, 18.5, 19.0, 19.5] to get < 20 mag. Do you think much could be gained by going down to 0.25 mag bins? This would be a simple thing to add to the broker's "tag" module.
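
A rough sketch of the binning idea, assuming each bin is labelled by its lower edge (so "19.5" covers 19.5 <= mag < 20.0). The function names, the label format, and the `brightest` bound are assumptions for illustration; none of this is taken from the broker's "tag" module.

```python
import math


def bin_magnitude(mag: float, width: float = 0.5, decimals: int = 1) -> str:
    """Return the lower edge of the bin containing `mag`, as a string label."""
    edge = math.floor(mag / width) * width
    return f"{edge:.{decimals}f}"


def bins_brighter_than(
    limit: float, brightest: float, width: float = 0.5, decimals: int = 1
) -> list[str]:
    """List the labels of all bins from `brightest` up to, but excluding, `limit`."""
    n_bins = round((limit - brightest) / width)
    return [f"{brightest + i * width:.{decimals}f}" for i in range(n_bins)]


wanted = bins_brighter_than(limit=20.0, brightest=18.0)
print(wanted)                         # ['18.0', '18.5', '19.0', '19.5']
print(bin_magnitude(19.7) in wanted)  # True
print(bin_magnitude(20.3) in wanted)  # False
```

Because the comparison list has to be finite, a bright-end bound is needed; in this sketch anything brighter than 18.0 mag would not match unless more bins are listed. For 0.25 mag bins, the same functions could be called with `width=0.25, decimals=2`.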