alecgunny opened this issue 1 year ago
We should focus on single-GPU performance to demonstrate the minimal compute requirements, possibly even showing how the model can run alongside another network like DeepClean.
One issue here is that latency is actually something of a red herring from a compute standpoint, since our latency is bottlenecked by the 1 s required for integration. We can either:
I am noting down a discussion I had with @EthanMarx in the local group meeting about computing VT (sensitive volume-time) for certain representative mass values. The procedure is:
I really like this idea since it will ensure we have enough samples to make a measurement at a given mass bin.
This makes sense, but this issue refers specifically to "performance" (that unfortunately overloaded word) in the sense of NN inference latency and throughput. It does have implications for measurements of the quality of the NN's predictions (e.g. VT), since those measurements may scale with the inference sampling rate, which determines your online throughput. But it might be better to make this note somewhere like #271, or even to create a separate issue for simplifying and accelerating the VT calculation with this simplified method.
Or better yet, maybe an issue for "Figures of Merit" where we can start sharing ideas.
Yeah this was the wrong issue to discuss this in. I'll migrate it.
Part of the appeal of using neural networks for real-time detection is their comparatively small compute footprint. Any publication on their efficacy should demonstrate this explicitly by showing how throughput trades off against latency at different batch sizes (samples that arrive earlier in a batch incur extra queuing latency compared to samples that arrive later, since they have to wait for the batch to fill before inference starts).
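To make that tradeoff concrete, here's a minimal sketch of the arithmetic (the function name and the timing inputs are hypothetical illustrations, not measured numbers from our pipeline):

```python
def batching_tradeoff(batch_size, inference_rate_hz, batch_time_s):
    """Return (throughput, worst-case latency) for a given batch size.

    inference_rate_hz: rate at which new input windows are produced
    batch_time_s: wall-clock time to run inference on one full batch
    """
    # larger batches improve throughput (more samples per unit compute)...
    throughput = batch_size / batch_time_s
    # ...but the first sample in the batch must wait for the remaining
    # batch_size - 1 samples to arrive before inference even starts
    queue_latency_s = (batch_size - 1) / inference_rate_hz
    worst_case_latency_s = queue_latency_s + batch_time_s
    return throughput, worst_case_latency_s


# e.g. batch of 8 windows at a 16 Hz inference sampling rate,
# assuming 50 ms of GPU time per batch
throughput, latency = batching_tradeoff(8, 16, 0.05)
```

Sweeping `batch_size` with measured `batch_time_s` values would produce exactly the throughput-vs-latency curve described above.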
In production, the required throughput is actually just determined by your inference sampling rate, so it's also worth showing performance (however you choose to define it) as a function of inference sampling rate to give a full illustration of the tradeoffs involved.
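One way to frame this: real-time operation requires throughput at least equal to the inference sampling rate, so for any assumed timing model you can solve for the smallest batch size that keeps up. A sketch under a hypothetical linear timing model (the `overhead_s` and `per_sample_s` coefficients are illustrative assumptions, not benchmarks):

```python
def min_realtime_batch_size(
    inference_rate_hz, overhead_s=0.02, per_sample_s=0.002, max_batch=256
):
    """Smallest power-of-two batch size whose throughput keeps up with the
    stream, assuming a linear timing model t(B) = overhead + per_sample * B.
    Returns None if no batch size up to max_batch suffices."""
    batch_size = 1
    while batch_size <= max_batch:
        batch_time_s = overhead_s + per_sample_s * batch_size
        if batch_size / batch_time_s >= inference_rate_hz:
            return batch_size
        batch_size *= 2
    return None
```

Note that under this model throughput saturates at `1 / per_sample_s`, so beyond some inference sampling rate no batch size is sufficient on a single device; that saturation point is part of the tradeoff picture worth showing.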