alecgunny opened this issue 1 year ago
We should focus on single-GPU performance to demonstrate the minimal compute requirements, possibly even showing how the model can run alongside another network like DeepClean.
One issue here is that latency is actually something of a red herring from a compute standpoint, since our latency is bottlenecked by the 1 s required for integration. We can either:
I am noting down a discussion I had with @EthanMarx in the local group meeting about computing VT (sensitive volume-time) for certain representative mass values. The procedure is:
I really like this idea since it will ensure we have enough samples to make a measurement at a given mass bin.
This makes sense, but this issue refers specifically to "performance" (that unfortunately overloaded word) in the sense of NN inference latency and throughput. It does have implications for measurements of the quality of the NN's predictions (e.g. VT), since those measurements may scale with the inference sampling rate, which determines your online throughput. But it might be better to make this note somewhere like #271, or even to create a separate issue for simplifying and accelerating the VT calculation with this simplified method.
Or better yet, maybe an issue for "Figures of Merit" where we can start sharing ideas.
Yeah this was the wrong issue to discuss this in. I'll migrate it.
Part of the appeal of using neural networks for real-time detection is their comparatively small compute footprint. Any publication on their efficacy should demonstrate this explicitly by showing how throughput trades off against latency at different batch sizes (samples that arrive earlier in a batch incur extra queuing latency compared to samples that arrive later, since they have to wait for the batch to fill before inference starts).
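To make that tradeoff concrete, here's a minimal sketch of the arithmetic (the function name and the timing inputs are hypothetical illustrations, not measured numbers from our pipeline):

```python
def batching_tradeoff(batch_size, inference_rate_hz, batch_time_s):
    """Return (throughput, worst-case latency) for a given batch size.

    inference_rate_hz: rate at which new input windows are produced
    batch_time_s: wall-clock time to run inference on one full batch
    """
    # larger batches improve throughput (more samples per unit compute)...
    throughput = batch_size / batch_time_s
    # ...but the first sample in the batch must wait for the remaining
    # batch_size - 1 samples to arrive before inference even starts
    queue_latency_s = (batch_size - 1) / inference_rate_hz
    worst_case_latency_s = queue_latency_s + batch_time_s
    return throughput, worst_case_latency_s


# e.g. batch of 8 windows at a 16 Hz inference sampling rate,
# assuming 50 ms of GPU time per batch
throughput, latency = batching_tradeoff(8, 16, 0.05)
```

Sweeping `batch_size` with measured `batch_time_s` values would produce exactly the throughput-vs-latency curve described above.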
In production, the required throughput is actually just determined by your inference sampling rate, so it's also worth showing performance (however you choose to define it) as a function of inference sampling rate to give a full illustration of the tradeoffs involved.
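One way to frame this: real-time operation requires throughput at least equal to the inference sampling rate, so for any assumed timing model you can solve for the smallest batch size that keeps up. A sketch under a hypothetical linear timing model (the `overhead_s` and `per_sample_s` coefficients are illustrative assumptions, not benchmarks):

```python
def min_realtime_batch_size(
    inference_rate_hz, overhead_s=0.02, per_sample_s=0.002, max_batch=256
):
    """Smallest power-of-two batch size whose throughput keeps up with the
    stream, assuming a linear timing model t(B) = overhead + per_sample * B.
    Returns None if no batch size up to max_batch suffices."""
    batch_size = 1
    while batch_size <= max_batch:
        batch_time_s = overhead_s + per_sample_s * batch_size
        if batch_size / batch_time_s >= inference_rate_hz:
            return batch_size
        batch_size *= 2
    return None
```

Note that under this model throughput saturates at `1 / per_sample_s`, so beyond some inference sampling rate no batch size is sufficient on a single device; that saturation point is part of the tradeoff picture worth showing.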