ML4GW / aframev2

Detecting binary black hole mergers in LIGO with neural networks
MIT License

Whiten with background stream #154

Closed EthanMarx closed 3 months ago

EthanMarx commented 3 months ago

Implements whitening of injection data with the background stream. This is done by sending a state update of shape (2, num_ifos, time), where the first element is used to calculate the PSD that whitens the second element.
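A minimal sketch of that whitening scheme, assuming NumPy/SciPy rather than the actual aframe/ml4gw implementation (the function name and Welch settings here are illustrative, not the project's API):

```python
import numpy as np
from scipy.signal import welch


def whiten_with_background(update, sample_rate):
    """Whiten update[1] using a PSD estimated from update[0].

    update: array of shape (2, num_ifos, time). The first element is
    background used only for PSD estimation; the second is the stream
    (background or injections) that actually gets whitened.
    """
    psd_source, target = update
    whitened = np.empty_like(target)
    for i in range(len(target)):
        # estimate the PSD from the background stream for this ifo
        freqs, psd = welch(
            psd_source[i], fs=sample_rate, nperseg=int(sample_rate)
        )

        # whiten the target stream in the frequency domain
        fft = np.fft.rfft(target[i])
        fft_freqs = np.fft.rfftfreq(target.shape[-1], d=1 / sample_rate)
        interp_psd = np.interp(fft_freqs, freqs, psd)
        whitened[i] = np.fft.irfft(
            fft / np.sqrt(np.maximum(interp_psd, 1e-30)),
            n=target.shape[-1],
        )
    return whitened
```
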

In Alec's implementation of this, it was assumed that background and injection data were always being analyzed together. So, one could use the background data to calculate the PSD, whiten both the background and injection streams, and run inference all in one call to client.infer.

However, now that #152 is merged, it's not necessarily true that we are inferring on both injections and background for each segment, so we can't just do this all in one client.infer call.

So, the solution I came up with is to redundantly send 2 copies of the background data even when we call client.infer on the background stream alone. I don't think this will be a bottleneck, but it's worth keeping note of.

If the snapshotter could dynamically receive either 1 or 2 batched updates, we could get around this, although I'm not sure that's possible.
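The redundancy described above could be sketched like this (make_state_update is a hypothetical helper, not the actual aframe API; it just builds the (2, num_ifos, time) update from the two streams):

```python
import numpy as np


def make_state_update(psd_source, strain):
    """Stack the PSD-estimation stream on top of the stream to be
    whitened, producing the (2, num_ifos, time) state update."""
    return np.stack([psd_source, strain])


background = np.random.randn(2, 4096)              # (num_ifos, time)
injected = background + np.random.randn(2, 4096)   # background + waveforms

# injection inference: background estimates the PSD, injections get whitened
injection_update = make_state_update(background, injected)

# background-only inference: the same array is sent twice (the redundancy
# noted above), so the server-side logic stays identical in both cases
background_update = make_state_update(background, background)
```
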

EthanMarx commented 3 months ago

This also closes #153

wbenoit26 commented 3 months ago

Is it not the case that the injected segments are a subset of the background segments? In other words, whenever we infer on foreground, shouldn't there always be a corresponding background segment to use for the PSD?

wbenoit26 commented 3 months ago

Oh, no, I see - if we have more foreground time shifts than background time shifts, that won't be the case

EthanMarx commented 3 months ago

> Is it not the case that the injected segments are a subset of the background segments? In other words, whenever we infer on foreground, shouldn't there always be a corresponding background segment to use for the PSD?

The way our inference / waveform generation is currently set up is that we assume we are doing more background timeslides than injection timeslides.

If we infer on injections, we are also inferring on background but not vice versa

wbenoit26 commented 3 months ago

> The way our inference / waveform generation is currently set up is that we assume we are doing more background timeslides than injection timeslides.
>
> If we infer on injections, we are also inferring on background but not vice versa

Wait, in that case, why is this a problem?

EthanMarx commented 3 months ago

There's no "problem" - we just have to send redundant data to the server for background inference, which may or may not be a bottleneck (FWIW, I saw good GPU utilization during a test run).

Just wanted to make note of that

wbenoit26 commented 3 months ago

I was thinking that, if we're always inferring on background when we infer on injections, that we could then use Alec's approach. But I see that doing so would actually break things when there weren't any injections. Okay, I'm good now.

EthanMarx commented 3 months ago

> I was thinking that, if we're always inferring on background when we infer on injections, that we could then use Alec's approach. But I see that doing so would actually break things when there weren't any injections. Okay, I'm good now.

Yes exactly