wbenoit26 closed this 2 weeks ago
Great stuff, looks good. A couple of questions:
How big is the file? It might make sense to add a function that downloads it, or even just point people to the Zenodo link.
Would it be possible, down the line, to massage the data into a format that can be ingested by our own SV code (as opposed to Tris's)? More of a sanity-checking thing.
The file seems to be bigger on GitHub for some reason: `du -sh` says it's 36 MB, but GitHub says it's 78 MB. I like the idea of downloading it; I'll add that in.
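A download helper could be as simple as the sketch below. The URL and filename here are placeholders, not the actual Zenodo record; `fetch_data` is a hypothetical name for the function this PR would add.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Placeholder URL -- substitute the real Zenodo record link.
DATA_URL = "https://zenodo.org/record/XXXXXXX/files/data.hdf5"


def fetch_data(url: str = DATA_URL, dest: str = "data.hdf5") -> Path:
    """Download the data file if it isn't already cached locally."""
    path = Path(dest)
    if not path.exists():
        # Simple stdlib download; fine for a one-off ~40-80 MB file.
        urlretrieve(url, path)
    return path
```

Skipping the download when the file already exists keeps repeated plotting runs fast and offline-friendly.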
In theory, that should be possible. I tried it a while back and got turned around somewhere, but it's worth trying that again.
Right, I remember now: the dataset doesn't include the parameters from the rejected draws. They normalize by the total number of draws, rather than by the sum of the weights of the accepted and rejected draws, which I'm fairly certain we decided is incorrect, though I'd need to go back and check our work. To be fair, for our data, there's less than a 0.2% difference between the values.
So we can't quite get the data into the format we'd need, but we can get close enough to do it in a notebook.
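To make the normalization difference concrete, here's a toy sketch with made-up weights and an arbitrary acceptance rate (none of these numbers come from the actual dataset): one estimate divides the accepted weights by the total number of draws, the other by the summed weights of all draws.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy rejection-sampled population: a weight for every draw, plus a
# boolean mask marking which draws were accepted.
weights = rng.uniform(0.5, 1.5, size=100_000)
accepted = rng.random(100_000) < 0.9

n_total = len(weights)

# Normalization used in the released dataset: total number of draws.
vt_by_count = weights[accepted].sum() / n_total

# Normalization we believe is correct: sum of weights over ALL draws,
# accepted and rejected alike.
vt_by_weight = weights[accepted].sum() / weights.sum()

rel_diff = abs(vt_by_count - vt_by_weight) / vt_by_weight
```

When the mean weight is close to 1, the two normalizations nearly agree, which is consistent with the sub-0.2% difference we saw on our data; with more skewed weights the gap would grow.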
The plotting code now calculates SV vs. FAR for cWB, gstLAL, MBTA, PyCBC-BBH, and PyCBC-Broad and plots those values alongside the values for Aframe. This PR also switches the default FAR units on the plots to 1/year and reduces the default maximum FAR to 1/day.
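The unit switch and FAR cap amount to a small transformation on the FAR axis. This is an illustrative sketch, not the PR's actual plotting code; `far_grid` and its defaults are hypothetical.

```python
import numpy as np

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def far_grid(far_hz: np.ndarray, max_far_per_day: float = 1.0) -> np.ndarray:
    """Convert FARs from Hz to 1/year and drop values above the cap.

    The one-per-day default cap mirrors the new plot default.
    """
    far_per_year = far_hz * SECONDS_PER_YEAR
    max_far_per_year = max_far_per_day * 365.25
    return far_per_year[far_per_year <= max_far_per_year]
```

Capping at 1/day trims the high-FAR tail where the SV curves for the different pipelines aren't very informative.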
The same machinery can compute SV vs. p_astro just as well, so we can integrate those plots once our own calculation is ready.
I don't see a way around including the large data file we need for these calculations. I've stripped out the columns we don't use, and it's still pretty big.
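The column-stripping step is straightforward; here's a minimal sketch with invented column names (the real file's schema differs), just to show the shape of the operation.

```python
import pandas as pd

# Hypothetical schema standing in for the real data file.
df = pd.DataFrame({
    "far": [1e-8, 1e-5],
    "snr": [12.0, 8.5],
    "unused_metadata": ["a", "b"],
})

# Columns the SV-vs-FAR calculation actually reads (illustrative).
KEEP = ["far", "snr"]

slim = df[KEEP]
```

Dropping string-typed metadata columns usually buys the most size reduction, since numeric columns compress well in HDF5 already.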