Investigate mass distribution edge effects

chrismessenger commented 7 years ago

I am generating datasets for the metric distribution with the mass boundaries extended by an extra 0.5 Msun, for the training data only.

We need to see if that solves our problem.

mj-will commented 7 years ago

I've seen the datasets and shall run the CNN on them over the weekend.

mj-will commented 7 years ago

I've run the CNN with a few variations on the new datasets. Here's a selection of mass distributions for some of the better runs: http://www.astro.gla.ac.uk/~2136420/Mdist/

chrismessenger commented 7 years ago

Hi Michael,

I'm not quite sure what to take away from these results. Is there any overall improvement in the network, i.e., via ROC or efficiency curves?

One thing I'd like to modify in these grid plots is what we're plotting. We could plot the deviation from the total accuracy in units of standard deviation.

The correct and incorrect results are binomially distributed, with a probability p governing the fraction of correct results. If we take the overall accuracy to be our estimate for p, then the expected number of correct results in a box containing n signals is np and the error on it is sqrt(np(1-p)). E.g., if there are 100 signals in a mass box and the overall p = 0.98, then the error is 1.4. So if you actually count 99 correct, that's within 1.4 of the expected 98. Even getting 100 correct is OK since it's just outside the 1-sigma error. However, if you count 90 correct then that's ~6 sigma away from the expected.

So if you plotted how many sigma each result is away from expected this might help illuminate any spurious results.
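Something along these lines might do it. This is only a rough sketch of the calculation described above, assuming the per-box counts of correct and total signals are available as nested lists; the function names `sigma_deviation` and `sigma_grid` are made up for illustration and are not from the repository.

```python
import math


def sigma_deviation(n_correct, n_total, p):
    """Signed number of binomial standard deviations between the observed
    and expected number of correct results in one mass box."""
    expected = n_total * p
    sigma = math.sqrt(n_total * p * (1.0 - p))
    if sigma == 0.0:
        # Empty box, or p exactly 0 or 1: the deviation is undefined.
        return float("nan")
    return (n_correct - expected) / sigma


def sigma_grid(correct, total):
    """Apply sigma_deviation to every box of a 2D mass grid, taking the
    overall accuracy across all boxes as the estimate of p."""
    n_all = sum(sum(row) for row in total)
    p = sum(sum(row) for row in correct) / n_all
    return [
        [sigma_deviation(c, n, p) for c, n in zip(c_row, n_row)]
        for c_row, n_row in zip(correct, total)
    ]


# The worked example from the comment: 100 signals per box, overall p = 0.98.
for n_correct in (99, 100, 90):
    print(n_correct, round(sigma_deviation(n_correct, 100, 0.98), 1))
```

For the example numbers this gives about 0.7, 1.4 and -5.7 sigma for 99, 100 and 90 correct respectively. The output of `sigma_grid` could then be plotted in place of the raw accuracy in each box.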

hagabbar / matching_cnn_paper

Investigate mass distribution edge effects #23