Plot offline P vs online P for CFHTLS stage 2 subjects

cpadavis commented 10 years ago

Here are some plots sure to generate some discussion between us:

(I also did it for stage 1)

stage2: [plot: stage2]

stage1: [plot: stage1]

cpadavis commented 10 years ago

The difference between the two corners is whether I plot in log space or not.

These are with the training enabled. I should make some plots with that factor disabled.

I'm not sure what to make of the floating central cluster for the stage 2 probabilities. They seem to be points the online system is not totally set on, but the offline system is pretty confident about. I think the broad distribution at low P somewhat reflects the fact that the offline system doesn't clip at low P values.
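
To make the linear vs. log distinction concrete, here is a minimal sketch of the log-space transform, assuming the P values are available as plain lists. The `to_log10` helper, the `floor` value, and the example numbers are all hypothetical and are not taken from the SWAP code or data.

```python
import math


def to_log10(p_values, floor=1e-12):
    """Return log10 of each probability, flooring tiny values to avoid log(0).

    The floor is an illustrative choice, not a SWAP parameter.
    """
    return [math.log10(max(p, floor)) for p in p_values]


# Hypothetical (online P, offline P) pairs for a handful of subjects.
pairs = [(2e-7, 3e-5), (2e-7, 1e-9), (0.35, 0.92), (0.98, 0.99)]
online = [p for p, _ in pairs]
offline = [p for _, p in pairs]

log_online = to_log10(online)
log_offline = to_log10(offline)

# In linear space the first two subjects both sit at P ~ 0; in log space
# their offline values are separated by several orders of magnitude.
for lo, lf in zip(log_online, log_offline):
    print(f"log10 P_online = {lo:6.2f}   log10 P_offline = {lf:6.2f}")
```

In a plot like this, values that are clipped at a fixed low P collapse onto a single line in log space, while unclipped values spread out below it.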

drphilmarshall commented 10 years ago

NB. All the axes are showing the subject P values, here.

The difference between the stage 2 histograms is pretty striking. Offline SWAP rejects very few subjects, and instead has this pile-up at high P. That's going to lead to a lot of false positives, isn't it? I guess the ROC curve should reassure us, but still. It's going to be very interesting to inspect the high offline P, low online P images! I guess this is how the recovery of the false negatives works.

Interesting that at stage 1 there is not the same pile-up of rejections at the low P end as there is in online SWAP. This is a bit surprising: I have come to expect the crowd to be very good at rejecting non-lenses. Maybe it's just a recalibration, with the different offline PD and PL values? Actually it would be good to look at that, to see how the online and offline bureaux differ in PD and PL.

Did you check an offline catalog into projects/CFHTLS? We are set up to do our own "expert" inspection but we need a catalog for input. Thanks!

anupreeta27 commented 10 years ago

What's P_rejection for offline?

drphilmarshall commented 10 years ago

I don't think there is a rejection threshold in P in the offline version... To emulate retirement we'd have to rerun the offline analysis after every week's worth of classifications or so. This would be interesting to do with our current stage 1 data, but maybe not necessary now. For paper 1, we should focus on understanding stage 2, I think.
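
A rough sketch of what emulating retirement in batches might look like follows. It assumes the usual Bayes update of a subject's P from a classifier's PL = Pr("LENS" | lens) and PD = Pr("NOT" | not a lens), and it simplifies by using one fixed PL and PD for every classifier. The function names, the prior, and the rejection threshold are hypothetical values chosen to keep the example small; they are not the SWAP settings.

```python
def update_p(p, says_lens, pl, pd):
    """Bayes update of a subject's P after one classification.

    pl = Pr(classifier says LENS | subject is a lens)
    pd = Pr(classifier says NOT  | subject is not a lens)
    """
    if says_lens:
        like_lens, like_dud = pl, 1.0 - pd
    else:
        like_lens, like_dud = 1.0 - pl, pd
    numerator = like_lens * p
    return numerator / (numerator + like_dud * (1.0 - p))


def run_in_batches(batches, prior, p_reject, pl=0.8, pd=0.8):
    """Process classifications batch by batch, retiring subjects in between.

    batches: list of lists of (subject_id, says_lens) tuples.
    Returns the final P per subject and the set of retired subjects.
    """
    p = {}
    retired = set()
    for batch in batches:
        for subject, says_lens in batch:
            if subject in retired:
                continue  # ignore classifications of already-retired subjects
            p[subject] = update_p(p.get(subject, prior), says_lens, pl, pd)
        # Retirement is applied only between batches, as a periodic offline
        # run would do.
        retired |= {s for s, value in p.items() if value < p_reject}
    return p, retired


# Hypothetical two-week example.
week1 = [("A", False), ("A", False), ("B", True)]
week2 = [("A", True), ("B", True)]
p, retired = run_in_batches([week1, week2], prior=0.1, p_reject=0.01)
print(p, retired)
```

In this toy run, subject A drops below the threshold after the first batch, so its later "LENS" classification is ignored. Without the between-batch retirement step, that classification would pull its P back up.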

drphilmarshall commented 10 years ago

Quick update on this one: Anu's pulling out the P_offline values for the things we have already inspected, and also the offline sample ranked by P_offline. I am optimistic about a better correlation between P and expert grade! We'll see.

anupreeta27 commented 10 years ago

P_offline and P_online vs. expert grades: no significant difference found. The P values from both SWAP runs do not show any obvious correlation with the expert grades of the known + new lens candidates.

P_offline vs. P_online: subjects with lens candidates (known + new from online) get systematically higher P_offline values; roughly, these subjects have P > 0.4.
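
One way to put a number on "no obvious correlation" would be a rank correlation between P and expert grade. The sketch below is a plain-Python Spearman coefficient (Pearson correlation of average ranks). It is not the method used for the comparison above, and the P values and grades in the example are made up for illustration.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical P_offline values and expert grades for five candidates.
p_offline = [0.45, 0.99, 0.62, 0.80, 0.95]
grades = [1.0, 2.0, 2.0, 0.5, 3.0]
print(f"Spearman rho = {spearman(p_offline, grades):.3f}")
```

Running the same function on P_online and on P_offline against the same grades would give two directly comparable numbers.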

drphilmarshall / SpaceWarps

Plot offline P vs online P for CFHTLS stage 2 subjects #57