kundajelab / DMSO


TODO for paper #4: Re-train top-performing model with ENCODE weight inits #13

Open annashcherbina opened 7 years ago

annashcherbina commented 7 years ago

Compare to performance on random initializations.

annashcherbina commented 7 years ago

I have fixed the batch normalization bug, and I am now seeing an improvement with ENCODE weight initializations -- though it averages out to only a 1% improvement in auPRC. I don't think that tells the full story, though.

RecallAtFDR50

There are 4 tasks that have validation-split values of 0 with the base init but reach over 0.4 with the ENCODE init. The overall average improvement on the validation split is 0.26 (0.33 for random init, 0.59 for ENCODE init).

[image] https://user-images.githubusercontent.com/5261545/32575213-448befc8-c488-11e7-85b3-bcc78e3dbbd7.png
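For reference, recall at FDR 50 can be read off a precision-recall sweep: it is the highest recall achieved at any score threshold where the false discovery rate (FDR = 1 - precision) stays at or below 0.5. A minimal numpy sketch (the function name `recall_at_fdr` is my own for illustration, not from this repo's code):

```python
import numpy as np

def recall_at_fdr(y_true, y_score, fdr_cutoff=0.5):
    """Highest recall over thresholds where FDR = 1 - precision <= cutoff."""
    order = np.argsort(-np.asarray(y_score))   # sort by score, descending
    y = np.asarray(y_true)[order]
    tp = np.cumsum(y)                          # true positives at each threshold
    fp = np.cumsum(1 - y)                      # false positives at each threshold
    fdr = fp / (tp + fp)
    recall = tp / y.sum()
    ok = fdr <= fdr_cutoff
    return float(recall[ok].max()) if ok.any() else 0.0

# toy example: 4 positives, 4 negatives, scored in descending order
y_true  = [1, 1, 0, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
print(recall_at_fdr(y_true, y_score))  # 1.0: all positives recovered before FDR exceeds 0.5
```

A per-task value of 0 under this metric means no threshold ever reached precision >= 0.5, which is why the jump to 0.4+ on those 4 tasks is striking.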

auROC

mean improvement = 0.06 (0.79 for random init, 0.85 for ENCODE init): [image] https://user-images.githubusercontent.com/5261545/32575319-809ea8a2-c488-11e7-8dff-11541728ba08.png

auPRC

mean improvement = 0.01 (0.54 for random init, 0.55 for ENCODE init): [image] https://user-images.githubusercontent.com/5261545/32575372-aa35be62-c488-11e7-91b7-5fa714f501c0.png

All task values:

[image] https://user-images.githubusercontent.com/5261545/32575401-b9b09d26-c488-11e7-855d-976e364bbe26.png
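Note that the means above are averages of scalar per-task metrics, which is fine even though averaging the curves themselves isn't. A hedged numpy sketch of the rank-based (Mann-Whitney) form of auROC, averaged across tasks (toy data, names are illustrative, not from this repo):

```python
import numpy as np

def auroc(y_true, y_score):
    """Rank-based auROC: P(score of a random positive > score of a random
    negative), with ties counted as half."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    # explicit pairwise comparison; fine for small toy data
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# mean across tasks, as in the summary numbers above
tasks = [
    ([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3]),   # perfectly ranked task -> 1.0
    ([1, 0, 1, 0], [0.2, 0.9, 0.8, 0.1]),   # partially mis-ranked task -> 0.5
]
mean_auroc = np.mean([auroc(y, s) for y, s in tasks])  # 0.75
```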

I am puzzled why my validation performance is higher than my training performance across all metrics, even though the validation loss is considerably higher than the training loss (0.95 for training vs. 12.4 for validation).
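One thing worth checking on the loss/metric mismatch: cross-entropy is unbounded for confidently wrong predictions, so a handful of extreme mispredictions on the validation split can inflate the mean loss enormously while ranking metrics like auROC/auPRC stay high. A small numpy illustration (purely a toy, not this model's data):

```python
import numpy as np

def bce(y_true, p, eps=1e-7):
    """Mean binary cross-entropy, with probabilities clipped for stability."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    y = np.asarray(y_true, dtype=float)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

y      = np.array([1] * 10 + [0] * 10)
p_good = np.array([0.9] * 10 + [0.1] * 10)   # calibrated predictions

p_bad = p_good.copy()
p_bad[-1] = 1 - 1e-6                         # one confidently wrong negative

loss_good = bce(y, p_good)   # ~0.105
loss_bad  = bce(y, p_bad)    # ~0.79: a single example multiplies the mean loss ~7x
```

The single bad example barely moves threshold-free ranking metrics (19 of 20 predictions are unchanged) but dominates the mean loss, so a large train/validation loss gap alongside good validation metrics is not necessarily contradictory.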

annashcherbina commented 7 years ago

I know we discussed that it's not valid to average curves, so we can disregard the black central lines on these graphs. These are the ROC & PRC curves for ENCODE init vs random init (each colored line is the ROC / PRC for a given task):

[images: dmso encode_init train, dmso encode_init valid, dmso rand_init train, dmso rand_init valid]

akundaje commented 7 years ago

Nice. Those are big jumps where it matters the most, i.e., recall at specific FDRs.
