Alanine dipeptide multiple trajs poor scores / validation scores higher than training

rafwiewiora commented 5 years ago

Hi!

I'm trying to reproduce the alanine dipeptide notebooks. While the single trajectory one appears ok:

the 3 trajs (multiple files) one scores poorly and, strangely, has validation scores higher than the training scores. I observe this consistently over 50 attempts:

All packages are the newest releases, except tensorflow, which is 1.12. I remember @ppxasjsm saw the same thing when she took us through her attempts a while ago.

cc @jchodera

amardt commented 5 years ago

Hi,

thank you for your comment.

The reason for this behavior is that, until now, the data loader did not access frames from the trajectory randomly. Instead it loaded contiguous fragments of length batchsize (1st batch: frames 0 to batchsize, 2nd batch: frames batchsize to 2*batchsize, ...), and not every transition appears in each fragment. This results in lower scores, since we just take the mean over all batches. However, it does not mean that the trained model ends up performing worse. If one evaluated the model on the whole trajectory, the score would most likely be as high as in the single trajectory case.
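
To illustrate the effect described above, here is a minimal NumPy sketch. It is not the project's data loader: the toy two-state trajectory, the helper names (`make_two_state_traj`, `contiguous_batches`, `shuffled_batches`, `fraction_covering_both_states`), the batch size, and the lag are all assumptions chosen for illustration. It compares how many batches see both states when batches are contiguous fragments versus randomly drawn frames.

```python
import numpy as np

def make_two_state_traj(n_frames, switch_every):
    """Toy trajectory (assumption): state label flips every `switch_every` frames."""
    return (np.arange(n_frames) // switch_every) % 2

def contiguous_batches(n_pairs, batch_size):
    """Batches of consecutive frame indices: 0..B-1, B..2B-1, ..."""
    idx = np.arange(n_pairs)
    return [idx[i:i + batch_size] for i in range(0, n_pairs, batch_size)]

def shuffled_batches(n_pairs, batch_size, rng):
    """Batches of frame indices drawn from a random permutation."""
    idx = rng.permutation(n_pairs)
    return [idx[i:i + batch_size] for i in range(0, n_pairs, batch_size)]

def fraction_covering_both_states(batches, traj, lag):
    """Fraction of batches whose time-lagged pairs (x_t, x_{t+lag}) touch both states."""
    hits = [
        np.unique(np.concatenate([traj[b], traj[b + lag]])).size == 2
        for b in batches
    ]
    return float(np.mean(hits))

traj = make_two_state_traj(n_frames=10_000, switch_every=1_000)
lag = 1
n_pairs = len(traj) - lag  # number of valid (t, t + lag) pairs
rng = np.random.default_rng(0)

seq_frac = fraction_covering_both_states(contiguous_batches(n_pairs, 100), traj, lag)
rnd_frac = fraction_covering_both_states(shuffled_batches(n_pairs, 100, rng), traj, lag)
print(f"contiguous: {seq_frac:.2f}, shuffled: {rnd_frac:.2f}")
```

In this toy setup only a small fraction of the contiguous batches contains a transition, while nearly every shuffled batch samples both states, which is consistent with a per-batch score averaged over contiguous fragments coming out lower.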

The training score being lower than the validation score must also be a result of that. I guess the trajectory we assigned as validation data just has more transitions in each batch on average.

The new commit should fix this issue, since we have now included a data loader with random access. For us, the score now behaves as in the single trajectory case during training as well. c88ed3f0fa4da7149dd6526fe8ea997fad56b0fa
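
For readers who want a picture of what random access over several trajectory files can look like, the following is a hypothetical sketch and not the code from the commit above. The function names (`lagged_pair_index`, `random_batches`) and the toy data are assumptions. The idea is to enumerate every valid time-lagged pair per trajectory, so that no pair crosses a trajectory boundary, and then draw batches from a random permutation of that index.

```python
import numpy as np

def lagged_pair_index(traj_lengths, lag):
    """All (trajectory, frame) starts whose partner frame + lag lies in the same trajectory."""
    pairs = [(k, t) for k, n in enumerate(traj_lengths) for t in range(n - lag)]
    return np.array(pairs, dtype=int).reshape(-1, 2)

def random_batches(trajs, lag, batch_size, rng):
    """Yield (x_t, x_{t+lag}) batches in random order across all trajectories."""
    index = lagged_pair_index([len(x) for x in trajs], lag)
    order = rng.permutation(len(index))
    for start in range(0, len(order), batch_size):
        chosen = index[order[start:start + batch_size]]
        x_t = np.stack([trajs[k][t] for k, t in chosen])
        x_lag = np.stack([trajs[k][t + lag] for k, t in chosen])
        yield x_t, x_lag

# Toy data (assumption): each frame stores (trajectory id, frame number),
# which makes the pairing easy to inspect.
trajs = [np.column_stack([np.full(n, k), np.arange(n)]) for k, n in enumerate((50, 80, 30))]
lag = 5
rng = np.random.default_rng(0)
batches = list(random_batches(trajs, lag=lag, batch_size=16, rng=rng))

x_t, x_lag = batches[0]
print(x_t[:3], x_lag[:3], sep="\n")
```

Each epoch visits every valid pair once, and a single batch mixes frames from all three trajectories.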

If you have more questions/issues let us know!

Best

rafwiewiora commented 5 years ago

Fantastic, much appreciated!

markovmodel / deeptime, issue #29