soedinglab / CCMpred

Protein Residue-Residue Contacts from Correlated Mutations predicted quickly and accurately.
http://www.ncbi.nlm.nih.gov/pubmed/25064567
GNU Affero General Public License v3.0

Raw matrix and initialization #14

Closed: tomblaze closed this issue 5 years ago

tomblaze commented 5 years ago

I am interested in using the MRF parameters and wanted to do a sanity-check comparing values from your code and my own. I believe the raw files should have this information.

One thing I'm not understanding is how the initialization option works. My expectation would be that if I train a model, save the "raw" parameters, and then re-run the code with the "-i" option pointing at that same raw file, restarting the training should initialize the f(x) scores at the same value.

However, even on the example input this doesn't seem to be the case. See below; first I run:

$ ./bin/ccmpred -r example.raw example/1atzA.aln ignore.mat
 _____ _____ _____               _ 
|     |     |     |___ ___ ___ _| |
|   --|   --| | | | . |  _| -_| . |
|_____|_____|_|_|_|  _|_| |___|___|
                  |_|              

using CPU (1 thread(s))

Reweighted 3068 sequences with threshold 0.8 to Beff=1188.7 weight mean=0.387452, min=0.00364964, max=1

Will optimize 2482125 32-bit variables

iter    eval    f(x)        ║x║         ║g║         step
1       1       171122      13287.4     77283600    9.23e-05
2       1       166286      13285.7     57418396    7.9e-05
3       1       161400      13287.1     47159776    7.96e-05
4       1       156480      13293.5     38807448    7.96e-05
5       1       151540      13307.4     32292716    8.35e-05
... [omit for space] ...
34      1       92653       17526.3     466339.81   0.000295
35      1       92414.3     17429       401723.19   0.000392
36      2       92304.7     17349.2     357860.28   0.000339
37      3       92271.6     17316.8     326918.62   0.000141

Done with status code 0 - Success!

Final fx = 92237.171875

Writing raw output to example.raw
xnorm = 45.9138
Output can be found in ignore.mat

Then I run...

$ ./bin/ccmpred -i example.raw example/1atzA.aln ignored_again.mat
 _____ _____ _____               _ 
|     |     |     |___ ___ ___ _| |
|   --|   --| | | | . |  _| -_| . |
|_____|_____|_|_|_|  _|_| |___|___|
                  |_|              

using CPU (1 thread(s))

Reweighted 3068 sequences with threshold 0.8 to Beff=1188.7 weight mean=0.387452, min=0.00364964, max=1

Will optimize 2482125 32-bit variables

iter    eval    f(x)        ║x║         ║g║         step
1       1       217448      17261.1     89485856    8.01e-05
2       1       212143      17223.2     61828972    8.12e-05
3       1       206582      17167.9     61673300    9.52e-05
4       1       200697      17105.4     58197632    7.96e-05
5       1       194752      17041.1     48117040    6.94e-05
... [omit again] ...
38      1       95287.8     18296.1     1738278.2   0.000234
39      1       94772.1     18116.7     1783485.6   0.000287
40      2       94525.9     17983.2     1694389.1   0.000199
41      2       94393.8     17898.2     1464499.6   0.000128
42      1       94267.9     17809.4     1294781.1   0.000144
43      1       94152.6     17708.9     1365385.8   0.000175

Done with status code 0 - Success!

Final fx = 94085.921875

xnorm = 48.5448
Output can be found in ignored_again.mat

The ||x|| values do seem related between the two runs, but it seems odd to me that f(x) starts markedly higher than where the first round of optimization ended, and that optimization from this initialization ultimately takes longer and converges to a worse optimum. Perhaps I am misunderstanding something, but for now this seems counter-intuitive and I'd appreciate any clarification you could give.

Thanks!

grubermar commented 5 years ago

Hey @tomblaze ,

CCMpred uses the Conjugate Gradient method for optimization. It iteratively computes the partial derivatives with respect to each of the parameters and estimates the step width (i.e. how far it moves in the direction of the gradient) using an approximation instead of the exact width (which would require computing the Hessian). This approximation depends on the steps the algorithm has already taken and evolves continuously.
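
For illustration, here is a minimal, hypothetical sketch of a nonlinear Conjugate Gradient loop (this is not the actual libconjugrad code; f_df, prevalpha and gprevnorm are illustrative names). The point is that the trial step width for each line search is seeded from the step accepted in the previous iteration, so the optimizer's behaviour depends on history that is not stored in the parameter vector x itself:

/* Hypothetical sketch of a nonlinear CG loop; not CCMpred's actual code. */
#include <math.h>
#include <stdlib.h>

/* user-supplied objective: returns f(x) and writes the gradient into g */
extern double f_df(const double *x, double *g, int n);

void cg_minimize(double *x, int n, int maxiter) {
    double *g = malloc(n * sizeof(double));   /* current gradient                */
    double *s = malloc(n * sizeof(double));   /* search direction                */
    double prevalpha = 0.0, gprevnorm = 0.0;  /* state carried across iterations */

    double fx = f_df(x, g, n);
    for (int k = 0; k < maxiter; k++) {
        double gnorm = 0.0;                   /* squared gradient norm */
        for (int i = 0; i < n; i++) gnorm += g[i] * g[i];

        if (k == 0) {
            /* first iteration: steepest descent with a small default step */
            for (int i = 0; i < n; i++) s[i] = -g[i];
            prevalpha = 1.0 / sqrt(gnorm);
        } else {
            /* Fletcher-Reeves update of the search direction */
            double beta = gnorm / gprevnorm;
            for (int i = 0; i < n; i++) s[i] = -g[i] + beta * s[i];
            /* seed the line search from the previously accepted step */
            prevalpha *= gprevnorm / gnorm;
        }

        /* crude backtracking line search starting at prevalpha */
        double alpha = prevalpha;
        for (int ls = 0; ls < 30; ls++) {
            for (int i = 0; i < n; i++) x[i] += alpha * s[i];
            double fx_new = f_df(x, g, n);
            if (fx_new < fx) { fx = fx_new; break; }
            for (int i = 0; i < n; i++) x[i] -= alpha * s[i];
            alpha *= 0.5;
        }

        prevalpha = alpha;
        gprevnorm = gnorm;
    }
    free(g);
    free(s);
}

Restarting such a loop from a converged x while prevalpha and gprevnorm are reset to their defaults can produce the kind of overstepping described above.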

If you load the parameters from a previous run, the internal state of the Conjugate Gradient optimizer is not restored to its last state but initialized from scratch, and it has no access to the previous steps. Since you are already near a local optimum after the first run, the initial step parameters of the Conjugate Gradient are probably far too "big", which means that instead of getting closer to the previous optimum, it oversteps and slowly optimizes towards another optimum (or back to the first one; I don't remember whether the pseudo-likelihood function is unimodal).

This all means that, to continue from where you left off after the first run, you would also have to save all of the Conjugate Gradient's internal parameters and restore them at initialization. We implemented Conjugate Gradients in libconjugrad, so you may want to take a look at that library; look for variables like gprevnorm or prevalpha, which are carried over to the next iteration step in conjugrad.c.
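
To actually warm-start from where the previous run stopped, that history would have to be written out alongside the raw parameters and read back in. Below is a minimal sketch of the idea only; the struct and functions are hypothetical and do not exist in CCMpred, the field names merely mirror the variables mentioned above:

/* Hypothetical warm-start state; not part of CCMpred's actual raw format. */
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int    n;          /* number of optimized parameters            */
    double prevalpha;  /* step width accepted in the last iteration */
    double gprevnorm;  /* squared gradient norm of the last iterate */
    double *s;         /* last search direction, length n           */
} cg_state;

/* Write the optimizer state next to the raw MRF parameters. */
int cg_state_save(const cg_state *st, const char *path) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    fwrite(&st->n, sizeof(int), 1, f);
    fwrite(&st->prevalpha, sizeof(double), 1, f);
    fwrite(&st->gprevnorm, sizeof(double), 1, f);
    fwrite(st->s, sizeof(double), st->n, f);
    fclose(f);
    return 0;
}

/* Restore the state before the first iteration, so the line search is
 * seeded with the last accepted step instead of a fresh, oversized guess. */
int cg_state_load(cg_state *st, const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    if (fread(&st->n, sizeof(int), 1, f) != 1) { fclose(f); return -1; }
    fread(&st->prevalpha, sizeof(double), 1, f);
    fread(&st->gprevnorm, sizeof(double), 1, f);
    st->s = malloc(st->n * sizeof(double));
    fread(st->s, sizeof(double), st->n, f);
    fclose(f);
    return 0;
}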

I hope this helps.

Greets, Markus

croth1 commented 5 years ago

@grubermar @tomblaze, I think there is a bug in the read_raw method. I submitted a PR, but won't have the time to verify the fix today.

tomblaze commented 5 years ago

Thanks. I imagine you'll want to fully double-check the change, and you can take your time doing that, but making the change locally does make the output much saner on re-initialization.

First pass:

iter    eval    f(x)        ║x║         ║g║         step
1       1       171122      13287.4     77283600    9.23e-05
2       1       166286      13285.7     57418396    7.9e-05
3       1       161400      13287.1     47159776    7.96e-05
...
36      2       92304.7     17349.2     357860.28   0.000339
37      3       92271.6     17316.8     326918.62   0.000141

Done with status code 0 - Success!

Second pass:

iter    eval    f(x)        ║x║         ║g║         step
1       2       92130.3     17267.3     306641.62   0.000922
2       1       92028.7     17243.6     298147.44   0.000564
3       1       91928.7     17213.1     234786.17   0.000518
4       1       91840.1     17174.3     256977.78   0.000635
5       2       91786.8     17145.4     76054.055   0.000348

Done with status code 0 - Success!

croth1 commented 5 years ago

@tomblaze, I just merged the fix into master. Thanks for pointing it out 👍