debbiemarkslab / EVcouplings

Evolutionary couplings from protein and RNA sequence alignments
http://evcouplings.org

Add MI as option for couplings stage #182

Open kpgbrock opened 6 years ago

kpgbrock commented 6 years ago

Mutual information (MI) would be useful to have as a couplings option, in addition to mean-field/plm.

thomashopf commented 6 years ago

Yep would be a nice addition! 👍

Some first thoughts regarding an implementation:

1) The CouplingsModel class already has functionality to compute raw and APC-corrected MI scores, which might or might not be worth reusing (depends on whatever we decide for point 3).
2) In many respects, this would be exactly the same as the mean-field protocol and the first part of the fit() method of the MeanFieldDCA class, so it is worth thinking about synergies.
3) Needs some thought on how to handle the model file in this case (e.g., none at all, meaning no mutate stage; or just create a single-column model by default).
4) In that context, it would also be worth addressing #58 (choose a generic column name like "score" instead of "cn" for whatever is used downstream in the pipeline).
5) If one wanted to hack this together quick & dirty, one could use the mean-field protocol and make the MI column the score column used downstream. Right now the mean-field protocol only outputs non-APC-corrected MI scores, and it will also be important to consider whether one uses the raw or the pseudo-counted frequencies.
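To make the raw vs. APC-corrected and raw vs. pseudo-counted distinctions in points 1 and 5 concrete, here is a minimal NumPy sketch. It is not the EVcouplings implementation: the function names (`frequencies`, `mutual_information`, `apc`), the integer-encoded alignment, and the uniform pseudocount mixing are all assumptions made for illustration.

```python
import numpy as np


def frequencies(msa, num_symbols, pseudo_count=0.0):
    """Single-site and pair frequencies from an integer-encoded alignment.

    msa: (N, L) array with entries in 0..num_symbols-1 (hypothetical encoding).
    pseudo_count: mixing weight towards the uniform distribution (assumption).
    Returns fi with shape (L, q) and fij with shape (L, L, q, q).
    """
    n_seqs, _ = msa.shape
    one_hot = np.eye(num_symbols)[msa]  # (N, L, q)
    fi = one_hot.mean(axis=0)
    fij = np.einsum("nia,njb->ijab", one_hot, one_hot) / n_seqs
    # Uniform pseudocount; keeps marginals of fij consistent with fi for i != j
    fi = (1 - pseudo_count) * fi + pseudo_count / num_symbols
    fij = (1 - pseudo_count) * fij + pseudo_count / num_symbols ** 2
    return fi, fij


def mutual_information(fi, fij):
    """Raw MI (in nats) for all column pairs; the diagonal is set to 0."""
    outer = fi[:, None, :, None] * fi[None, :, None, :]
    with np.errstate(divide="ignore", invalid="ignore"):
        # Terms with fij == 0 contribute nothing to the sum
        terms = np.where(fij > 0, fij * np.log(fij / outer), 0.0)
    mi = terms.sum(axis=(2, 3))
    np.fill_diagonal(mi, 0.0)
    return mi


def apc(scores):
    """Average product correction; assumes a symmetric matrix with zero
    diagonal and a non-zero overall mean."""
    n = scores.shape[0]
    row_mean = scores.sum(axis=1) / (n - 1)
    total_mean = scores.sum() / (n * (n - 1))
    corrected = scores - np.outer(row_mean, row_mean) / total_mean
    np.fill_diagonal(corrected, 0.0)
    return corrected


# Toy alignment: columns 0 and 1 are identical, column 2 is independent
msa = np.array([[0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]])

fi_raw, fij_raw = frequencies(msa, num_symbols=2)
mi_raw = mutual_information(fi_raw, fij_raw)
mi_apc = apc(mi_raw)

fi_pc, fij_pc = frequencies(msa, num_symbols=2, pseudo_count=0.5)
mi_pc = mutual_information(fi_pc, fij_pc)
```

On this toy alignment the pseudo-counted MI for the coupled pair is lower than the raw MI, which is why the choice of frequencies matters for whichever column ends up as the downstream score.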