eppic-team / eppic

:white_check_mark::x:Evolutionary protein-protein interface classifier
http://eppic-web.org

Improve the core-surface scores #1

Closed josemduarte closed 9 years ago

josemduarte commented 10 years ago

Issue by josemduarte, Tuesday May 06, 2014 at 12:20 GMT. Originally opened as https://github.com/eppic-team/eppic-cli/issues/4


Some ideas that could improve the core vs surface scores:

josemduarte commented 10 years ago

Comment by josemduarte Wednesday Jun 18, 2014 at 12:54 GMT


See case 3n9z, interface 1. In eppic 2.0.5 the core-surface score for the first member of the interface (chain C, a very small protein, almost a peptide) is 25.03, which is the highest score in the database to date. The cause is that the standard deviation is extremely small, very close to 0. That in turn happens because there are only 9 free surface residues to sample from and the sample size is 7 (the number of residues in the core); all but 2 of the 9 free residues have 0 entropy, so the std dev drops to such a small value. In 2.0.5 we require only 20% more residues in the surface than the sample size (see https://github.com/eppic-team/eppic/blob/v2.0.5/src/crk/predictors/EvolInterfZMemberPredictor.java). We should push this up to at least 50%, if not 100% or more. In any case this illustrates one of the issues with the current core-surface approach: with a straight z-score this example would be less problematic.
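To make the effect concrete, here is a minimal Python sketch, not the eppic code. It assumes the score is built from the spread of mean entropies over surface samples of core size. The entropy values 0.5 and 0.8, the function names and the `>=` comparison are all made up for illustration, and the samples are enumerated exhaustively instead of drawn at random.

```python
from itertools import combinations
from statistics import mean, pstdev


def has_enough_surface(n_surface, sample_size, min_excess=0.2):
    """Hypothetical check: the surface must hold at least
    (1 + min_excess) * sample_size residues before we sample from it."""
    return n_surface >= sample_size * (1.0 + min_excess)


def sample_mean_distribution(entropies, sample_size):
    """Mean entropy of every possible sample of the given size
    (exhaustive enumeration, used here in place of random sampling)."""
    return [mean(s) for s in combinations(entropies, sample_size)]


# 9 free surface residues, all but 2 with zero entropy.
# The two non-zero values are invented for this example.
surface_entropies = [0.0] * 7 + [0.5, 0.8]
core_size = 7  # sample size = number of core residues

means = sample_mean_distribution(surface_entropies, core_size)
print("number of samples:", len(means))
print("std dev of single-residue entropies:", pstdev(surface_entropies))
print("std dev of sample means:", pstdev(means))

# How the 3n9z-like case fares under different surface-size requirements.
for excess in (0.2, 0.5, 1.0):
    ok = has_enough_surface(len(surface_entropies), core_size, excess)
    print(f"require {excess:.0%} more surface residues -> scored: {ok}")
```

With 7 of 9 residues in every sample, the samples overlap almost completely, so their means barely differ and the denominator of the score collapses. Under these assumptions a 20% requirement lets the case through, while 50% or 100% would reject it.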

gcapitani commented 10 years ago

+1 to requiring 100% more residues in the surface than the sample size

josemduarte commented 9 years ago

Straight z-scores are now implemented (in 3.0) and the benchmark gives similar results to the classic approach.