josemduarte closed 9 years ago
Comment by josemduarte Wednesday Jun 18, 2014 at 12:54 GMT
See case 3n9z, interface 1. In eppic 2.0.5 the core-surface score for the first member of the interface (chain C, which is a very small protein, almost a peptide) is 25.03, which is the highest score in the database to date. The cause is that the standard deviation is extremely small, very close to 0. That happens because there are only 9 free surface residues to sample from and the sample size is 7 (the number of residues in the core); all but 2 of the 9 free residues have 0 entropy, so the std dev of the samples collapses to a very small value. In 2.0.5 we only require 20% more residues in the surface than the sample size (see https://github.com/eppic-team/eppic/blob/v2.0.5/src/crk/predictors/EvolInterfZMemberPredictor.java). We should push this up to at least 50%, if not 100% or more. In any case this illustrates one of the issues with the current core-surface approach: with a straight z-score this example would be less problematic.
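To make the failure mode concrete, here is a minimal sketch in Python (not the actual EPPIC Java code). It assumes the score is the core mean entropy compared against the means of surface subsets of the same size as the core, divided by the std dev of those subset means. The function name `core_surface_score`, the parameter `min_surface_ratio` and all entropy values are hypothetical. The subsets are enumerated exhaustively rather than sampled at random so that the result is reproducible.

```python
from itertools import combinations
from statistics import mean, pstdev


def core_surface_score(core, surface, min_surface_ratio=1.2):
    """Score the core mean entropy against surface subsets of core size.

    Returns None when the surface is too small relative to the core
    (fewer than min_surface_ratio * len(core) residues) or when the
    subset means have zero spread, since the score is undefined then.
    """
    k = len(core)
    if len(surface) < min_surface_ratio * k:
        return None
    # Exhaustive enumeration of all subsets of size k (an assumption made
    # for reproducibility; a real implementation may sample randomly).
    subset_means = [mean(s) for s in combinations(surface, k)]
    sd = pstdev(subset_means)
    if sd == 0:
        return None
    return (mean(core) - mean(subset_means)) / sd


# Hypothetical entropies mimicking the situation described above:
# 9 free surface residues, all but 2 with zero entropy, and 7 core residues.
surface = [0.0] * 7 + [0.05, 0.10]
core = [0.4] * 7

score_20pct = core_surface_score(core, surface, min_surface_ratio=1.2)
score_50pct = core_surface_score(core, surface, min_surface_ratio=1.5)
score_100pct = core_surface_score(core, surface, min_surface_ratio=2.0)
print(score_20pct, score_50pct, score_100pct)
```

With the 20% margin the case is accepted (9 >= 1.2 * 7 = 8.4) and the tiny spread of the subset means yields a very large positive score. With a 50% or 100% margin the case is rejected (9 < 10.5 and 9 < 14), so no score is reported.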
+1 to requiring 100% more residues in the surface than the sample size.
Straight z-scores are now implemented (in 3.0) and the benchmark gives similar results to the classic approach.
Issue by josemduarte Tuesday May 06, 2014 at 12:20 GMT Originally opened as https://github.com/eppic-team/eppic-cli/issues/4
Some ideas that could improve the core vs surface scores: