HMMER weights IUPAC degenerate emissions using the reciprocal of the perplexity of the underlying match state (see the `esl_abc_FExpectScore` function in the HMMER3 source).
The effect is that the "score" for such an emission is the expectation of what you'd get if you randomized the X's using the underlying emission distribution. This is much to the chagrin of Roger Sewell, who argued that they should be treated as missing data; Sean's counterargument is that doing so would reward their alignment to the model. This is an old argument.
Practically (as noted by @jordisr), this affects <1% of sequences, but for full HMMER compatibility we ought to include it.
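
A minimal sketch of the idea, not HMMER's actual code: the score of a degenerate symbol is the weighted mean of the canonical-residue scores over the residues it can stand for. The toy DNA alphabet, the `DEGEN` map, and the names `expected_score` and `perplexity` are all hypothetical, and which distribution HMMER really passes as the weights should be checked against the source. When the weights are the match emissions and the scores are log emission probabilities, the fully degenerate symbol gets probability 1/perplexity, as described above.

```python
import math

# Hypothetical toy degeneracy map (DNA); HMMER/Easel use their own alphabet structures.
DEGEN = {"R": "AG", "Y": "CT", "N": "ACGT"}


def expected_score(symbol, scores, weights):
    """Weighted mean of `scores` over the canonical residues `symbol` can stand for."""
    residues = DEGEN.get(symbol, symbol)  # a canonical residue stands only for itself
    num = sum(weights[r] * scores[r] for r in residues)
    den = sum(weights[r] for r in residues)
    return num / den


def perplexity(dist):
    """exp(Shannon entropy in nats) of a probability distribution."""
    return math.exp(-sum(p * math.log(p) for p in dist.values() if p > 0))


# Assumed example emission distribution for a single match state.
emit = {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}
log_emit = {r: math.log(p) for r, p in emit.items()}

# Expected log-probability of the fully degenerate symbol under the emissions.
score_n = expected_score("N", log_emit, emit)
```

Here `math.exp(score_n)` equals `1 / perplexity(emit)`, while a canonical residue such as `"A"` keeps its own score unchanged.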