shaunpwilkinson / aphid

Analysis with Profile Hidden Markov Models

Posterior method not currently available for profile HMMs #3

Open leonjessen opened 5 years ago

leonjessen commented 5 years ago

Hi Shaun,

I've been looking through your code, and it seems that posterior decoding for profile HMMs is currently disabled for all input. Will this be rectified any time soon?

> Viterbi(x = my_phmm_trained, y = test_seq_bin)
Optimal path with length 15 and score 8.542646
> forward(x = my_phmm_trained, y = test_seq_bin)
Log odds score:  8.867508
> backward(x = my_phmm_trained, y = test_seq_bin)
Log odds score:  8.867508
> posterior(x = my_phmm_trained, y = test_seq_bin)
Error in posterior.PHMM(x = my_phmm_trained, y = test_seq_bin) : 
  Posterior method not currently available for profile HMMs

shaunpwilkinson commented 5 years ago

Hi @leonjessen, I have written a method for posterior decoding for profile HMMs, but I'm a bit unsure which output format users would find most useful. Keen to hear if you have a preference. The options as I see them are:

a) the full n x m x 3 array of posterior probabilities (where n is the sequence length, m is the number of modules in the model, and the 3 matrices are for the match, delete and insert states)
b) just the n x m matrix for the match state
c) the function also runs the Viterbi algorithm to find the optimal alignment and returns the posterior probabilities along the optimal path of the sequence through the model
d) some combination of the above as a special object type
e) other (open to ideas!)
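
To make the shapes in options (a) to (c) concrete, here is a minimal base R sketch. It does not call aphid. The array holds placeholder values rather than real posterior probabilities, and the dimension order, the state labels, and the `path` data frame are all assumptions made for illustration, not the package's actual output format.

```r
# Illustration only: mock data, NOT output from aphid::posterior().
# The dimension order (position x module x state) is an assumption.
set.seed(1)
n <- 5                                   # sequence length
m <- 4                                   # number of modules in the model
states <- c("match", "delete", "insert") # assumed state order

# Option (a): full n x m x 3 array (placeholder values in [0, 1]).
post <- array(runif(n * m * 3), dim = c(n, m, 3),
              dimnames = list(NULL, NULL, states))

# Option (b): just the n x m matrix for the match state.
post_match <- post[, , "match"]

# Option (c): values along one path through the model.
# 'path' is a hypothetical stand-in for a Viterbi path; it lists,
# for each sequence position, the module and state visited.
path <- data.frame(
  position = c(1, 2, 3, 4, 5),
  module   = c(1, 2, 2, 3, 4),
  state    = c("match", "match", "insert", "match", "match")
)

# Index the 3-D array with a three-column matrix: one row per path step.
idx <- cbind(path$position, path$module, match(path$state, states))
path_post <- post[idx]

dim(post)          # 5 4 3
dim(post_match)    # 5 4
length(path_post)  # 5
```

Because option (b) and option (c) can both be derived from the full array by indexing, returning option (a) leaves the other two within reach for the user.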

I've enabled option (a) in the development version now, and will push the changes to CRAN once finalized. Cheers, Shaun

leonjessen commented 5 years ago

Hi @shaunpwilkinson, option (a) looks nice, so I'll give it a spin when you're ready 👍 🙂 Cheers and bw. Leon