davidbenjamin closed this issue 8 years ago
Note: this will close issue #371 (fast PCA) because probabilistic PCA models are best learned via a fast EM algorithm.
Note: this should also address issue #381 (deal with copy number events in the panel of normals / cohort), because if copy ratio / copy number is part of the generative model, then learning the model's parameters will take those events into account.
Closed by PR #416. Now there are several tickets for the implementation.
Generative models with the raw data as an observed node have many advantages over approaches in which the raw data is pre-processed. We would like a generative model relating observed read counts to hidden copy number state.
This will be easiest to try first on our germline code, for two significant reasons:
If this succeeds, it would then not be so hard to surmount those obstacles for somatic calling, but there is no good reason not to do the simpler thing first.
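To make the idea of a generative model from hidden copy number to observed read counts concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration and not the model proposed in this issue: the Poisson likelihood, the per-target `bias` factor, the diploid baseline of 2, the `floor` term for copy number 0, and the function names are all hypothetical.

```python
import math


def poisson_log_pmf(k, mean):
    """Log of the Poisson probability of observing count k given the mean."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def copy_number_posterior(count, depth, bias, max_copy_number=4,
                          prior=None, floor=0.01):
    """Posterior over hidden copy number states for a single target.

    Assumed toy model (hypothetical, for illustration only):
        count ~ Poisson(depth * bias * c / 2)
    where c is the hidden copy number and 2 is the diploid baseline.
    `floor` keeps the mean positive when c = 0.
    """
    states = range(max_copy_number + 1)
    if prior is None:
        # Flat prior over copy number states.
        prior = [1.0 / len(states)] * len(states)

    # Unnormalized log posterior: log prior + log likelihood of the raw count.
    log_joint = [
        math.log(prior[c])
        + poisson_log_pmf(count, depth * bias * max(c, floor) / 2.0)
        for c in states
    ]

    # Normalize with the log-sum-exp trick for numerical stability.
    largest = max(log_joint)
    weights = [math.exp(x - largest) for x in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    posterior = copy_number_posterior(count=100, depth=100, bias=1.0)
    best = max(range(len(posterior)), key=posterior.__getitem__)
    print("most probable copy number:", best)
```

The raw count enters the likelihood directly, so nothing is normalized or denoised beforehand. In a fuller model, `depth` and `bias` would themselves be parameters learned from the data (for example by EM) rather than fixed inputs.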