Closed franknoe closed 5 years ago
The sparsity pattern of P is unreliable because it depends on the maximum number of iterations that the ML estimator is instructed to perform. Does the sparsity pattern of matrices generated by the sampler depend on the initial guess of P? That is, if some connectivity is lost in one sample, is it also lost in all subsequent samples of P?
fixed by #91
The T-matrix sampler checks whether the sparsity patterns of the initial transition matrix and the count matrix are consistent. One would think that this is always the case when we estimate the transition matrix ourselves (i.e. when the input parameter for the initial T-matrix is 'None'). However, for a count matrix with very small fractional counts (which do occur, e.g. in HMM estimation), the estimated transition matrix can end up with zeros at the corresponding elements (probably underflow).
I can handle and avoid this problem in the HMM code, but I think it should be handled in msmtools because it's an internal inconsistency.
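To illustrate how such a mismatch could arise, here is a minimal sketch. It is not the msmtools estimator: the count matrix is made up, and plain row normalization stands in for the ML estimate. The fractional count is chosen to be subnormal on purpose, so that dividing it by the row sum underflows to exactly zero.

```python
import numpy as np

# Hypothetical count matrix: the two states are linked only by a tiny
# fractional count (subnormal in double precision).
C = np.array([[100.0, 1e-323],
              [1e-323, 100.0]])

# Plain row normalization, used here as a stand-in for the ML estimate.
T = C / C.sum(axis=1, keepdims=True)

pattern_C = C > 0  # the count matrix still has a nonzero off-diagonal entry
pattern_T = T > 0  # the quotient underflowed, so that entry is now zero

print("C[0, 1] > 0:", pattern_C[0, 1])
print("T[0, 1] > 0:", pattern_T[0, 1])
```

Whether the failure reported here goes through the same mechanism is not confirmed; the sketch only shows that the two sparsity patterns can disagree even when T is computed directly from C.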
I suggest the following solution: only check for consistency between C and T if T is an input. When T is estimated here, use the sparsity pattern of T. Check for connectivity before starting the sampler, and only raise an exception if the connectivity is lost.
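The suggested logic could be sketched roughly as follows. This is an illustration under assumptions, not the msmtools implementation: `check_sampler_input` is a hypothetical helper, "consistent" is taken to mean identical nonzero patterns, row normalization stands in for the internal estimator, and every row of C is assumed to have a positive sum.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def check_sampler_input(C, T_init=None):
    """Hypothetical pre-sampling check; returns the starting T."""
    C = np.asarray(C, dtype=float)
    if T_init is not None:
        # T was supplied by the caller: verify it against C.
        T = np.asarray(T_init, dtype=float)
        if not np.array_equal(T > 0, C > 0):
            raise ValueError("sparsity patterns of T_init and C differ")
    else:
        # T is estimated here: trust its own pattern, skip the comparison.
        # (Row normalization is a stand-in for the real estimator.)
        T = C / C.sum(axis=1, keepdims=True)

    # Only fail if the pattern of T is no longer strongly connected.
    graph = csr_matrix((T > 0).astype(int))
    n_components, _ = connected_components(
        graph, directed=True, connection="strong"
    )
    if n_components > 1:
        raise ValueError("transition matrix is not connected")
    return T


# A well-behaved count matrix passes the check.
T = check_sampler_input(np.array([[5.0, 1.0], [2.0, 4.0]]))
```

With this layout, an entry of T that vanishes during estimation triggers an error only when it actually disconnects the state space.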
Example: the following input
will fail with
because the transition matrix estimated from C has zeros at elements [1,4] and [4,1] (probably from an underflow).