Open willtownes opened 3 years ago
Hi @willtownes,
thanks for your interest in the method.
In general, we recommend using the Gaussian likelihood in combination with suitable preprocessing (see also some guidelines/recommendations here) to take data characteristics into account; this provides a good tradeoff between scalability and performance. We added a small-scale simulation example as a simple illustration of the Poisson likelihood here.
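As a rough illustration of what "suitable preprocessing" for a Gaussian likelihood could look like on count data, here is a minimal sketch that library-size normalizes each spot and applies a log1p transform. This is an assumption about one common choice, not necessarily the preprocessing recommended in the linked guidelines; the function name `log_normalize` and the `target_sum` value are hypothetical.

```python
import numpy as np


def log_normalize(counts, target_sum=1e4):
    """Library-size normalize each row (spot), then apply log1p.

    Hypothetical helper for illustration only; `target_sum` is an
    assumed scaling constant, not a value taken from the thread.
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=1, keepdims=True)
    # Avoid division by zero for spots with no counts at all.
    lib_size[lib_size == 0] = 1.0
    return np.log1p(counts / lib_size * target_sum)


# Toy spots-by-genes count matrix (made-up numbers).
counts = np.array([[0, 2, 8], [5, 5, 0], [0, 0, 0]])
normalized = log_normalize(counts)
```

The transformed matrix is continuous and finite everywhere, which is the kind of input a Gaussian likelihood expects; raw counts would instead be passed unchanged when using a Poisson likelihood.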
The numerical problems that result in the nan values seem to be an issue in the underlying MOFA model on this data set. We will take a look at this and let you know once it is fixed. Thanks for reporting the bug!
Hi @willtownes,
just a quick update: We fixed the numerical issues that you encountered with the Poisson likelihood. If you install mofapy2 from the dev branch (pip install git+https://github.com/bioFAM/mofapy2@dev), the error above should be fixed. We will merge this into the master branch and release it on PyPI in the coming versions. However, as mentioned above, Gaussian likelihood + a suitable pre-processing might still be a better choice for the spatial transcriptomics data.
OK, I have tested this, and while it no longer has the numerical divergence error early in training, it shows some weird behavior and never converges. Below is a plot of the ELBO with the horizontal axis representing the number of epochs. I'm not sure why the ELBO periodically drops precipitously.
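For anyone who wants to check their own runs for the same symptoms, here is a minimal sketch that scans a recorded ELBO trace for the first non-finite value and for sharp drops. It assumes the per-iteration ELBO values have already been collected into a plain Python list; the helper name `diagnose_elbo`, the 5% drop threshold, and the example numbers are all hypothetical and not part of mofapy2.

```python
import math


def diagnose_elbo(elbo, drop_frac=0.05):
    """Return (first_bad, drops) for a list of ELBO values.

    first_bad: 0-based index of the first nan/inf value, or None.
    drops: 0-based indices where the ELBO fell by more than
           drop_frac * |previous value| relative to the previous step.
    """
    first_bad = None
    drops = []
    for i, value in enumerate(elbo):
        if not math.isfinite(value):
            if first_bad is None:
                first_bad = i
            continue
        # Only compare against a finite previous value.
        if i > 0 and math.isfinite(elbo[i - 1]):
            prev = elbo[i - 1]
            if prev - value > drop_frac * abs(prev):
                drops.append(i)
    return first_bad, drops


# Made-up trace: steady improvement, one sharp drop, then a nan.
trace = [-1000.0, -900.0, -850.0, -1200.0, -840.0, float("nan"), -830.0]
first_bad, drops = diagnose_elbo(trace)
print(first_bad, drops)
```

Reporting these indices alongside the plot makes it easier to see whether the drops line up with a fixed schedule in training, for example a periodic hyperparameter optimisation step.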
Hi Will,
this looks strange, we will have a look. It seems to be specific to the combination of sparse GPs with a Poisson likelihood. For now, we'd recommend using either a Gaussian likelihood or a Poisson likelihood in conjunction with a full GP model (setting sparseGP = False). I will update here once we have fixed the problem above.
Hi, congratulations on this awesome method! I am interested in trying MEFISTO with the Poisson likelihood. I have been following the tutorial for the Visium brain data, but it seems to run into numerical problems after the first few iterations. Here is the code I have been using:
At iteration 12 the ELBO becomes nan and after iteration 19 it says "Optimising sigma node..." then raises an exception:
UnboundLocalError: local variable 'best_lidx' referenced before assignment