First, thank you for the feedback! Based on your error message, I can tell the problem happened when sampling a probability from a Dirichlet prior using scipy.stats.dirichlet. This probability value represents whether a gene is repressed-only, a feature we added later on. In the latest update, I added value clipping to prevent zeros, so if you download the source code directly using git clone, the error should disappear. However, if you installed VeloVAE using pip, the update has not been included yet; we will upload a new version. If you keep encountering this problem with the latest code on GitHub, please let me know.
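For illustration only, here is one plausible reading of the failure mode (a minimal sketch, not the actual VeloVAE code): scipy.stats.dirichlet rejects concentration parameters that are zero or negative, so clipping them to a small positive floor before sampling avoids the error.

import numpy as np
from scipy.stats import dirichlet

# Sketch of the failure mode: a zero concentration parameter makes
# dirichlet.rvs raise a ValueError, so clip to a small positive floor.
alpha_raw = np.array([0.0, 0.3, 0.7])
alpha = np.clip(alpha_raw, 1e-10, None)
p = dirichlet.rvs(alpha, size=1)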
At least in my own experience, gene filtering is an important step. Due to the limited quality of single-cell RNA data, you are likely to get many genes with very low counts or not enough variation; I have seen genes with fewer than 10 read counts across all cells. VeloVAE initializes the rate parameters by fitting a steady-state model to the data for each gene, so these trivial genes can produce very poor initializations and degrade the model's performance.
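For example, either of the preprocessing calls mentioned later in this thread removes such genes before training (the threshold below is illustrative, not a required value):

import scvelo as scv
import velovae as vv

# Option 1: scVelo's dispersion-based gene filter
scv.pp.filter_genes_dispersion(adata)
# Option 2: VeloVAE's built-in preprocessing, keeping the top 1000 genes
vv.preprocess(adata, n_gene=1000)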
Thank you!
I wanted to extend my heartfelt gratitude for your prompt and thorough response. Following your guidance, I reinstalled VeloVAE from GitHub and was delighted to find that the previous error messages disappeared.
However, I've encountered a new challenge that I'm hoping you might be able to assist with. Currently, when I inspect the results in adata.layers['vae_velocity'], I've noticed that all the values are nan. I'm a bit stumped on how to proceed from here and would greatly appreciate any suggestions you might have.
Thank you so much for your time and help! Looking forward to your insights.
Best regards!
Hi there. The problem of getting nan values is most likely caused by unstable training. Although I cannot give you a definitive answer about why this happened, I have encountered this problem when the rate parameters of the ODE exploded during training.
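Before retraining, it may help to check whether the NaNs affect all genes or only a subset; a minimal check, assuming adata.layers['vae_velocity'] is a dense numpy array:

import numpy as np

v = adata.layers['vae_velocity']
# Fraction of NaN entries, and number of genes with at least one NaN
print('NaN fraction:', np.isnan(v).mean())
print('Genes with any NaN:', int(np.isnan(v).any(axis=0).sum()), '/', v.shape[1])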
I've come up with two things you can try:
1. Set the full_vb argument to True when you create a VAE model: vae = vv.VAE(adata, tmax=20, dim_z=5, full_vb=True, device='cuda:0'). This adds some regularization to the parameters by assuming they follow a prior distribution. There is a default prior we use, but you can also manually specify a prior distribution of the rate parameters by adding a rate_prior argument:
vae = vv.VAE(adata,
             tmax=20,
             dim_z=5,
             full_vb=True,
             device='cuda:0',
             rate_prior={
                 # (mean, std) of the Gaussian prior on each log rate parameter
                 'alpha': (mean_log_alpha, std_log_alpha),
                 'beta': (mean_log_beta, std_log_beta),
                 'gamma': (mean_log_gamma, std_log_gamma)
             })
where we assume the log rate parameters are Gaussian random variables. The prior encodes your own belief about the range in which the values should lie (see the sketch after the second suggestion below for one way to pick these numbers).
2. Adjust the learning rates by passing a config dictionary to the train function:

config = {
    'learning_rate': <some value>,
    'learning_rate_ode': <some value>,
    'learning_rate_post': <some value>
}
vae.train(adata, config=config)
We usually set the learning_rate and learning_rate_post to roughly 1e-4 to 1e-3, and the learning_rate_ode to 2 to 10 times learning_rate.
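To make that rule of thumb concrete (the values below are illustrative, not tuned recommendations for your data):

base_lr = 2e-4
config = {
    'learning_rate': base_lr,
    'learning_rate_post': base_lr,
    'learning_rate_ode': 5 * base_lr  # 2 to 10 times the base learning rate
}
vae.train(adata, config=config)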
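And returning to the rate_prior in the first suggestion: if you have rough per-gene rate estimates, you could derive the prior moments from their logs. A hypothetical sketch (alpha_hat is a placeholder for your own estimates, e.g. from a preliminary steady-state fit; the same recipe applies to beta and gamma):

import numpy as np

# Placeholder estimates; replace with your own per-gene rates
alpha_hat = np.array([0.5, 1.2, 2.0])
log_alpha = np.log(alpha_hat)
mean_log_alpha = float(log_alpha.mean())   # prior mean of log(alpha)
std_log_alpha = float(log_alpha.std())     # prior std of log(alpha)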
I think we do need to improve the model by better handling some edge cases in the future. In addition, the current version has more flexibility for tuning but is less easy to use due to the large number of hyperparameters; there is some trade-off.
Thank you for the feedback!
I'm currently working with your fantastic veloVAE tool and exploring the RNA velocity dynamics in my single-cell data. I'm keen on utilizing all of my genes for this analysis without filtering any out.
However, when I run the VAE with the command vae = vv.VAE(adata, tmax=20, dim_z=5, device='cuda:0'), I encounter an error. Interestingly, this error seems to disappear when I filter my genes using either scv.pp.filter_genes_dispersion(adata) or vv.preprocess(adata, n_gene=1000). I was wondering if there's a way to perform the analysis using every single gene in my dataset. Would you happen to have any suggestions on how to approach this? Your guidance would be greatly appreciated!
Thank you for your time and for developing such a valuable tool for the community.
Best regards!