welch-lab / VeloVAE

Deep Generative Modeling of RNA Velocity
BSD 3-Clause "New" or "Revised" License

All parameters must be greater than 0 #5

Closed liyarubio closed 6 months ago

liyarubio commented 7 months ago

I'm currently working with your fantastic veloVAE tool and exploring the RNA velocity dynamics in my single-cell data. I'm keen on utilizing all of my genes for this analysis without filtering any out.

However, when I run the VAE with the command vae = vv.VAE(adata, tmax=20, dim_z=5, device='cuda:0'), I encounter an error:

Estimating ODE parameters...
  0%|          | 0/1000 [00:00<?, ?it/s]
100%|██████████| 1000/1000 [00:05<00:00, 177.02it/s]
Detected 331 velocity genes.
Estimating the variance...
100%|██████████| 1000/1000 [00:00<00:00, 6984.42it/s]
Initialization using the steady-state and dynamical models.
Reinitialize the regular ODE parameters based on estimated global latent time.
100%|██████████| 1000/1000 [00:00<00:00, 1513.89it/s]
3 clusters detected based on gene co-expression.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[14], line 4
      2 torch.manual_seed(SEED)
      3 np.random.seed(SEED)
----> 4 vae = vv.VAE(adata, 
      5              tmax=20, 
      6              dim_z=5, 
      7              device='cuda:0')

File [~/miniconda3/envs/veloVAE/lib/python3.8/site-packages/velovae/model/vae.py:625](https://file+.vscode-resource.vscode-cdn.net/media/liyaru/LYR/Diff_change/6_cellDancer_simulation/~/miniconda3/envs/veloVAE/lib/python3.8/site-packages/velovae/model/vae.py:625), in VAE.__init__(self, adata, tmax, dim_z, dim_cond, device, hidden_size, full_vb, discrete, init_method, init_key, tprior, init_ton_zero, filter_gene, count_distribution, std_z_prior, checkpoints, rate_prior, **kwargs)
    622 self.dim_z = dim_z
    623 self.enable_cvae = dim_cond > 0
--> 625 self.decoder = decoder(adata,
    626                        tmax,
    627                        self.train_idx,
    628                        dim_z,
    629                        N1=hidden_size[2],
    630                        N2=hidden_size[3],
    631                        full_vb=full_vb,
    632                        discrete=discrete,
    633                        init_ton_zero=init_ton_zero,
    634                        filter_gene=filter_gene,
    635                        device=self.device,
    636                        init_method=init_method,
...
   1346 elif alpha.ndim != 1:
   1347     raise ValueError("Parameter vector 'a' must be one dimensional, "
   1348                      "but a.shape = %s." % (alpha.shape, ))

ValueError: All parameters must be greater than 0

Interestingly, this error seems to disappear when I filter my genes using either scv.pp.filter_genes_dispersion(adata) or vv.preprocess(adata, n_gene=1000).

I was wondering if there's a way to perform the analysis using every single gene in my dataset. Would you happen to have any suggestions on how to approach this? Your guidance would be greatly appreciated!

Thank you for your time and for developing such a valuable tool for the community.

Best regards!

g-yichen commented 7 months ago

First, thank you for the feedback! Based on your error message, I can tell the problem happened while sampling a probability from a Dirichlet prior using scipy.stats.dirichlet. This probability represents whether a gene is repressed-only, a feature we added later on. In the latest update, I added value clipping to prevent zeros, so if you download the source code directly using git clone, the error should disappear. However, if you installed VeloVAE using pip, the update is not included yet; we will upload a new version. If you keep encountering this problem with the latest code on GitHub, please let me know.
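The fix described above can be sketched as follows. This is a hypothetical illustration, not VeloVAE's actual code; the `EPS` floor and function name are assumptions. The key point is that scipy.stats.dirichlet rejects any concentration parameter that is not strictly positive, so clipping before sampling avoids the `ValueError`.

```python
import numpy as np
from scipy.stats import dirichlet

EPS = 1e-10  # assumed clipping floor, not VeloVAE's actual constant

def sample_dirichlet_safe(alpha, seed=0):
    """Clip the concentration vector so scipy.stats.dirichlet does not
    raise 'All parameters must be greater than 0'."""
    alpha = np.clip(np.asarray(alpha, dtype=float), EPS, None)
    rng = np.random.default_rng(seed)
    return dirichlet.rvs(alpha, size=1, random_state=rng)[0]

# Without clipping, the zero entry below would raise ValueError.
p = sample_dirichlet_safe([0.0, 1.0, 2.0])
```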

At least in my own experience, gene filtering is an important step. Due to the limited quality of single-cell RNA-seq data, you are likely to get many genes with very low counts or not enough variation; I have seen genes with fewer than 10 read counts across all cells. VeloVAE initializes the rate parameters by first fitting a steady-state model to each gene, so these uninformative genes can produce poor initializations and degrade model performance.
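As a minimal sketch of the kind of pre-filter described above (this is not VeloVAE's or scVelo's own filtering code; the function name and threshold are illustrative): drop genes whose total spliced plus unspliced counts across all cells fall below a threshold, since such genes give unreliable steady-state fits.

```python
import numpy as np

def keep_informative_genes(spliced, unspliced, min_counts=10):
    """Return a boolean mask over genes with enough total signal
    (summed spliced + unspliced counts across all cells)."""
    total = np.asarray(spliced).sum(axis=0) + np.asarray(unspliced).sum(axis=0)
    return total >= min_counts

# Toy example: 3 cells x 4 genes; genes 1 and 2 have almost no reads.
s = np.array([[5, 0, 1, 8], [4, 0, 0, 7], [6, 1, 0, 9]])
u = np.array([[2, 0, 0, 3], [1, 0, 1, 2], [3, 0, 0, 4]])
mask = keep_informative_genes(s, u)  # per-gene totals: 21, 1, 2, 33
```

In practice the same effect is achieved with the documented entry points, e.g. `vv.preprocess(adata, n_gene=1000)` as mentioned earlier in the thread.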

Thank you!

liyarubio commented 7 months ago

I wanted to extend my heartfelt gratitude for your prompt and thorough response. Following your guidance, I reinstalled veloVAE from GitHub and was delighted to find that the previous error messages disappeared.

However, I've encountered a new challenge that I'm hoping you might be able to assist with. Currently, when I inspect the results in adata.layers['vae_velocity'], I've noticed that all the values are nan. I'm a bit stumped on how to proceed from here and would greatly appreciate any suggestions you might have.

Thank you so much for your time and help! Looking forward to your insights.

Best regards!

g-yichen commented 7 months ago

Hi there. Getting nan values is most likely caused by unstable training. Although I cannot give you a definitive answer as to why this happened, I have encountered this problem when the rate parameters of the ODE exploded during training.

I've come up with two things you can try:

  1. One possible solution is to add the full_vb argument when you create the VAE model: vae = vv.VAE(adata, tmax=20, dim_z=5, full_vb=True, device='cuda:0'). This adds some regularization to the parameters by assuming they follow a prior distribution.

There is a default prior we use, but you can also manually specify a prior distribution of the rate parameters by adding

vae = vv.VAE(adata, 
              tmax=20, 
              dim_z=5,
              full_vb=True,
              device='cuda:0',
              rate_prior={
                  'alpha': (mean_log_alpha, std_log_alpha),
                  'beta': (mean_log_beta, std_log_beta),
                  'gamma': (mean_log_gamma, std_log_gamma)
              })

where we assume the log rate parameters are Gaussian random variables. The prior is your own belief about what range the values should lie in.
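One hedged way to turn such a belief into the (mean, std) pairs above: pick a range you expect each rate to lie in and place most of the Gaussian prior mass (on the log scale) inside it. The helper below and the chosen range are assumptions for illustration, not VeloVAE defaults.

```python
import numpy as np

def log_normal_prior(low, high):
    """Map a believed [low, high] range for a rate parameter to
    (mean_log, std_log), so that +/- 2 std on the log scale spans the range
    (~95% of the prior mass)."""
    mean_log = 0.5 * (np.log(low) + np.log(high))
    std_log = (np.log(high) - np.log(low)) / 4.0
    return mean_log, std_log

# Example: believe each rate lies roughly between 0.1 and 10 (per unit time).
rate_prior = {
    'alpha': log_normal_prior(0.1, 10.0),
    'beta':  log_normal_prior(0.1, 10.0),
    'gamma': log_normal_prior(0.1, 10.0),
}
```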

  2. Change the learning rate. We use a default learning rate that increases as data sparsity increases. You can add the learning rates to the configuration when calling the train function:
config = {
    'learning_rate': <some value>,
    'learning_rate_ode': <some value>,
    'learning_rate_post': <some value>
}
vae.train(adata, config=config)

We usually set learning_rate and learning_rate_post to around 1e-4 to 1e-3, and learning_rate_ode to 2 to 10 times learning_rate.
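As a concrete instance of that guidance (the specific values here are hypothetical, not VeloVAE defaults):

```python
# Pick a base learning rate in the suggested [1e-4, 1e-3] range and set the
# ODE learning rate to a small multiple of it (2-10x, per the guidance above).
base_lr = 2e-4
config = {
    'learning_rate': base_lr,
    'learning_rate_post': base_lr,
    'learning_rate_ode': 5 * base_lr,
}
```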

I think we do need to improve the model by better handling some edge cases in the future. In addition, the current version offers more flexibility for tuning but is less easy to use due to the large number of hyperparameters; there is some trade-off.

Thank you for the feedback!