Closed 321356766 closed 2 years ago
Can you provide more details on the dataset that fails? How many genes/cells? Is this an integration of scRNA-seq and CITE-seq data in which you're adding "missing" measurements (represented as 0s) for the scRNA-seq data?
Here are some things you can try in the meantime:
- `empirical_protein_background_prior=False` on init of TOTALVI; this prior can be misspecified if there are missing proteins
- `vae.train(lr=2e-3)`, for example
- `latent_distribution="ln"`; use `metric="hellinger"` or `metric="correlation"` for the neighbors graph for better visualization

Thanks for your response. The dataset is around 50000 cells and I am trying to create a joint representation of 4000 HVGs with 150+ ADTs. Interestingly, larger datasets than this work flawlessly. It is one coherent dataset; I am not trying to integrate two datasets, one with missing measurements. I will try your suggestions and report back. Many thanks!
I see. You may also want to check whether any (cell, protein) values in the protein matrix are extremely large. This can be an issue with CITE-seq.
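For anyone landing here later, a minimal sketch of such a check, assuming the protein counts are available as a dense cells x proteins array. The helper name `flag_extreme_protein_counts` and the `ratio=100` threshold are hypothetical choices for illustration, not part of scvi-tools; the data below is synthetic.

```python
import numpy as np


def flag_extreme_protein_counts(protein_counts, protein_names, ratio=100.0):
    """Flag proteins whose maximum count is far above their 99th percentile.

    Hypothetical helper: `ratio` is an arbitrary heuristic threshold.
    Returns a list of (protein_name, max_count, p99) tuples.
    """
    counts = np.asarray(protein_counts, dtype=float)
    maxima = counts.max(axis=0)               # per-protein maximum over cells
    p99 = np.percentile(counts, 99, axis=0)   # per-protein 99th percentile
    flagged = []
    for name, mx, q in zip(protein_names, maxima, p99):
        # max(q, 1.0) keeps proteins with near-zero counts from being
        # flagged just because their 99th percentile is 0.
        if mx > ratio * max(q, 1.0):
            flagged.append((name, float(mx), float(q)))
    return flagged


# Synthetic example: 1000 cells, 3 proteins, one injected extreme value.
rng = np.random.default_rng(0)
counts = rng.poisson(50, size=(1000, 3))
counts[10, 1] = 1_000_000
names = ["CD3", "CD4", "CD8"]

flagged = flag_extreme_protein_counts(counts, names)
for name, mx, q in flagged:
    print(f"{name}: max={mx:.0f}, 99th percentile={q:.0f}")
```

With an AnnData object, the input would be the protein matrix itself, for example `adata.obsm["protein_expression"]` if that is the key your data uses (an assumption here; substitute your own key, and convert a DataFrame with `.to_numpy()`).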
@321356766 did you make any progress on this?
Closing due to inactivity
I am receiving an error from vae.train() when using totalVI. It runs fine with scVI using only the GEX data. Changing latent_distribution to "ln" during totalVI model setup seems to bypass the error, but the CITE-seq/scRNA-seq results do not look good.
Any insight into what could be causing this error would be much appreciated. Similar datasets (different cell donors) run fine.
Thanks