Closed by Hrovatin 9 months ago
Indeed, changing the above line to `corrected.obsm["latent"] = all_corrected_data[corrected.obs_names,:].X` fixes the issue.
I also needed to add `.detach()`, giving `self.module.generative(torch.Tensor(all_corrected_data.X))["px"].cpu().detach().numpy()`.
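To illustrate why indexing by `obs_names` matters, here is a minimal sketch in plain NumPy. It is not scGen code: the arrays, the cell names and the helper `align_rows` are hypothetical stand-ins for `corrected.obs_names` and for the rows of `all_corrected_data`. It only shows the general idea of reordering rows by name instead of trusting their position.

```python
import numpy as np

# Hypothetical stand-ins (not scGen objects):
# obs_names    ~ corrected.obs_names (the order the labels are stored in)
# latent_names ~ row names of all_corrected_data (a different order)
# latent       ~ all_corrected_data.X
obs_names = ["cell_a", "cell_b", "cell_c", "cell_d"]
latent_names = ["cell_c", "cell_a", "cell_d", "cell_b"]
latent = np.array([[2.0, 2.0], [0.0, 0.0], [3.0, 3.0], [1.0, 1.0]])


def align_rows(values, value_names, target_names):
    """Reorder the rows of `values` so that row i belongs to target_names[i]."""
    position = {name: i for i, name in enumerate(value_names)}
    missing = [name for name in target_names if name not in position]
    if missing:
        # Fail loudly rather than silently pairing cells with the wrong rows.
        raise KeyError(f"names without a matching row: {missing}")
    order = np.array([position[name] for name in target_names])
    return values[order]


# Positional assignment keeps the latent order, so row 0 ("cell_c")
# would be paired with the label of "cell_a".
misaligned = latent

# Name-based assignment puts every row next to its own cell.
aligned = align_rows(latent, latent_names, obs_names)
```

With AnnData, `all_corrected_data[corrected.obs_names, :]` performs the same kind of name-based selection, which is why the one-line change above resolves the mismatch.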
Hi Karin,
Thanks for pointing this out. Could you kindly open a PR with that change so we can merge it?
By the way, you can do the same thing with cpa, see here:
https://cpa-tools.readthedocs.io/en/latest/tutorials/Batch_correction_in_expression_space.html
The PR is here: https://github.com/theislab/scgen/pull/87. I would just merge it despite black failing, as I didn't introduce any major formatting changes apart from the 4 lines mentioned above.
I tried to integrate some of my own data and then to reproduce the example from https://scgen.readthedocs.io/en/stable/tutorials/scgen_batch_removal.html, but it seems that the latent data and the obs are not joined correctly, so cells' latent embeddings are paired with the wrong labels.
This is the result from the tutorial, with a clear mismatch between cell type clusters and cell labels: ![image](https://github.com/theislab/scgen/assets/47607471/be5da960-885a-49e2-a230-901d49de27f2)
I think the reason could be in https://github.com/theislab/scgen/blob/06084773e56cad0dec340138441dee47a39af752/scgen/_scgen.py#L315C16-L315C16, as you don't check that the indices match, but I haven't tested it, so there may be a different reason.
scGEN version: 2.1.1
P.S. The tutorial also has other mistakes, like `cell_type` -> `celltype`, and `use_rep` is missing in the neighbours computation for the latent representation.