theislab / scgen

Single cell perturbation prediction
https://scgen.readthedocs.io
GNU General Public License v3.0

batch_removal() function does not return cell names, only gene names, in the normalized matrix #4

Closed nhuhoa closed 5 years ago

nhuhoa commented 5 years ago

Dear Naghipourfar and M0hammadL, I encountered another bug in scGen, in the batch_removal function. I tested this function a month ago, so I am not sure whether you have fixed it since; I just want to let you know here. The batch_removal() function in utils.py returns a normalized matrix that has gene names but no cell names. For tSNE or UMAP visualization and for clustering of this normalized matrix, that is not a problem. But to use these cells in downstream analysis, we need the cell names.

I tested the program with 2 batches. I worked around this bug by adding an observation column, adata.obs['cell_name'], to keep the cell names. But I think you can do it better.

    adata_latent.obs["cell_name"] = adata.obs["cell_name"].tolist()
    corrected.obs["cell_name"] = all_shared_ann.obs["cell_name"].tolist()
    corrected.obs["cell_name"] = all_shared_ann.obs["cell_name"].tolist() + all_not_shared_ann.obs["cell_name"].tolist()
    corrected.obs_names = corrected.obs['cell_name']
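To show why the order of concatenation matters in the workaround above, here is a minimal, dependency-free sketch. It is not scGen code. It assumes, judging only from the variable names all_shared_ann and all_not_shared_ann, that the corrected matrix is assembled from a "shared" group of cells followed by a "not shared" group. The helper split_by_shared, the cell barcodes, the cell types, and the expression values are all hypothetical.

```python
# Hypothetical stand-in for the split-and-concatenate step; not scGen code.
def split_by_shared(names, labels, shared_labels):
    """Split cell names into (shared, not_shared), keeping order within each group."""
    shared = [n for n, lab in zip(names, labels) if lab in shared_labels]
    not_shared = [n for n, lab in zip(names, labels) if lab not in shared_labels]
    return shared, not_shared


# Made-up cells from two batches (suffix -1 / -2) with made-up expression rows.
cell_names = ["AAAC-1", "AAAG-1", "AACT-2", "AAGT-2"]
cell_types = ["T", "B", "T", "NK"]
expression = {
    "AAAC-1": [1.0, 0.0],
    "AAAG-1": [0.0, 2.0],
    "AACT-2": [3.0, 0.5],
    "AAGT-2": [0.2, 4.0],
}
shared_types = {"T"}  # assumed: cell types present in every batch

shared, not_shared = split_by_shared(cell_names, cell_types, shared_types)

# The rows are stacked shared-first, then not-shared ...
corrected_rows = [expression[n] for n in shared] + [expression[n] for n in not_shared]
# ... so the names must be concatenated in exactly the same order.
corrected_names = shared + not_shared

# Each name can now be matched back to its own row.
lookup = dict(zip(corrected_names, corrected_rows))
```

The row order of the result differs from the input order, so reusing the original list of names unchanged would mislabel cells. Here the names play the role that obs_names plays on the corrected object.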

Thanks, Best, Hoa Tran