morris-lab / CellOracle

This is the alpha version of the CellOracle package

gene selection from loaded seurat object #167

Closed A-legac45 closed 9 months ago

A-legac45 commented 9 months ago

Hello,

I have an AnnData object with n_obs × n_vars = 3980 × 32043
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'nCount_ATAC_CELLranger', 'nFeature_ATAC_CELLranger', 'nucleosome_signal', 'nucleosome_percentile', 'TSS.enrichment', 'TSS.percentile', 'percent.mt', 'high.tss', 'nCount_peaks_MACS2', 'nFeature_peaks_MACS2', 'nucleosome_group', 'RNA_snn_res.0.9', 'seurat_clusters'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'highly_variable'
    uns: 'pca'
    obsm: 'X_pca', 'X_umap.rna'
    varm: 'PCs'
    layers: 'counts', 'data'
    obsp: 'nn', 'snn'

It was originally a Seurat object.

I am not able to run this part:

# Try to load variable gene list
try:
    variable_gene_list1 = adata.var.variable_gene.values

# If your scRNA-seq data does not include variable gene information, please calculate variable genes now.
except:
    variable_gene_list = sc.pp.filter_genes_dispersion(adata.X, flavor='seurat', n_top_genes=3000, log=True)

# Select genes
adata = adata[:, variable_gene_list]

File ~/opt/anaconda3/envs/celloracle_env/lib/python3.8/site-packages/scanpy/preprocessing/_deprecated/highly_variable_genes.py:203, in filter_genes_dispersion(data, flavor, min_disp, max_disp, min_mean, max_mean, n_bins, n_top_genes, log, subset, copy)
    199 dispersion_norm = dispersion_norm[~np.isnan(dispersion_norm)]
    200 dispersion_norm[
    201     ::-1
    202 ].sort()  # interestingly, np.argpartition is slightly slower
--> 203 disp_cut_off = dispersion_norm[n_top_genes - 1]
    204 gene_subset = df['dispersion_norm'].values >= disp_cut_off
    205 logg.debug(
    206     f'the {n_top_genes} top genes correspond to a '
    207     f'normalized dispersion cutoff of {disp_cut_off}'
    208 )

IndexError: index 2999 is out of bounds for axis 0 with size 0

KenjiKamimoto-ac commented 9 months ago

The error you faced comes from a Scanpy function, not from CellOracle. I would highly recommend getting used to single-cell data analysis with Scanpy. https://scanpy.readthedocs.io/en/stable/
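
One possible workaround, sketched under assumptions: the AnnData summary above lists `highly_variable` and `vst.variable` columns in `adata.var`, while the snippet looks for `variable_gene`, so the `except` branch is the one that runs. If those Seurat-derived flags are trustworthy, genes could be selected from them directly instead of being recomputed. The helper name `pick_variable_genes` and the toy table below are hypothetical, and the sketch assumes the flag column holds booleans or 0/1 values.

```python
import pandas as pd

def pick_variable_genes(var, candidates=("variable_gene", "highly_variable", "vst.variable")):
    """Return gene names flagged as variable in the first candidate column found.

    Assumes the column holds booleans or 0/1 values (not strings like "TRUE").
    """
    for col in candidates:
        if col in var.columns:
            flags = var[col].astype(bool)
            return var.index[flags].tolist()
    raise KeyError(f"none of {candidates} found in var columns")

# Toy stand-in for adata.var (hypothetical gene names and flags)
var = pd.DataFrame(
    {"highly_variable": [True, False, True]},
    index=["Gata1", "Actb", "Spi1"],
)
genes = pick_variable_genes(var)
print(genes)

# With a real object this might be used as (assumption, not tested here):
# adata = adata[:, pick_variable_genes(adata.var)]
```

The number of flagged genes depends on how Seurat was run, so it may not match the 3000 requested in the snippet.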

A-legac45 commented 9 months ago

Do you think this problem occurs when converting a Seurat v4 RDS file to h5ad format?

Thanks
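
Whether the conversion is responsible is not established in this thread. The traceback does show that every normalized dispersion was NaN and got dropped (the array had size 0), so a quick look at what ended up in `adata.X` after conversion might help narrow it down. The following is a minimal diagnostic sketch for a dense matrix; `summarize_matrix` is a hypothetical helper, and for a sparse `adata.X` one would likely call `.toarray()` first.

```python
import numpy as np

def summarize_matrix(X):
    """Report basic properties of a dense expression matrix."""
    X = np.asarray(X, dtype=float)
    finite = X[np.isfinite(X)]  # ignore NaN/inf for the value checks
    return {
        "shape": X.shape,
        "n_nan": int(np.isnan(X).sum()),
        "has_negative": bool((finite < 0).any()),
        "integer_like": bool(np.all(finite == np.round(finite))),
    }

# Toy matrix standing in for adata.X (hypothetical values)
toy = np.array([[0.0, 2.0], [1.0, np.nan]])
report = summarize_matrix(toy)
print(report)
```

Running the same check on `adata.layers['counts']` and `adata.layers['data']`, both listed in the summary above, would show which of them `adata.X` resembles.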



lpuche commented 5 months ago

Hi @A-legac45, how did you manage to convert the Seurat object from rds to h5ad format and keep obsp: 'nn', 'snn'? I am struggling a bit with it. Thanks!