morris-lab / CellOracle

This is the alpha version of the CellOracle package

oracle.import_TF_data(TF_info_matrix=base_GRN) doesn't find overlap between base GRN and my scRNAseq data #120

Closed lpuche closed 1 year ago

lpuche commented 1 year ago

Hello! I'm trying to run a GRN network analysis on my own dataset, which consists of a single-cell Multiome experiment. I am following the pipeline at https://morris-lab.github.io/CellOracle.documentation/notebooks/04_Network_analysis/Network_analysis_with_Paul_etal_2015_data.html. However, when I create the Oracle object with my data (after performing the previous steps as described in the pipeline), I get a warning:

"Found No overlap between TF info (base GRN) and your scRNA-seq data. Please check your data format and species."

The scRNAseq data is an AnnData object with the following structure:

    AnnData object with n_obs × n_vars = 7615 × 3000
        obs: '....... 'Ident.celltype4', ........
        var: 'n_counts', 'var_names', 'variable_gene'
        obsm: 'X_diffmap', 'X_draw_graph_fa', 'X_integrated_lsi', 'X_pca', 'X_rna.umap', 'X_spca', 'X_umap', 'X_umap.atac', 'X_wnn.umap'
        layers: 'raw_count'

The base_GRN that I try to import is:

    Out[131]:
                            peak_id  gene_short_name  ...  Zscan31  Zscan4
    0     chr10_100486738_100488164          Gm47325  ...      0.0     0.0
    1       chr10_10110040_10110327    4930598N05Rik  ...      0.0     0.0
    2     chr10_101516838_101517271           Mgat4c  ...      0.0     0.0
    3     chr10_102155038_102155266          Gm47031  ...      0.0     0.0
    4     chr10_103402140_103402665          Gm47220  ...      0.0     0.0
    ...                         ...              ...  ...      ...     ...
    8074       chrY_1009802_1010842          Eif2s3y  ...      0.0     0.0
    8075       chrY_1244876_1246221              Uty  ...      0.0     0.0
    8076         chrY_872372_873585          Gm28587  ...      0.0     0.0
    8077         chrY_872372_873585        Rhoay-ps3  ...      0.0     0.0
    8078         chrY_897165_898301            Kdm5d  ...      0.0     0.0

The Oracle object after trying to import the anndata and the base_GRN objects is:

    Meta data
        celloracle version used for instantiation: 0.12.0
        n_cells: 7615
        n_genes: 3000
        cluster_name: Ident.celltype4
        dimensional_reduction_name: X_draw_graph_fa
        n_target_genes_in_TFdict: 7480 genes
        n_regulatory_in_TFdict: 1094 genes
        n_regulatory_in_both_TFdict_and_scRNA-seq: 0 genes
        n_target_genes_both_TFdict_and_scRNA-seq: 0 genes
        k_for_knn_imputation: NA
    Status
        Gene expression matrix: Ready
        BaseGRN: Ready
        PCA calculation: Not finished
        Knn imputation: Not finished
        GRN calculation for simulation: Not finished

Here you can see that both of these parameters report 0 genes:

    n_regulatory_in_both_TFdict_and_scRNA-seq: 0 genes
    n_target_genes_both_TFdict_and_scRNA-seq: 0 genes
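One way to narrow down a zero overlap like this is to compare the gene names on both sides directly, before building the Oracle object. The sketch below is only an illustration: `overlap_report` is a hypothetical helper (not part of CellOracle), and it assumes the base GRN has the layout shown above, with `peak_id` and `gene_short_name` as annotation columns and every other column named after a TF. The toy inputs stand in for `adata.var_names` and `base_GRN`.

```python
import pandas as pd


def overlap_report(gene_names, base_grn):
    """Count how many scRNA-seq gene names also appear in the base GRN.

    Hypothetical helper for diagnosis only; not a CellOracle function.
    """
    genes = set(map(str, gene_names))
    # Target genes are listed in the 'gene_short_name' column.
    targets = set(base_grn["gene_short_name"].astype(str))
    # Assumption: every column other than the two annotation columns is a TF.
    tfs = set(base_grn.columns) - {"peak_id", "gene_short_name"}
    # A case-insensitive count helps spot a capitalisation or species mismatch.
    genes_lower = {g.lower() for g in genes}
    targets_lower = {t.lower() for t in targets}
    return {
        "targets_in_data": len(genes & targets),
        "tfs_in_data": len(genes & tfs),
        "targets_in_data_ignoring_case": len(genes_lower & targets_lower),
    }


# Toy stand-ins for base_GRN and adata.var_names.
toy_grn = pd.DataFrame(
    {
        "peak_id": ["chr10_101516838_101517271", "chrY_1244876_1246221"],
        "gene_short_name": ["Mgat4c", "Uty"],
        "Zscan31": [0.0, 0.0],
        "Zscan4": [0.0, 1.0],
    }
)
toy_genes = ["Mgat4c", "Zscan4", "UTY"]

report = overlap_report(toy_genes, toy_grn)
print(report)
```

With real data the call would be `overlap_report(adata.var_names, base_GRN)`. If the exact counts are 0 but the case-insensitive count is not, the gene symbols probably differ only in capitalisation; if everything is 0, it is worth printing a few entries of `adata.var_names` to see what they actually contain.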

Please, could you tell me how I can find an overlap between these 2 objects? Thanks in advance,

Best regards, Lorenzo

lpuche commented 1 year ago

Hi! I have found the issue. It was the AnnData object itself: when I loaded it into my environment in Spyder with sc.read / sc.read_h5ad, the object was missing its uns annotation. I have now rerun everything from the beginning with all the variables loaded, and it works fine.

Best, Lorenzo
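For anyone who hits the same warning, a quick look at the object right after loading can show whether it came through complete. This is a minimal sketch, not a CellOracle or scanpy function: `summarize_loaded` is a hypothetical helper that only relies on the `var_names` and `uns` attributes every AnnData object has, and the `SimpleNamespace` below is a toy stand-in so the example runs without a data file.

```python
from types import SimpleNamespace


def summarize_loaded(adata, n_preview=5):
    """Summarise the parts of a freshly loaded AnnData-like object that
    matter for this problem. Hypothetical helper for illustration only."""
    return {
        "n_vars": len(adata.var_names),
        # First few gene names, to check they look like gene symbols.
        "var_names_preview": list(adata.var_names[:n_preview]),
        # Keys stored in .uns; an empty list means nothing was loaded there.
        "uns_keys": sorted(adata.uns.keys()),
        "uns_is_empty": len(adata.uns) == 0,
    }


# Toy stand-in for an object that was loaded without its uns annotation.
fake_adata = SimpleNamespace(var_names=["Mgat4c", "Uty", "Kdm5d"], uns={})

summary = summarize_loaded(fake_adata)
print(summary)
```

On a real object the call would be `summarize_loaded(sc.read_h5ad(path))`, where `path` is your own file. Which `uns` keys should be present depends on how the object was created, so the check only reports what is there.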