ykat0 / capital

BSD 3-Clause "New" or "Revised" License

out of range for plotting gene expression trends #4

Open TerezaClarence opened 1 year ago

TerezaClarence commented 1 year ago

Dear CAPITAL developers,

I am reaching out again, this time about plotting gene expression trends for a custom dataset. When I run the tutorial on the example data, everything works (except for plotting the trajectory tree, as described in the previous issue).

However, when I follow exactly the same approach on my own data, I am unable to plot gene expression trends for any of the selected genes for which the similarity score was calculated.

adata1 = sc.read("./all_preprocessed_cdDNA_RNA3.h5ad")
adata1

AnnData object with n_obs × n_vars = 60767 × 17608 obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'gex_barcode', 'atac_barcode', 'is_cell', 'excluded_reason', 'gex_raw_reads', 'gex_mapped_reads', 'gex_conf_intergenic_reads', 'gex_conf_exonic_reads', 'gex_conf_intronic_reads', 'gex_conf_exonic_unique_reads', 'gex_conf_exonic_antisense_reads', 'gex_conf_exonic_dup_reads', 'gex_exonic_umis', 'gex_conf_intronic_unique_reads', 'gex_conf_intronic_antisense_reads', 'gex_conf_intronic_dup_reads', 'gex_intronic_umis', 'gex_conf_txomic_unique_reads', 'gex_umis_count', 'gex_genes_count', 'atac_raw_reads', 'atac_unmapped_reads', 'atac_lowmapq', 'atac_dup_reads', 'atac_chimeric_reads', 'atac_mitochondrial_reads', 'atac_fragments', 'atac_TSS_fragments', 'atac_peak_region_fragments', 'atac_peak_region_cutsites', 'percent.mt', 'nCount_ATAC', 'nFeature_ATAC', 'sex', 'age', 'mitoRatio', 'percent.ribo', 'riboRatio', 'percent.hb', 'log10GenesPerUMI', 'nucleosome_signal', 'nucleosome_percentile', 'TSS.enrichment', 'TSS.percentile', 'pct_reads_in_peaks', 'blacklist_fraction', 'brain', 'brain.bank', 'nCount_SCT', 'nFeature_SCT', 'SCT_snn_res.0.2', 'seurat_clusters', 'ATAC_snn_res.0.2', 'SCT.weight', 'ATAC.weight', 'wsnn_res.0.2', 'RNA_snn_res.0.2', 'pANN_0.25_0.09_1648', 'DF.classifications_0.25_0.09_1648', 'pANN_0.25_0.09_1940', 'DF.classifications_0.25_0.09_1940', 'pANN_0.25_0.09_2285', 'DF.classifications_0.25_0.09_2285', 'pANN_0.25_0.09_1241', 'DF.classifications_0.25_0.09_1241', 'SCT_snn_res.0.4', 'ATAC_snn_res.0.4', 'wsnn_res.0.4', 'm1c_labels_subclass', 'age.group', 'seurat_clusters_origBB', 'm1c_labels_subclass_origBB', 'anno_clus', 'anno_clus_origBB', 'anno_clus_origBB2', 'SCT.dream2BB.weight', 'ATAC.dream2BB.weight', 'seurat_clusters_dream.origBB', 'seurat_clusters_dreamorigBB', 'm1c_labels_subclass.dreamBB', 'anno_clus_dreamBB', 'anno_clus_dreamorigBB', 'OPC_progenitor1', 'OPC_precursor1', 'Oligo_precursor1', 'preOligo1', 'immOligo1', 'matOligo_non1', 'matOligo_mye1', 
'TypeI_opc1', 'TypeII_opc1', 'TypeI_Olig1', 'TypeII_Olig1', 'anno_clus_dreamorigBB_bioarx', 'anno_clus_dreamorigBB_v2' var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable'

opc_oligACC = adata1[adata1.obs['brain'].isin(['ACC'])]
opc_oligDLPFC = adata1[adata1.obs['brain'].isin(['DLPFC'])]

random.seed(11)
cp.tl.preprocessing(opc_oligACC, n_Top_genes=2000, N_pcs=30)
cp.tl.preprocessing(opc_oligDLPFC, n_Top_genes=2000, N_pcs=30)
opc_oligACC

AnnData object with n_obs × n_vars = 15527 × 2000 obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'gex_barcode', 'atac_barcode', 'is_cell', 'excluded_reason', 'gex_raw_reads', 'gex_mapped_reads', 'gex_conf_intergenic_reads', 'gex_conf_exonic_reads', 'gex_conf_intronic_reads', 'gex_conf_exonic_unique_reads', 'gex_conf_exonic_antisense_reads', 'gex_conf_exonic_dup_reads', 'gex_exonic_umis', 'gex_conf_intronic_unique_reads', 'gex_conf_intronic_antisense_reads', 'gex_conf_intronic_dup_reads', 'gex_intronic_umis', 'gex_conf_txomic_unique_reads', 'gex_umis_count', 'gex_genes_count', 'atac_raw_reads', 'atac_unmapped_reads', 'atac_lowmapq', 'atac_dup_reads', 'atac_chimeric_reads', 'atac_mitochondrial_reads', 'atac_fragments', 'atac_TSS_fragments', 'atac_peak_region_fragments', 'atac_peak_region_cutsites', 'percent.mt', 'nCount_ATAC', 'nFeature_ATAC', 'sex', 'age', 'mitoRatio', 'percent.ribo', 'riboRatio', 'percent.hb', 'log10GenesPerUMI', 'nucleosome_signal', 'nucleosome_percentile', 'TSS.enrichment', 'TSS.percentile', 'pct_reads_in_peaks', 'blacklist_fraction', 'brain', 'brain.bank', 'nCount_SCT', 'nFeature_SCT', 'SCT_snn_res.0.2', 'seurat_clusters', 'ATAC_snn_res.0.2', 'SCT.weight', 'ATAC.weight', 'wsnn_res.0.2', 'RNA_snn_res.0.2', 'pANN_0.25_0.09_1648', 'DF.classifications_0.25_0.09_1648', 'pANN_0.25_0.09_1940', 'DF.classifications_0.25_0.09_1940', 'pANN_0.25_0.09_2285', 'DF.classifications_0.25_0.09_2285', 'pANN_0.25_0.09_1241', 'DF.classifications_0.25_0.09_1241', 'SCT_snn_res.0.4', 'ATAC_snn_res.0.4', 'wsnn_res.0.4', 'm1c_labels_subclass', 'age.group', 'seurat_clusters_origBB', 'm1c_labels_subclass_origBB', 'anno_clus', 'anno_clus_origBB', 'anno_clus_origBB2', 'SCT.dream2BB.weight', 'ATAC.dream2BB.weight', 'seurat_clusters_dream.origBB', 'seurat_clusters_dreamorigBB', 'm1c_labels_subclass.dreamBB', 'anno_clus_dreamBB', 'anno_clus_dreamorigBB', 'OPC_progenitor1', 'OPC_precursor1', 'Oligo_precursor1', 'preOligo1', 'immOligo1', 'matOligo_non1', 'matOligo_mye1', 
'TypeI_opc1', 'TypeII_opc1', 'TypeI_Olig1', 'TypeII_Olig1', 'anno_clus_dreamorigBB_bioarx', 'anno_clus_dreamorigBB_v2', 'n_genes', 'leiden' var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'n_cells', 'highly_variable', 'means', 'dispersions', 'dispersions_norm' uns: 'log1p', 'hvg', 'pca', 'neighbors', 'diffmap_evals', 'umap', 'leiden', 'paga', 'leiden_sizes' obsm: 'X_pca', 'X_diffmap', 'X_umap' varm: 'PCs' obsp: 'distances', 'connectivities'

cp.tl.trajectory_tree(opc_oligACC, root_node="5", groupby="leiden", tree=None)
cp.tl.trajectory_tree(opc_oligDLPFC, root_node="6", groupby="leiden", tree=None)

------- pairwise alignment of trajectories

ACC_DLPFC = cp.tl.tree_alignment(opc_oligACC, opc_oligDLPFC, num_genes1=2000, num_genes2=2000) Calculating tree alignment 837 genes are used to calculate cost of tree alignment. ACC_DLPFC CapitalData(adata1=AnnData object with n_obs × n_vars = 15527 × 2000 obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'gex_barcode', 'atac_barcode', 'is_cell', 'excluded_reason', 'gex_raw_reads', 'gex_mapped_reads', 'gex_conf_intergenic_reads', 'gex_conf_exonic_reads', 'gex_conf_intronic_reads', 'gex_conf_exonic_unique_reads', 'gex_conf_exonic_antisense_reads', 'gex_conf_exonic_dup_reads', 'gex_exonic_umis', 'gex_conf_intronic_unique_reads', 'gex_conf_intronic_antisense_reads', 'gex_conf_intronic_dup_reads', 'gex_intronic_umis', 'gex_conf_txomic_unique_reads', 'gex_umis_count', 'gex_genes_count', 'atac_raw_reads', 'atac_unmapped_reads', 'atac_lowmapq', 'atac_dup_reads', 'atac_chimeric_reads', 'atac_mitochondrial_reads', 'atac_fragments', 'atac_TSS_fragments', 'atac_peak_region_fragments', 'atac_peak_region_cutsites', 'percent.mt', 'nCount_ATAC', 'nFeature_ATAC', 'sex', 'age', 'mitoRatio', 'percent.ribo', 'riboRatio', 'percent.hb', 'log10GenesPerUMI', 'nucleosome_signal', 'nucleosome_percentile', 'TSS.enrichment', 'TSS.percentile', 'pct_reads_in_peaks', 'blacklist_fraction', 'brain', 'brain.bank', 'nCount_SCT', 'nFeature_SCT', 'SCT_snn_res.0.2', 'seurat_clusters', 'ATAC_snn_res.0.2', 'SCT.weight', 'ATAC.weight', 'wsnn_res.0.2', 'RNA_snn_res.0.2', 'pANN_0.25_0.09_1648', 'DF.classifications_0.25_0.09_1648', 'pANN_0.25_0.09_1940', 'DF.classifications_0.25_0.09_1940', 'pANN_0.25_0.09_2285', 'DF.classifications_0.25_0.09_2285', 'pANN_0.25_0.09_1241', 'DF.classifications_0.25_0.09_1241', 'SCT_snn_res.0.4', 'ATAC_snn_res.0.4', 'wsnn_res.0.4', 'm1c_labels_subclass', 'age.group', 'seurat_clusters_origBB', 'm1c_labels_subclass_origBB', 'anno_clus', 'anno_clus_origBB', 'anno_clus_origBB2', 'SCT.dream2BB.weight', 'ATAC.dream2BB.weight', 'seurat_clusters_dream.origBB', 
'seurat_clusters_dreamorigBB', 'm1c_labels_subclass.dreamBB', 'anno_clus_dreamBB', 'anno_clus_dreamorigBB', 'OPC_progenitor1', 'OPC_precursor1', 'Oligo_precursor1', 'preOligo1', 'immOligo1', 'matOligo_non1', 'matOligo_mye1', 'TypeI_opc1', 'TypeII_opc1', 'TypeI_Olig1', 'TypeII_Olig1', 'anno_clus_dreamorigBB_bioarx', 'anno_clus_dreamorigBB_v2', 'n_genes', 'leiden' var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'n_cells', 'highly_variable', 'means', 'dispersions', 'dispersions_norm' uns: 'log1p', 'hvg', 'pca', 'neighbors', 'diffmap_evals', 'umap', 'leiden', 'paga', 'leiden_sizes', 'leiden_colors', 'anno_clus_dreamorigBB_v2_colors', 'cluster_centroid', 'capital' obsm: 'X_pca', 'X_diffmap', 'X_umap' varm: 'PCs' obsp: 'distances', 'connectivities', adata2=AnnData object with n_obs × n_vars = 14596 × 2000 obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'gex_barcode', 'atac_barcode', 'is_cell', 'excluded_reason', 'gex_raw_reads', 'gex_mapped_reads', 'gex_conf_intergenic_reads', 'gex_conf_exonic_reads', 'gex_conf_intronic_reads', 'gex_conf_exonic_unique_reads', 'gex_conf_exonic_antisense_reads', 'gex_conf_exonic_dup_reads', 'gex_exonic_umis', 'gex_conf_intronic_unique_reads', 'gex_conf_intronic_antisense_reads', 'gex_conf_intronic_dup_reads', 'gex_intronic_umis', 'gex_conf_txomic_unique_reads', 'gex_umis_count', 'gex_genes_count', 'atac_raw_reads', 'atac_unmapped_reads', 'atac_lowmapq', 'atac_dup_reads', 'atac_chimeric_reads', 'atac_mitochondrial_reads', 'atac_fragments', 'atac_TSS_fragments', 'atac_peak_region_fragments', 'atac_peak_region_cutsites', 'percent.mt', 'nCount_ATAC', 'nFeature_ATAC', 'sex', 'age', 'mitoRatio', 'percent.ribo', 'riboRatio', 'percent.hb', 'log10GenesPerUMI', 'nucleosome_signal', 'nucleosome_percentile', 'TSS.enrichment', 'TSS.percentile', 'pct_reads_in_peaks', 'blacklist_fraction', 'brain', 'brain.bank', 'nCount_SCT', 'nFeature_SCT', 'SCT_snn_res.0.2', 'seurat_clusters', 'ATAC_snn_res.0.2', 
'SCT.weight', 'ATAC.weight', 'wsnn_res.0.2', 'RNA_snn_res.0.2', 'pANN_0.25_0.09_1648', 'DF.classifications_0.25_0.09_1648', 'pANN_0.25_0.09_1940', 'DF.classifications_0.25_0.09_1940', 'pANN_0.25_0.09_2285', 'DF.classifications_0.25_0.09_2285', 'pANN_0.25_0.09_1241', 'DF.classifications_0.25_0.09_1241', 'SCT_snn_res.0.4', 'ATAC_snn_res.0.4', 'wsnn_res.0.4', 'm1c_labels_subclass', 'age.group', 'seurat_clusters_origBB', 'm1c_labels_subclass_origBB', 'anno_clus', 'anno_clus_origBB', 'anno_clus_origBB2', 'SCT.dream2BB.weight', 'ATAC.dream2BB.weight', 'seurat_clusters_dream.origBB', 'seurat_clusters_dreamorigBB', 'm1c_labels_subclass.dreamBB', 'anno_clus_dreamBB', 'anno_clus_dreamorigBB', 'OPC_progenitor1', 'OPC_precursor1', 'Oligo_precursor1', 'preOligo1', 'immOligo1', 'matOligo_non1', 'matOligo_mye1', 'TypeI_opc1', 'TypeII_opc1', 'TypeI_Olig1', 'TypeII_Olig1', 'anno_clus_dreamorigBB_bioarx', 'anno_clus_dreamorigBB_v2', 'n_genes', 'leiden' var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'n_cells', 'highly_variable', 'means', 'dispersions', 'dispersions_norm' uns: 'log1p', 'hvg', 'pca', 'neighbors', 'diffmap_evals', 'umap', 'leiden', 'paga', 'leiden_sizes', 'leiden_colors', 'anno_clus_dreamorigBB_v2_colors', 'cluster_centroid', 'capital' obsm: 'X_pca', 'X_diffmap', 'X_umap' varm: 'PCs' obsp: 'distances', 'connectivities', alignedtree=<networkx.classes.digraph.DiGraph object at 0x2af1aebda1c0>, alignmentcost=array([17.519946217]), genes_for_tree_align=array(['ADAM28', 'SYNPO2', 'NIBAN1', 'ITGB2', 'SPTA1', 'CEP295NL', 'MPP4', 'NEMF', 'PCDH11X', 'ARHGAP29', 'KRT222', 'ZEB1', 'KHDRBS2', 'EFCAB8', 'PDYN', 'PPIG', 'C1QL3', 'SLC1A2', 'PKHD1L1', 'HBZ', 'CDH22', 'CD69', 'KAZN', 'OLFM3', 'PRKCB', 'NELL1', 'STAT4', 'NINJ2', 'SYNGR3', 'NPIPB9', 'FSTL5', 'SERPINE1', 'HS6ST3', 'HRH3', 'PRKCH', 'F2RL3', 'UNC5D', 'NLGN4Y', 'DCLK3', 'GABRD', 'MNS1', 'GPR179', 'NDST3', 'EPS8L2', 'SPC24', 'ADGRL4', 'MYH13', 'CNTNAP2', 'VCAN', 'GRM5', 
'RASGEF1B', 'OR9Q1', 'GLRA3', 'FSTL4', 'AQP4', 'IL1RL2', 'SPIB', 'MESP2', 'KCTD16', 'SCN3B', 'KCNQ3', 'STAP1', 'GREM2', 'DAND5', 'LDB2', 'GRIK1', 'PDGFRL', 'KCNC2', 'PRPF4B', 'DPP10', 'ADA', 'NEFL', 'SDK1', 'RYR3', 'SV2B', 'ZNF560', 'DYSF', 'RBM25', 'PLCD3', 'MUC19', 'HLA-DRA', 'NRCAM', 'LYN', 'ARHGEF18', 'AFMID', 'OPHN1', 'NRGN', 'ATP1A4', 'CALR3', 'LINGO2', 'MAL', 'CCAR1', 'ARHGAP15', 'MLIP', 'CYGB', 'RDH12', 'RNMT', 'ANXA3', 'CHRNB3', 'ICE1', 'NXPH1', 'HNF1B', 'IL21', 'IWS1', 'CCHCR1', 'RALYL', 'PCDH15', 'FAM222B', 'TMEM221', 'SAMD3', 'RIMS2', 'DNM2', 'FOXP2', 'TRDN', 'HIPK4', 'EVA1C', 'NEUROD2', 'INCA1', 'COL11A1', 'GOLGA7B', 'ABLIM1', 'GABRG2', 'DYRK3', 'ADAM8', 'ROBO2', 'PTCHD4', 'TESMIN', 'SNTG2', 'BCAS1', 'SSTR2', 'RASGRF1', 'SLC32A1', 'ANGPTL5', 'TFR2', 'ISLR2', 'SLC22A8', 'LRMDA', 'PCSK2', 'KLHL6', 'SMOC1', 'KMT2B', 'ST6GALNAC5', 'ATG9B', 'CFAP299', 'FABP6', 'IQSEC3', 'DOC2A', 'PIK3R5', 'CYP4F3', 'SYTL3', 'TM4SF18', 'GPC6', 'RBBP6', 'TMEM275', 'LUC7L', 'ZFAND2A', 'MACC1', 'EFCAB9', 'SLC12A3', 'MAP1B', 'DNAJB1', 'SLC26A4', 'LRRN4CL', 'TDO2', 'SLC4A1', 'ACSBG1', 'DNASE1L3', 'FLI1', 'CLCN1', 'RGS4', 'KIAA1210', 'EPB41L1', 'TAFA2', 'NUTM1', 'TNFRSF14', 'VWA5B1', 'PYHIN1', 'SYT7', 'CHGA', 'RBM20', 'CA6', 'FKBP5', 'GRIP2', 'P2RY8', 'MDFIC', 'CST3', 'PNMA8B', 'M1AP', 'NMT1', 'SH3RF3', 'AGBL4', 'SNCB', 'DGKG', 'TRIM60', 'PHF20', 'PHYHIP', 'NPY', 'CHD5', 'SYT13', 'ZNF385B', 'RBFOX1', 'CYTIP', 'TMEM163', 'SOSTDC1', 'SLC28A1', 'NFAM1', 'SGPP2', 'OR2M3', 'KIAA0040', 'PVALB', 'MAML2', 'TMEM132B', 'CRYBG2', 'GABRE', 'UBE2QL1', 'ITIH1', 'GABPB2', 'CFAP221', 'HTR5A', 'CCDC27', 'TC2N', 'CDH18', 'RAB17', 'KCNMB3', 'DYNC1H1', 'MMP16', 'SPEF1', 'TSGA10', 'GSG1L2', 'ASPG', 'CNTRL', 'GNG7', 'IL17REL', 'HLA-DRB1', 'SRL', 'PDZD2', 'CCK', 'FYB1', 'BST2', 'FGL1', 'MYL5', 'GAD2', 'OR2F1', 'TMEM52B', 'SLC27A1', 'PRR16', 'XKR4', 'RHOXF1', 'IL10RA', 'OR1I1', 'ZNF418', 'ADH1B', 'DNAH3', 'ZNF804B', 'SV2C', 'MEF2C', 'SRGN', 'SEC14L3', 'NOP14', 'FGF14', 'TNNT1', 'TLCD2', 'RHEX', 
'IL1RAPL1', 'CDH9', 'HBB', 'CDH12', 'TEX51', 'PARP8', 'CHRNB4', 'HSP90AB1', 'NKAP', 'SORCS2', 'TMEM108', 'TPRX1', 'MCTP2', 'ICAM1', 'EIF5B', 'LEF1', 'TAS1R1', 'KLRF1', 'CELF2', 'IKZF3', 'ASF1B', 'PLEKHG1', 'PTPN3', 'PTPN22', 'LILRB4', 'REST', 'NDUFA4L2', 'LAMA2', 'OR5AS1', 'RGS6', 'ZNF385D', 'NPTX2', 'FSCN2', 'TCOF1', 'RHEBL1', 'NCF2', 'FBXL19', 'WFDC8', 'MMP11', 'CATSPERB', 'LMX1B', 'FTSJ3', 'CHST11', 'PRKG1', 'NRG3', 'LAMC3', 'KIF21A', 'DOCK8', 'SLC22A9', 'PTPRZ1', 'CNTNAP5', 'CPNE6', 'NKAIN3', 'SHROOM3', 'DSCAM', 'PTHLH', 'PRPF38B', 'KCND2', 'DLGAP1', 'KNOP1', 'CD28', 'AQP1', 'KCNIP4', 'CYSLTR2', 'PTPRT', 'PCDH11Y', 'C1orf146', 'TBX18', 'CR1', 'CACNA2D3', 'KCNH4', 'ZFP36', 'OPRD1', 'SAMD5', 'HPSE2', 'GPR17', 'NES', 'RBMS3', 'ZNF155', 'ICAM5', 'CCDC141', 'KCNV1', 'VIT', 'GALNTL6', 'HBA1', 'FRY', 'ZBTB7A', 'PTPRR', 'IL12RB1', 'HSPA1A', 'DYNLRB2', 'CARMIL1', 'PRKCQ', 'PCDH9', 'IQCM', 'GRM7', 'RRBP1', 'MYO18A', 'CT55', 'DLGAP2', 'CLU', 'DPPA2', 'SERPINH1', 'SFMBT2', 'CD247', 'RHBDL2', 'CHRM3', 'ARHGAP26', 'SYT1', 'OPRL1', 'GPR149', 'PDE3B', 'NPM2', 'IFNLR1', 'TCEAL2', 'SLC5A11', 'PDGFRA', 'PCLO', 'ADGRL2', 'CRH', 'CCBE1', 'FCRL1', 'ANKRD18B', 'CEP162', 'GOLGB1', 'SNAP25', 'PNN', 'AK7', 'P2RX1', 'DNAI2', 'SLC47A1', 'FRMPD4', 'CRISP2', 'SLC17A7', 'EPHA6', 'NRXN1', 'OVCH2', 'CARD11', 'GFAP', 'ZBP1', 'DHRSX', 'IGFN1', 'ADGRV1', 'OR3A2', 'KCNH1', 'UPF2', 'SORBS1', 'RP1', 'ITPR2', 'SHISA9', 'KIF23', 'ARHGAP27', 'DGKB', 'SOHLH1', 'SLK', 'NOX3', 'GPR158', 'TNFRSF11B', 'FCHO1', 'RELN', 'CNDP1', 'YWHAH', 'IL1RAPL2', 'FGF12', 'CATSPERD', 'CHODL', 'MEGF11', 'CYP4F12', 'HS3ST2', 'CCDC168', 'PTPRC', 'MINK1', 'HBA2', 'SLC38A11', 'CD83', 'MAML3', 'MEI1', 'TULP2', 'RBPMS2', 'PRF1', 'ARHGAP8', 'PLCXD3', 'VIM', 'SKAP1', 'CTCF', 'FMN1', 'CACNA1A', 'TAFA1', 'C2orf83', 'CD6', 'MYPN', 'PAX5', 'THSD7B', 'ZC3H13', 'KCNIP1', 'CTXN2', 'C11orf87', 'ATRNL1', 'CYB5D1', 'ITIH5', 'SLC4A4', 'HSPH1', 'PCDH8', 'HSPA6', 'CFB', 'IQGAP2', 'GPC5', 'CCDC144A', 'CCL4', 'ST8SIA6', 'CDK15', 'MLXIPL', 
'NYAP2', 'SLC35F2', 'CENPP', 'GRAP2', 'BICDL2', 'OCA2', 'CDK5R2', 'PRSS12', 'SIAH3', 'EFNB3', 'C1QTNF2', 'SULT4A1', 'MAL2', 'SLCO1B7', 'RYR2', 'WFDC3', 'C1orf116', 'SNTG1', 'GFRA2', 'EFNA5', 'VRTN', 'CRTAM', 'NKG7', 'BANK1', 'LRFN5', 'ADGRD2', 'GJA1', 'CBFA2T3', 'CHI3L1', 'GRM4', 'PTH', 'CAMK2A', 'GRM1', 'NDNF', 'NPSR1', 'EMILIN1', 'CACNG3', 'CLEC1A', 'CFAP65', 'GNLY', 'B3GNT6', 'SMKR1', 'SERPINA3', 'BAZ1A', 'CHRNA1', 'C1orf115', 'PLCB1', 'PCF11', 'AOAH', 'BRINP3', 'PDZRN4', 'C8orf34', 'UPF3B', 'GGT5', 'TXK', 'CHRNA2', 'TNFAIP8', 'PRXL2B', 'MAP3K6', 'SRRM1', 'EID3', 'OTOF', 'SLC4A5', 'SLC2A13', 'DDX46', 'SHISA8', 'ARGLU1', 'DNAH10', 'SEZ6L', 'PPP1R13L', 'SAMD15', 'MYL9', 'EIF5AL1', 'SLC6A7', 'APOM', 'STXBP2', 'MPZL3', 'GIPC1', 'TPST1', 'RIMS3', 'GRIN2A', 'APOLD1', 'KCNE4', 'SYT4', 'EMB', 'CALN1', 'MTUS2', 'SYT10', 'SPECC1', 'CNR1', 'LTBP1', 'EPB42', 'TP53I11', 'KMT2A', 'HRH2', 'CNTN4', 'NSG1', 'RASA3', 'ACTG2', 'SULT1A4', 'SLC8A1', 'AKAP12', 'SLFN12L', 'TMEM132D', 'TERB1', 'SGCG', 'OFD1', 'EPHB6', 'RIPOR2', 'A4GALT', 'ARL11', 'RBFOX3', 'TINAG', 'SEMA3E', 'TNNT2', 'KCNK9', 'GET4', 'SORCS1', 'ARAP3', 'COL4A1', 'ZC3H12B', 'IL12RB2', 'RIF1', 'RASD2', 'KLF10', 'GPR26', 'TOB2', 'CDH1', 'NSUN6', 'PLA2G4C', 'DCLK1', 'CHSY3', 'SLIT2', 'IKZF1', 'NPTX1', 'FCMR', 'PLA2G4F', 'SLC5A1', 'BAZ1B', 'ANKRD30B', 'SCML4', 'HDDC3', 'SNCG', 'OGFOD3', 'CCDC33', 'BDP1', 'MDN1', 'PROCA1', 'JHY', 'AXDND1', 'NKTR', 'GRIP1', 'RASSF6', 'DAPK2', 'RGS1', 'IL7R', 'PRR14L', 'CACNA1B', 'TRIP11', 'MPHOSPH8', 'TANGO6', 'TOX', 'EZH1', 'LENG8', 'OPCML', 'SYNPR', 'CDK12', 'CCN2', 'DTHD1', 'SCG2', 'TMEM74B', 'PXT1', 'BOD1L1', 'SLC6A4', 'RUNX1', 'ADH4', 'CFAP99', 'ROR1', 'PRICKLE1', 'KCNQ5', 'IL1R2', 'CFAP58', 'PDE6A', 'SVEP1', 'LUC7L3', 'CALY', 'GABRG3', 'TNR', 'ZMAT4', 'CCDC83', 'CACNB2', 'PTPRG', 'GON4L', 'BSPH1', 'MYO9B', 'ACSBG2', 'CALHM2', 'BAIAP2L1', 'TBR1', 'AFF3', 'CHRNB2', 'STXBP5L', 'POTEG', 'ZNF385C', 'CERS4', 'TMEM266', 'NWD1', 'HCN1', 'SST', 'VWA5B2', 'HLA-DPA1', 'CD74', 'PITPNC1', 'CPAMD8', 
'SEC14L4', 'C17orf98', 'MEIOB', 'ANKRD36C', 'APBB1IP', 'SAMSN1', 'HLA-DQB1', 'KYNU', 'ZNF804A', 'NYAP1', 'CAMKV', 'DAB1', 'GALNTL5', 'MMD2', 'COL26A1', 'FCN1', 'CDS1', 'SPEN', 'PARD3B', 'EPB41', 'ARHGAP24', 'CXCL14', 'ADAMTSL3', 'VSNL1', 'L1CAM', 'C1orf216', 'FNDC9', 'MS4A1', 'JSRP1', 'FAM20A', 'ALK', 'UST', 'C1QB', 'RNF213', 'CA10', 'CLIP2', 'TENM3', 'PDGFD', 'CEP290', 'CASP8AP2', 'QRICH2', 'TTF2', 'FPR1', 'GREB1L', 'ARGFX', 'XYLT1', 'RNF144B', 'CALB2', 'RGL3', 'GABRB1', 'C1QTNF4', 'TCERG1L', 'ACIN1', 'LUZP2', 'CRX', 'DCT', 'CNN1', 'KLHL4', 'ANKRD18A', 'ESF1', 'NRG1', 'FCRL5', 'SPARCL1', 'THEMIS', 'SGCZ', 'RHAG', 'MYO3A', 'KCNH7', 'MPND', 'RBP5', 'CCSER1', 'OTOP1', 'KIF1A', 'CPLX4', 'EPHB1', 'PTPRN', 'SYT5', 'TENT5B', 'MYO7B', 'PRRC2C', 'SLIT3', 'EBF1', 'STMN2', 'KCNMB2', 'SEC14L5', 'GJA5', 'CEMIP2', 'SOX6', 'TLR8', 'RASAL1', 'TENM2', 'GALNT17', 'USP8', 'FOS', 'LRRTM4', 'CNTN5', 'RXFP1', 'MECOM', 'GABRA4', 'CNGB1', 'ZER1', 'TMIGD2', 'SEMA6D', 'AKR1D1', 'SCARA5', 'TMEM38A', 'UACA', 'STPG2', 'SLC6A17', 'UTRN', 'SLC35F3', 'GABRA1', 'CDH8', 'DISP2', 'SRRM2', 'MGAT4C', 'CBLN2', 'TNXB', 'MAP1A', 'SDS', 'BMPER', 'ZAN', 'AMPD3', 'PPM1N', 'GRIN1', 'TNS3', 'GTF2IRD1', 'GABBR2', 'GRIN3A', 'KIAA1217', 'DEPDC1', 'CXorf58', 'GCC2', 'TAC1', 'HBM', 'EDA', 'COL24A1', 'ITGA4', 'HSP90AA1', 'HTR2A', 'DPH6', 'GPR143', 'BRSK1', 'SLC26A3', 'FRMD4A', 'CCDC110', 'TPR', 'SAMD11'], dtype=object), alignmentdict={'alignment000': {'data1': ['#', '5', '#'], 'data2': ['6', '7', '13']}, 'alignment001': {'data1': ['#', '5', '18', '8', '10'], 'data2': ['6', '7', '#', '#', '4']}, 'alignment002': {'data1': ['#', '5', '18', '8', '14', '#'], 'data2': ['6', '7', '#', '#', '12', '22']}, 'alignment003': {'data1': ['#', '5', '15', '#'], 'data2': ['6', '7', '5', '21']}, 'alignment004': {'data1': ['#', '5', '15', '16', '11'], 'data2': ['6', '7', '5', '#', '19']}, 'alignment005': {'data1': ['#', '5', '15', '16', '9', '6', '#', '0'], 'data2': ['6', '7', '5', '#', '20', '14', '3', '15']}, 'alignment006': 
{'data1': ['#', '5', '15', '16', '9', '6', '#', '4', '#'], 'data2': ['6', '7', '5', '#', '20', '14', '3', '1', '16']}, 'alignment007': {'data1': ['#', '5', '15', '16', '9', '6', '#', '4', '1', '#'], 'data2': ['6', '7', '5', '#', '20', '14', '3', '1', '0', '18']}, 'alignment008': {'data1': ['#', '5', '15', '16', '9', '6', '#', '4', '1', '3'], 'data2': ['6', '7', '5', '#', '20', '14', '3', '1', '0', '8']}, 'alignment009': {'data1': ['#', '5', '15', '16', '9', '6', '#', '4', '1', '2', '17', '19'], 'data2': ['6', '7', '5', '#', '20', '14', '3', '1', '0', '2', '17', '#']}, 'alignment010': {'data1': ['#', '5', '15', '16', '9', '6', '7'], 'data2': ['6', '7', '5', '#', '20', '14', '11']}, 'alignment011': {'data1': ['#', '5', '15', '13'], 'data2': ['6', '7', '5', '9']}, 'alignment012': {'data1': ['#', '5', '15', '12'], 'data2': ['6', '7', '5', '10']}}, alignmentlist=[('alignment000', ['#', '5', '#'], ['6', '7', '13']), ('alignment001', ['#', '5', '18', '8', '10'], ['6', '7', '#', '#', '4']), ('alignment002', ['#', '5', '18', '8', '14', '#'], ['6', '7', '#', '#', '12', '22']), ('alignment003', ['#', '5', '15', '#'], ['6', '7', '5', '21']), ('alignment004', ['#', '5', '15', '16', '11'], ['6', '7', '5', '#', '19']), ('alignment005', ['#', '5', '15', '16', '9', '6', '#', '0'], ['6', '7', '5', '#', '20', '14', '3', '15']), ('alignment006', ['#', '5', '15', '16', '9', '6', '#', '4', '#'], ['6', '7', '5', '#', '20', '14', '3', '1', '16']), ('alignment007', ['#', '5', '15', '16', '9', '6', '#', '4', '1', '#'], ['6', '7', '5', '#', '20', '14', '3', '1', '0', '18']), ('alignment008', ['#', '5', '15', '16', '9', '6', '#', '4', '1', '3'], ['6', '7', '5', '#', '20', '14', '3', '1', '0', '8']), ('alignment009', ['#', '5', '15', '16', '9', '6', '#', '4', '1', '2', '17', '19'], ['6', '7', '5', '#', '20', '14', '3', '1', '0', '2', '17', '#']), ('alignment010', ['#', '5', '15', '16', '9', '6', '7'], ['6', '7', '5', '#', '20', '14', '11']), ('alignment011', ['#', '5', '15', '13'], ['6', '7', 
'5', '9']), ('alignment012', ['#', '5', '15', '12'], ['6', '7', '5', '10'])], similarity_score={})

ACC_DLPFC.alignmentlist [('alignment000', ['#', '5', '#'], ['6', '7', '13']), ('alignment001', ['#', '5', '18', '8', '10'], ['6', '7', '#', '#', '4']), ('alignment002', ['#', '5', '18', '8', '14', '#'], ['6', '7', '#', '#', '12', '22']), ('alignment003', ['#', '5', '15', '#'], ['6', '7', '5', '21']), ('alignment004', ['#', '5', '15', '16', '11'], ['6', '7', '5', '#', '19']), ('alignment005', ['#', '5', '15', '16', '9', '6', '#', '0'], ['6', '7', '5', '#', '20', '14', '3', '15']), ('alignment006', ['#', '5', '15', '16', '9', '6', '#', '4', '#'], ['6', '7', '5', '#', '20', '14', '3', '1', '16']), ('alignment007', ['#', '5', '15', '16', '9', '6', '#', '4', '1', '#'], ['6', '7', '5', '#', '20', '14', '3', '1', '0', '18']), ('alignment008', ['#', '5', '15', '16', '9', '6', '#', '4', '1', '3'], ['6', '7', '5', '#', '20', '14', '3', '1', '0', '8']), ('alignment009', ['#', '5', '15', '16', '9', '6', '#', '4', '1', '2', '17', '19'], ['6', '7', '5', '#', '20', '14', '3', '1', '0', '2', '17', '#']), ('alignment010', ['#', '5', '15', '16', '9', '6', '7'], ['6', '7', '5', '#', '20', '14', '11']), ('alignment011', ['#', '5', '15', '13'], ['6', '7', '5', '9']), ('alignment012', ['#', '5', '15', '12'], ['6', '7', '5', '10'])]
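Each entry of the alignment list above pairs cluster labels from the two datasets position by position. The following is a minimal sketch of reading one entry, assuming that '#' marks a gap (a position with no counterpart in the other tree). `matched_clusters` is a hypothetical helper written for illustration and is not part of CAPITAL.

```python
# Hypothetical helper (not part of CAPITAL): list the cluster pairs that are
# matched in one alignment, assuming '#' marks a gap with no counterpart.
GAP = "#"


def matched_clusters(data1, data2):
    """Return (cluster1, cluster2) pairs where neither side is a gap."""
    if len(data1) != len(data2):
        raise ValueError("both sides of an alignment must have equal length")
    return [(a, b) for a, b in zip(data1, data2) if a != GAP and b != GAP]


# 'alignment001' as printed in the alignment list above
alignment001 = (
    "alignment001",
    ["#", "5", "18", "8", "10"],
    ["6", "7", "#", "#", "4"],
)
name, clusters_acc, clusters_dlpfc = alignment001
print(name, matched_clusters(clusters_acc, clusters_dlpfc))
```

Under that assumption, only ACC cluster 5 with DLPFC cluster 7 and ACC cluster 10 with DLPFC cluster 4 are matched in this alignment; the remaining positions are gaps on one side.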

cp.tl.dpt(ACC_DLPFC)
cp.tl.dtw(ACC_DLPFC, gene=ACC_DLPFC.genes_for_tree_align, multi_genes=True)
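For readers unfamiliar with the warping path that dynamic time warping produces, here is a minimal textbook sketch on two short numeric sequences. It is an illustration only: `dtw_path` is a hypothetical function, and nothing here is claimed to match how cp.tl.dtw is implemented.

```python
# Textbook dynamic time warping on two 1-D sequences; an illustration only,
# not CAPITAL's implementation.
def dtw_path(seq1, seq2):
    """Return (total_cost, path), where path is a list of (i, j) index pairs."""
    n, m = len(seq1), len(seq2)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning seq1[:i] with seq2[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(seq1[i - 1] - seq2[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


total, path = dtw_path([1, 2, 3], [1, 2, 2, 3])
print(total, path)
```

Every `i` in the path is an index into the first sequence and every `j` an index into the second, so a path is only valid for sequences at least as long as the ones it was computed on.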

# alignment: select the alignments of interest, i.e. the aligned clusters you want to compare

cp.pl.dtw(ACC_DLPFC, gene=["multi_genes"], alignment=["alignment001", "alignment002"],
          data1_name="ACC", data2_name="DLPFC")
main_markers = [
    ["alignment001", "ACTL10"],
]

for alignment, gene in main_markers:
    cp.pl.gene_expression_trend(
        ACC_DLPFC, gene=gene, alignment=alignment, fontsize=16, ticksize=16,
        multi_genes=True, switch_psedotime=True,
        data1_name="ACC", data2_name="DLPFC", polyfit_dimension=3
    )

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Input In [128], in <cell line: 1>()
      1 for alignment, gene in main_markers:
----> 2     cp.pl.gene_expression_trend(
      3         ACC_DLPFC, gene=gene, alignment=alignment, fontsize=16, ticksize=16,
      4         multi_genes=True, switch_psedotime=True,
      5         data1_name="ACC", data2_name="DLPFC", polyfit_dimension=3
      6     )

File /sc/arion/projects/CommonMind/tereza/conda/envs/capital/lib/python3.9/site-packages/capital/pl/pl.py:335, in gene_expression_trend(aligned_data, gene, alignment, outliers, polyfit_dimension, switch_psedotime, multi_genes, data1_name, data2_name, data1_color, data2_color, data1_line_color, data2_line_color, ncols, widthspace, heightspace, fontsize, legend_fontsize, ticksize, dpi, show, save)
    331 else:
    332     pseudotime = data1[ordered_cells1, :].obs["{}_dptpseudotime".format(
    333         alignment)][[i for i, _ in path]].values
--> 335 data1_expression_level = expression1[[i for i, _ in path]]
    336 data2_expression_level = expression2[[j for _, j in path]]
    338 array = np.array(
    339     [pseudotime, data1_expression_level, data2_expression_level])

File /sc/arion/projects/CommonMind/tereza/conda/envs/capital/lib/python3.9/site-packages/scipy/sparse/_index.py:47, in IndexMixin.__getitem__(self, key)
     46 def __getitem__(self, key):
---> 47     row, col = self._validate_indices(key)
     49     # Dispatch to specialized methods.
     50     if isinstance(row, INT_TYPES):

File /sc/arion/projects/CommonMind/tereza/conda/envs/capital/lib/python3.9/site-packages/scipy/sparse/_index.py:159, in IndexMixin._validate_indices(self, key)
    157     row += M
    158 elif not isinstance(row, slice):
--> 159     row = self._asindices(row, M)
    161 if isintlike(col):
    162     col = int(col)

File /sc/arion/projects/CommonMind/tereza/conda/envs/capital/lib/python3.9/site-packages/scipy/sparse/_index.py:191, in IndexMixin._asindices(self, idx, length)
    189 max_indx = x.max()
    190 if max_indx >= length:
--> 191     raise IndexError('index (%d) out of range' % max_indx)
    193 min_indx = x.min()
    194 if min_indx < 0:

IndexError: index (2663) out of range
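The last frame of the traceback shows the condition that triggers the error: a row index that is greater than or equal to the number of rows is rejected. The sketch below is a simplified stand-in for that check, not scipy's actual code. `check_row_indices` and the row count of 2000 are assumptions made for illustration; the real number of rows in the matrix being indexed is not shown in the issue.

```python
# Simplified stand-in for the bounds check in the last traceback frame;
# an illustration, not scipy's actual implementation.
def check_row_indices(indices, n_rows):
    """Raise IndexError if any index is >= n_rows, else return the indices."""
    indices = list(indices)
    if indices and max(indices) >= n_rows:
        raise IndexError("index (%d) out of range" % max(indices))
    return indices


# Hypothetical sizes: a matrix with 2000 rows indexed by a path that
# refers to row 2663. The real row count in the issue is not known.
try:
    check_row_indices([0, 5, 2663], n_rows=2000)
except IndexError as err:
    message = str(err)
    print(message)
```

In other words, the warping path contains at least one cell index (2663) that does not exist in the expression matrix it is applied to.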

--------- calculate genes with high expression profile similarity

cp.tl.genes_similarity_score(ACC_DLPFC, alignment="alignment001", min_disp=0.5)
ACC_DLPFC.similarity_score["alignment001"]
array(['ACTL10', 'MUC6', 'ASGR2', ..., 'GPC5', 'NLGN4Y', 'SGCZ'], dtype=object)

session_info.show()

anndata 0.8.0
capital 1.0.14
matplotlib 3.5.2
networkx 2.8.2
numpy 1.23.5
pandas 1.3.5
scanpy 1.9.1
session_info 1.0.0

IPython 8.4.0
jupyter_client 7.2.2
jupyter_core 4.10.0

Python 3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 15:55:03) [GCC 10.4.0]
Linux-3.10.0-1160.el7.x86_64-x86_64-with-glibc2.33

Session information updated at 2023-02-17 10:12

Could you please point me to what is going wrong?

Thank you!

Best regards, Tereza

Rsugihara01 commented 1 year ago

Hi @TerezaClarence, thanks for contacting us again! I checked the code, but I am not sure what is happening, because the code should have raised a different error earlier if dynamic time warping had failed in the first place.

Could you try adding '.copy()' at the end of the lines that split the AnnData object? AnnData sometimes behaves unexpectedly when an object is a view of another AnnData object rather than an independent one. '.copy()' creates a new, independent AnnData object, and that might resolve the problem.

opc_oligACC = adata1[adata1.obs['brain'].isin(['ACC'])].copy()
opc_oligDLPFC = adata1[adata1.obs['brain'].isin(['DLPFC'])].copy()

If that does not work, could you try plotting a gene that is in ACC_DLPFC.genes_for_tree_align, such as 'ADAM28'?
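One way to check this before plotting is to compare the requested genes with the gene set used for the alignment. The sketch below uses a hypothetical helper, `split_by_membership`, and a five-gene sample copied from the start of the genes_for_tree_align output earlier in this thread. The result applies to that sample only; whether 'ACTL10' is in the full array has to be checked against ACC_DLPFC.genes_for_tree_align itself.

```python
# Hypothetical helper (not part of CAPITAL): split requested genes by whether
# they appear in a reference gene set.
def split_by_membership(requested, reference):
    reference_set = set(reference)
    present = [g for g in requested if g in reference_set]
    missing = [g for g in requested if g not in reference_set]
    return present, missing


# The first five names from the genes_for_tree_align output shown earlier;
# the first run reported 837 genes in total.
reference_sample = ["ADAM28", "SYNPO2", "NIBAN1", "ITGB2", "SPTA1"]
present, missing = split_by_membership(["ADAM28", "ACTL10"], reference_sample)
print(present, missing)
```

With the real data, `reference_sample` would be replaced by `list(ACC_DLPFC.genes_for_tree_align)`.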

Could you also run the code below and send us the result? It would be a great help in working out what is going on.

multi_genes = ACC_DLPFC.alignmentdict["alignment001"]["multi_genes"]
print(multi_genes["path"])
print(f"length of path : {len(multi_genes['path'][0])}")
print(f"shape of data1 : {opc_oligACC.raw.to_adata().X.shape}")
print(f"shape of data2 : {opc_oligDLPFC.raw.to_adata().X.shape}")
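The point of these checks is to see whether the warping path refers to more cells than the expression matrices contain. A minimal sketch of that comparison follows, using a hypothetical helper `path_fits` and toy numbers that are not taken from the issue. It assumes the path is a list of (i, j) index pairs, as the list comprehensions in the traceback suggest.

```python
# Hypothetical diagnostic (not part of CAPITAL): compare the largest index on
# each side of a warping path with the number of rows it will index into.
def path_fits(path, n_rows1, n_rows2):
    max_i = max(i for i, _ in path)
    max_j = max(j for _, j in path)
    return {
        "max_index_data1": max_i,
        "max_index_data2": max_j,
        "fits_data1": max_i < n_rows1,
        "fits_data2": max_j < n_rows2,
    }


# Toy numbers for illustration only; not taken from the issue.
toy_path = [(0, 0), (1, 1), (2, 1), (3, 2)]
report = path_fits(toy_path, n_rows1=3, n_rows2=3)
print(report)
```

A `False` on either side means the path was computed on more cells than are present in the matrix being indexed, which is the situation the IndexError describes.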

Best regards, Rsugihara01

TerezaClarence commented 1 year ago

Dear @Rsugihara01,

I repeated the analysis with the '.copy()' change you suggested, but it did not help. Please see the output below, including the code you asked me to run:

--- get alignment list of aligned clusters

ACC_DLPFC.alignmentlist [('alignment000', ['5', '#', '#'], ['6', '7', '5']), ('alignment001', ['5', '#', '#', '#'], ['6', '7', '4', '11']), ('alignment002', ['5', '#', '#', '#'], ['6', '7', '4', '22']), ('alignment003', ['5', '#', '#', '#'], ['6', '7', '4', '20']), ('alignment004', ['5', '#', '#', '7'], ['6', '7', '4', '10']), ('alignment005', ['5', '#', '#', '2', '1', '4'], ['6', '7', '4', '21', '14', '12']), ('alignment006', ['5', '#', '#', '2', '1', '3', '14', '#'], ['6', '7', '4', '21', '14', '3', '1', '19']), ('alignment007', ['5', '#', '#', '2', '1', '3', '14', '15', '12'], ['6', '7', '4', '21', '14', '3', '1', '2', '9']), ('alignment008', ['5', '#', '#', '2', '1', '3', '14', '15', '6', '17', '8', '13'], ['6', '7', '4', '21', '14', '3', '1', '2', '0', '18', '#', '#']), ('alignment009', ['5', '#', '#', '2', '1', '3', '14', '15', '6', '17', '8', '9'], ['6', '7', '4', '21', '14', '3', '1', '2', '0', '18', '#', '#']), ('alignment010', ['5', '#', '#', '2', '1', '3', '14', '15', '11'], ['6', '7', '4', '21', '14', '3', '1', '2', '8']), ('alignment011', ['5', '#', '#', '2', '1', '3', '14', '10'], ['6', '7', '4', '21', '14', '3', '1', '17']), ('alignment012', ['5', '#', '#', '2', '1', '3', '16', '18'], ['6', '7', '4', '21', '14', '3', '#', '15']), ('alignment013', ['5', '#', '#', '#', '0'], ['6', '7', '16', '13', '23'])]

ACC_DLPFC.genes_for_tree_align array(['JHY', 'CLEC1A', 'NOX3', 'OR2F1', 'CYP4F3', 'PROCA1', 'NLGN4Y', 'CDK15', 'MYO3A', 'KCNH1', 'PCF11', 'SPTA1', 'CD28', 'CYB5D1', 'GPR26', 'KMT2A', 'UST', 'TPR', 'SEZ6L', 'SEC14L4', 'KCTD16', 'GRAP2', 'PLCB1', 'VCAN', 'EBF1', 'CHSY3', 'RASA3', 'SLC38A11', 'ASPG', 'C8orf34', 'CACNA1A', 'KAZN', 'FGF14', 'RIMS3', 'RGS1', 'MPND', 'CT55', 'CRH', 'TCOF1', 'CCK', 'VIM', 'PXT1', 'FGL1', 'GRM7', 'HBA2', 'RRBP1', 'NIBAN1', 'SLC32A1', 'PRXL2B', 'ANGPTL5', 'CDH22', 'KCND2', 'LRMDA', 'CCBE1', 'ZAN', 'TMEM74B', 'COL24A1', 'FYB1', 'REST', 'KIF1A', 'FRMPD4', 'ICAM5', 'PHF20', 'GABRD', 'DNAI2', 'RBMS3', 'SORBS1', 'SCML4', 'BAZ1A', 'PLEKHG1', 'SLCO1B7', 'MEF2C', 'CXCL14', 'CFAP58', 'RASAL1', 'IQGAP2', 'ROBO2', 'KCNE4', 'SMKR1', 'SST', 'PARP8', 'MYH13', 'IL7R', 'PITPNC1', 'NEUROD2', 'NDNF', 'RHEBL1', 'SLC26A3', 'CBFA2T3', 'BAZ1B', 'ZNF385D', 'GAD2', 'KYNU', 'RASGEF1B', 'GRM5', 'MYO9B', 'CD247', 'CHST11', 'SLC47A1', 'PTH', 'CHRNA2', 'CELF2', 'FNDC9', 'PRPF4B', 'COL26A1', 'LAMA2', 'PPIG', 'SDK1', 'VRTN', 'TENM2', 'SV2B', 'CDK12', 'THSD7B', 'SPC24', 'SULT4A1', 'CHODL', 'CENPP', 'PHYHIP', 'CYP4F12', 'FOXP2', 'CNDP1', 'ALK', 'CAMK2A', 'ZNF155', 'AOAH', 'BST2', 'MS4A1', 'TMEM221', 'GRIN1', 'C1QTNF2', 'ARHGAP24', 'FCRL5', 'OR5AS1', 'SLC6A17', 'DCT', 'SLC27A1', 'EFCAB8', 'TMEM52B', 'EPB41', 'ESF1', 'STXBP2', 'DAB1', 'TBR1', 'UNC5D', 'ARGLU1', 'EFNA5', 'PDE3B', 'FRY', 'CD74', 'ZBTB7A', 'GPR149', 'GPR17', 'TNFRSF11B', 'ST8SIA6', 'ZEB1', 'CYGB', 'CCN2', 'PTPN22', 'ARHGAP29', 'CRISP2', 'GPR143', 'USP8', 'CXorf58', 'MYO7B', 'BCAS1', 'TAFA1', 'KMT2B', 'XKR4', 'SNTG1', 'FAM20A', 'RALYL', 'CRYBG2', 'SLC35F2', 'GNLY', 'PCSK2', 'AGBL4', 'TNFRSF14', 'UACA', 'PTPRN', 'ADGRL2', 'TDO2', 'SYT13', 'KIAA0040', 'ITIH1', 'COL11A1', 'TENM3', 'TERB1', 'ATRNL1', 'GJA5', 'KCNMB2', 'CHGA', 'RBBP6', 'EFCAB9', 'SOSTDC1', 'TMEM132D', 'TLCD2', 'MPZL3', 'SH3RF3', 'SYNPR', 'DCLK3', 'TAS1R1', 'SYT5', 'NPIPB9', 'GON4L', 'SGPP2', 'RAB17', 'NFAM1', 'KLHL6', 'EPHA6', 'LAMC3', 'KLF10', 
'NRG1', 'ABLIM1', 'TCEAL2', 'TNR', 'OPRD1', 'KIAA1210', 'WFDC8', 'CNR1', 'ATP1A4', 'SNCB', 'HSP90AA1', 'KCNIP1', 'OR2M3', 'FAM222B', 'STPG2', 'MDFIC', 'TAFA2', 'HSPH1', 'DDX46', 'NPSR1', 'SRL', 'SHISA8', 'ST6GALNAC5', 'AKR1D1', 'TCERG1L', 'MAL', 'NKAIN3', 'PVALB', 'PRRC2C', 'SPARCL1', 'PLA2G4C', 'GALNTL6', 'BDP1', 'ZNF804A', 'GABRB1', 'OCA2', 'UTRN', 'RYR3', 'GRM4', 'DNAJB1', 'NPY', 'MMP16', 'C11orf87', 'PRSS12', 'TPST1', 'HRH3', 'EPHB1', 'PTCHD4', 'POTEG', 'CYTIP', 'EMB', 'NUTM1', 'SPIB', 'GTF2IRD1', 'MYO18A', 'GPC5', 'CALR3', 'CCDC33', 'SYT1', 'DCLK1', 'CCDC27', 'CEMIP2', 'P2RY8', 'IFNLR1', 'RBM25', 'PTPRC', 'IL10RA', 'SPEN', 'GREB1L', 'NPTX2', 'CCSER1', 'PDYN', 'CD6', 'DNAH3', 'CDH9', 'PTPRG', 'MECOM', 'ZNF418', 'CD69', 'CCDC144A', 'CFAP65', 'ADA', 'PRICKLE1', 'AFMID', 'PAX5', 'PRKCH', 'RGS4', 'PDGFRA', 'CLIP2', 'HPSE2', 'SPECC1', 'CNTN4', 'PDZRN4', 'TBX18', 'SLC28A1', 'SLC6A4', 'CEP295NL', 'CATSPERB', 'SOX6', 'CDS1', 'CDH18', 'FCRL1', 'KCNMB3', 'HLA-DRB1', 'TC2N', 'PCDH15', 'CCHCR1', 'UPF3B', 'SERPINH1', 'RHOXF1', 'LENG8', 'ARAP3', 'NELL1', 'GFRA2', 'SIAH3', 'C2orf83', 'IL12RB1', 'MCTP2', 'HDDC3', 'CPAMD8', 'AFF3', 'MTUS2', 'SYT10', 'ACSBG1', 'GABPB2', 'GFAP', 'ACTG2', 'OFD1', 'SLIT2', 'ZFP36', 'FPR1', 'CTCF', 'EVA1C', 'MAP3K6', 'ARGFX', 'SMOC1', 'EDA', 'TRIP11', 'RIF1', 'PTPRR', 'FMN1', 'LUC7L3', 'KCNK9', 'EPS8L2', 'CDH12', 'TMEM38A', 'TFR2', 'SNCG', 'CNGB1', 'PPP1R13L', 'GABRA1', 'KCNIP4', 'CCDC141', 'HS6ST3', 'IKZF3', 'SLC26A4', 'SLC17A7', 'ANKRD30B', 'GABRG2', 'DNAH10', 'BRSK1', 'GRIP2', 'MYL9', 'DPH6', 'FSTL5', 'PPM1N', 'UBE2QL1', 'VSNL1', 'ADGRL4', 'TULP2', 'SNTG2', 'CHRNB2', 'CDH8', 'PCDH11Y', 'AMPD3', 'PRKCB', 'ADH4', 'DGKB', 'VWA5B2', 'C17orf98', 'BMPER', 'GALNT17', 'PKHD1L1', 'CA10', 'MNS1', 'LYN', 'GPR179', 'RHEX', 'CNTN5', 'HLA-DPA1', 'UPF2', 'ADGRV1', 'TMEM163', 'ADAM28', 'RBFOX3', 'TMEM275', 'GRIN2A', 'SLC4A5', 'RNMT', 'SLC22A8', 'EPHB6', 'KHDRBS2', 'ADAM8', 'PTHLH', 'NSUN6', 'SYT4', 'CACNA2D3', 'RGS6', 'PIK3R5', 'HSPA1A', 'GOLGB1', 'MYPN', 
'STAT4', 'DLGAP1', 'ADGRD2', 'RDH12', 'RHAG', 'TEX51', 'GABRE', 'SGCG', 'CNN1', 'ARL11', 'RASSF6', 'SRRM2', 'A4GALT', 'SLC1A2', 'HBM', 'SYTL3', 'CST3', 'RNF213', 'SDS', 'FLI1', 'SEMA3E', 'NEFL', 'BICDL2', 'LILRB4', 'ANXA3', 'NRGN', 'SAMD11', 'MACC1', 'CALB2', 'MMP11', 'CHRNB4', 'SLC2A13', 'ARHGAP26', 'KIF21A', 'PTPRZ1', 'MAP1A', 'RBFOX1', 'GIPC1', 'SORCS1', 'HSPA6', 'TENT5B', 'SKAP1', 'DSCAM', 'GNG7', 'EPB41L1', 'OPRL1', 'HS3ST2', 'BSPH1', 'CASP8AP2', 'PCDH8', 'OTOF', 'RGL3', 'ZER1', 'IL1RL2', 'PTPN3', 'SSTR2', 'KLHL4', 'HBZ', 'PARD3B', 'NINJ2', 'ISLR2', 'TNFAIP8', 'DYNC1H1', 'CEP290', 'NPM2', 'AXDND1', 'CHRM3', 'SHISA9', 'DAPK2', 'ITGA4', 'PRR16', 'SHROOM3', 'NDST3', 'ANKRD36C', 'JSRP1', 'LDB2', 'KCNC2', 'CARMIL1', 'VWA5B1', 'EID3', 'CLCN1', 'CHI3L1', 'PRKCQ', 'TLR8', 'ARHGEF18', 'MEI1', 'CTXN2', 'AK7', 'SLC4A4', 'RIPOR2', 'HSP90AB1', 'CDK5R2', 'RELN', 'GALNTL5', 'PDE6A', 'SRGN', 'RBM20', 'MMD2', 'NRG3', 'C1orf115', 'AQP1', 'XYLT1', 'CCDC83', 'IL21', 'LUC7L', 'EZH1', 'GCC2', 'SYNPO2', 'DYRK3', 'CNTRL', 'SEC14L5', 'CLU', 'SLC4A1', 'DPPA2', 'ZNF385C', 'CALHM2', 'TOX', 'TINAG', 'GLRA3', 'LTBP1', 'CACNA1B', 'GJA1', 'CFAP299', 'FTSJ3', 'IQCM', 'CRX', 'IL1RAPL1', 'PRKG1', 'SAMSN1', 'DLGAP2', 'PLCXD3', 'CACNB2', 'SCN3B', 'PDGFD', 'LUZP2', 'CR1', 'EIF5B', 'DEPDC1', 'SAMD3', 'OR1I1', 'GABRA4', 'ICAM1', 'HLA-DRA', 'TMEM132B', 'CFAP99', 'OR3A2', 'SLC5A1', 'ITGB2', 'CALY', 'RBPMS2', 'C1QB', 'SNAP25', 'ZC3H13', 'STMN2', 'C1orf216', 'HBA1', 'COL4A1', 'CEP162', 'SCARA5', 'SLC22A9', 'HIPK4', 'TP53I11', 'CAMKV', 'PNMA8B', 'MEGF11', 'P2RX1', 'SLC8A1', 'MLXIPL', 'L1CAM', 'RUNX1', 'C1orf116', 'OPHN1', 'DYNLRB2', 'FCHO1', 'RASD2', 'CNTNAP2', 'C1QL3', 'CCDC110', 'ARHGAP27', 'NXPH1', 'TTF2', 'HLA-DQB1', 'KIF23', 'B3GNT6', 'KCNQ5', 'FBXL19', 'ADH1B', 'CYSLTR2', 'MAML2', 'BRINP3', 'AKAP12', 'PDGFRL', 'SLFN12L', 'GABBR2', 'IGFN1', 'GRIN3A', 'CFAP221', 'MEIOB', 'DHRSX', 'ANKRD18B', 'ROR1', 'CPLX4', 'HCN1', 'SULT1A4', 'YWHAH', 'ZFAND2A', 'CATSPERD', 'NSG1', 'DPP10', 'ATG9B', 'APBB1IP', 
'GGT5', 'EPB42', 'SLK', 'CD83', 'MYL5', 'CCL4', 'CA6', 'ACSBG2', 'IL17REL', 'PDZD2', 'TNNT1', 'NYAP2', 'NPTX1', 'M1AP', 'SLIT3', 'SLC12A3', 'TRDN', 'VIT', 'SYT7', 'MPP4', 'PRR14L', 'NCF2', 'TMIGD2', 'TESMIN', 'GABRG3', 'EMILIN1', 'TANGO6', 'ZNF385B', 'RXFP1', 'FKBP5', 'GPC6', 'DOC2A', 'IKZF1', 'ASF1B', 'DGKG', 'LRRN4CL', 'DAND5', 'DNASE1L3', 'ZBP1', 'TMEM108', 'F2RL3', 'SYNGR3', 'FCMR', 'HRH2', 'NKTR', 'OLFM3', 'QRICH2', 'PCLO', 'EFNB3', 'HNF1B', 'MLIP', 'TPRX1', 'NWD1', 'LINGO2', 'KCNH4', 'SERPINE1', 'KLRF1', 'NKG7', 'SLC6A7', 'BANK1', 'EIF5AL1', 'KNOP1', 'DTHD1', 'ZNF560', 'GREM2', 'RASGRF1', 'KCNH7', 'MINK1', 'FGF12', 'MPHOSPH8', 'FSCN2', 'SV2C', 'FABP6', 'RIMS2', 'BOD1L1', 'TNXB', 'SLC35F3', 'SEMA6D', 'OVCH2', 'ACIN1', 'KIAA1217', 'LRFN5', 'PYHIN1', 'LMX1B', 'MAL2', 'WFDC3', 'NRXN1', 'INCA1', 'DOCK8', 'SOHLH1', 'LEF1', 'ITIH5', 'IL12RB2', 'GOLGA7B', 'SAMD15', 'TSGA10', 'STAP1', 'RP1', 'HTR2A', 'IQSEC3', 'OR9Q1', 'C1QTNF4', 'CHRNA1', 'CBLN2', 'PCDH11X', 'DNM2', 'TRIM60', 'APOM', 'ANKRD18A', 'CHD5', 'KCNQ3', 'GRIK1', 'MGAT4C', 'IWS1', 'CFB', 'SLC5A11', 'DISP2', 'SFMBT2', 'SGCZ', 'CCAR1', 'PTPRT', 'PLA2G4F', 'RHBDL2', 'SCG2', 'STXBP5L', 'IL1R2', 'TNS3', 'ICE1', 'CALN1', 'HBB', 'C1orf146', 'MAML3', 'CACNG3', 'ZC3H12B', 'NES', 'TAC1', 'MAP1B', 'FRMD4A', 'SORCS2', 'NEMF', 'NRCAM', 'CERS4', 'AQP4', 'SVEP1', 'TMEM266', 'GPR158', 'PLCD3', 'FOS', 'TXK', 'NKAP', 'OTOP1', 'CPNE6', 'APOLD1', 'FSTL4', 'GSG1L2', 'NDUFA4L2', 'CHRNB3', 'ITPR2', 'ARHGAP8', 'MDN1', 'CDH1', 'RNF144B', 'PNN', 'OPCML', 'TM4SF18', 'IL1RAPL2', 'SRRM1', 'MUC19', 'DYSF', 'MESP2', 'CRTAM', 'PRF1', 'ARHGAP15', 'GET4', 'BAIAP2L1', 'RYR2', 'NYAP1', 'OGFOD3', 'NMT1', 'THEMIS', 'HTR5A', 'ZNF804B', 'SERPINA3', 'ADAMTSL3', 'SEC14L3', 'KRT222', 'LRRTM4', 'TOB2', 'SAMD5', 'GRIP1', 'CNTNAP5', 'FCN1', 'CCDC168', 'SPEF1', 'NOP14', 'PCDH9', 'ZMAT4', 'GRM1', 'KCNV1', 'PRPF38B', 'TNNT2', 'RBP5', 'CARD11'], dtype=object)

str_match = [s for s in ACC_DLPFC.genes_for_tree_align if "ADAM28" in s]
print(str_match)

['ADAM28']

print(multi_genes["path"][0])
print(len(multi_genes["path"][0]))
#print(f"length of path : {len(multi_genes["path"][0])}")

(0, 0) 2

print(f"shape of data1 : {opc_oligACC.raw.to_adata().X.shape} ")

shape of data1 : (15527, 15785)

print(f"shape of data2 : {opc_oligDLPFC.raw.to_adata().X.shape} ")

shape of data2 : (14596, 15601)

main_markers = [
        ["alignment001", "ADAM28"]
     ]
for alignment, gene in main_markers:
    cp.pl.gene_expression_trend(
        ACC_DLPFC, gene=gene, alignment=alignment, fontsize=16, ticksize=16,
        multi_genes=True, switch_psedotime=True,
        data1_name="ACC", data2_name="DLPFC", polyfit_dimension=3
    )

IndexError                                Traceback (most recent call last)
Input In [32], in <cell line: 1>()
      1 for alignment, gene in main_markers:
----> 2     cp.pl.gene_expression_trend(
      3         ACC_DLPFC, gene=gene, alignment=alignment, fontsize=16, ticksize=16,
      4         multi_genes=True, switch_psedotime=True,
      5         data1_name="ACC", data2_name="DLPFC", polyfit_dimension=3
      6     )

File /sc/arion/projects/CommonMind/tereza/conda/envs/capital/lib/python3.9/site-packages/capital/pl/pl.py:335, in gene_expression_trend(aligned_data, gene, alignment, outliers, polyfit_dimension, switch_psedotime, multi_genes, data1_name, data2_name, data1_color, data2_color, data1_line_color, data2_line_color, ncols, widthspace, heightspace, fontsize, legend_fontsize, ticksize, dpi, show, save)
    331 else:
    332     pseudotime = data1[ordered_cells1, :].obs["{}_dpt_pseudotime".format(
    333         alignment)][[i for i, _ in path]].values
--> 335 data1_expression_level = expression1[[i for i, _ in path]]
    336 data2_expression_level = expression2[[j for _, j in path]]
    338 array = np.array(
    339     [pseudotime, data1_expression_level, data2_expression_level])

File /sc/arion/projects/CommonMind/tereza/conda/envs/capital/lib/python3.9/site-packages/scipy/sparse/_index.py:47, in IndexMixin.__getitem__(self, key)
     46 def __getitem__(self, key):
---> 47     row, col = self._validate_indices(key)
     49     # Dispatch to specialized methods.
     50     if isinstance(row, INT_TYPES):

File /sc/arion/projects/CommonMind/tereza/conda/envs/capital/lib/python3.9/site-packages/scipy/sparse/_index.py:159, in IndexMixin._validate_indices(self, key)
    157         row += M
    158 elif not isinstance(row, slice):
--> 159     row = self._asindices(row, M)
    161 if isintlike(col):
    162     col = int(col)

File /sc/arion/projects/CommonMind/tereza/conda/envs/capital/lib/python3.9/site-packages/scipy/sparse/_index.py:191, in IndexMixin._asindices(self, idx, length)
    189 max_indx = x.max()
    190 if max_indx >= length:
--> 191     raise IndexError('index (%d) out of range' % max_indx)
    193 min_indx = x.min()
    194 if min_indx < 0:

IndexError: index (1377) out of range

I really want to use CAPITAL, and I believe it is a great, easy-to-use tool; it is just such a pity that I have been having trouble plotting anything :( I would be grateful for any suggestions!

Best, Tereza

Rsugihara01 commented 1 year ago

Hi @TerezaClarence

Sorry for not replying sooner.

It seems that the dynamic time warping calculation is not working. ACC_DLPFC.alignmentdict["alignment001"]["multi_genes"]["path"] should contain the result of dynamic time warping, but your output shows only (0, 0), so the path appears to be essentially empty.
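Because multi_genes["path"][0] returns only the first element of the path, it can help to look at the path as a whole. The sketch below is a minimal, hypothetical helper (summarize_path is not part of CAPITAL). It assumes the path is a sequence of (i, j) index pairs, as the `[i for i, _ in path]` expression in the traceback suggests, and reports how many steps the path has and whether its indices fit the number of ordered cells on each side.

```python
def summarize_path(path, n_cells1, n_cells2):
    """Summarize a warping path given as (i, j) index pairs.

    n_cells1 / n_cells2 are the numbers of ordered cells on each side.
    """
    pairs = [(int(i), int(j)) for i, j in path]
    max_i = max((i for i, _ in pairs), default=-1)
    max_j = max((j for _, j in pairs), default=-1)
    return {
        "n_steps": len(pairs),  # number of aligned steps in the whole path
        "max_i": max_i,
        "max_j": max_j,
        "in_range": max_i < n_cells1 and max_j < n_cells2,
    }


# Toy path (an assumption for illustration): three steps over 3 and 2 cells.
toy_path = [(0, 0), (1, 0), (2, 1)]
print(len(toy_path[0]))  # length of the first pair, not of the path
print(summarize_path(toy_path, n_cells1=3, n_cells2=2))
```

With real data, the first argument would be the stored path and the other two the lengths of the ordered cell lists for each dataset.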

Maybe something went wrong in the pseudotime calculation. Can you check that the pseudotime does not contain any inf or NaN values?

import numpy as np

# Single quotes are used inside the f-strings: reusing double quotes there
# is a SyntaxError before Python 3.12.
ad1 = ACC_DLPFC.adata1[ACC_DLPFC.adata1.obs["leiden"].isin(['5', '18', '8', '10'])]
print(f" contains any nan : {ad1.obs['alignment001_dpt_pseudotime'].isnull().any()}")
print(f" count of inf :  {np.isinf(ad1.obs['alignment001_dpt_pseudotime']).values.sum()}")

ad2 = ACC_DLPFC.adata2[ACC_DLPFC.adata2.obs["leiden"].isin(['6', '7', '4'])]
print(f" contains any nan : {ad2.obs['alignment001_dpt_pseudotime'].isnull().any()}")
print(f" count of inf :  {np.isinf(ad2.obs['alignment001_dpt_pseudotime']).values.sum()}")

Also, can I see what is stored in adata.raw? CAPITAL uses raw to store the original datasets and sometimes reads from them. If the number of cells or genes in raw does not match that of adata, you need to make them match.

ACC_DLPFC.adata1.raw.to_adata()
ACC_DLPFC.adata2.raw.to_adata()
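Beyond comparing the printed shapes, the cell and gene names themselves can be compared. This is a small sketch with a hypothetical helper (compare_names is not a CAPITAL or AnnData function). It takes any two collections of names and reports what is missing on either side.

```python
def compare_names(main_names, raw_names):
    """Report how two collections of cell (or gene) names differ."""
    main_set, raw_set = set(main_names), set(raw_names)
    return {
        "n_main": len(main_set),
        "n_raw": len(raw_set),
        "only_in_main": sorted(main_set - raw_set),
        "only_in_raw": sorted(raw_set - main_set),
        "match": main_set == raw_set,
    }


# Toy names (assumed for illustration): raw holds one extra cell.
report = compare_names(["cell_a", "cell_b"], ["cell_a", "cell_b", "cell_c"])
print(report["match"], report["only_in_raw"])
```

With real objects, the inputs would be something like ACC_DLPFC.adata1.obs_names and ACC_DLPFC.adata1.raw.obs_names (and the var_names equivalents for genes). Those attribute names follow AnnData conventions and are an assumption here.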

We are really happy to hear that you are using CAPITAL, and we hope it helps your research! Sorry for taking up your time; we will do our best to get it working.

Best, Rsugihara01

TerezaClarence commented 1 year ago

Dear @Rsugihara01 ,

please see the output:

ad1 = ACC_DLPFC.adata1[ACC_DLPFC.adata1.obs["leiden"].isin([ '5', '18', '8', '10'])]
print(ad1.obs["alignment001_dpt_pseudotime"].isnull().any())
ad1.obs["alignment001_dpt_pseudotime"]

True
ACC_4413_ACGAGTAAGCGGATAA-1         NaN
ACC_4413_CGCTTACTCGGTTTCC-1         NaN
ACC_4413_GATAACGAGGCCGGAA-1         NaN
ACC_4413_GATTCAGGTTAGTTGG-1         NaN
ACC_4413_GTGCCTTTCTAGCGTG-1         NaN
                                 ...   
ACC_5977_TTTCGTCCAGCCAGAA-1         NaN
ACC_5977_TTTGACTTCGCAGGCT-1         NaN
ACC_5977_TTTGCATTCCCATAGG-1         NaN
ACC_5977_TTTGGTAAGTTAGCCG-1    0.598437
ACC_5977_TTTGTCCCAGCAAATA-1    0.615842
Name: alignment001_dpt_pseudotime, Length: 2719, dtype: float32

print(np.isinf(ad1.obs["alignment001_dpt_pseudotime"]).values.sum())
np.isinf(ad1.obs["alignment001_dpt_pseudotime"]).values.sum()

0
0

ad2 = ACC_DLPFC.adata2[ACC_DLPFC.adata2.obs["leiden"].isin([ '6', '7',  '4'])]
print(ad2.obs["alignment001_dpt_pseudotime"].isnull().any())
ad2.obs["alignment001_dpt_pseudotime"].isnull().any()

False
False

print(np.isinf(ad2.obs["alignment001_dpt_pseudotime"]).values.sum())
np.isinf(ad2.obs["alignment001_dpt_pseudotime"]).values.sum()

0
0

and then for raw

ACC_DLPFC.adata1.raw.to_adata()

AnnData object with n_obs × n_vars = 15527 × 15785
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'gex_barcode', 'atac_barcode', 'is_cell', 'excluded_reason', 'gex_raw_reads', 'gex_mapped_reads', 'gex_conf_intergenic_reads', 'gex_conf_exonic_reads', 'gex_conf_intronic_reads', 'gex_conf_exonic_unique_reads', 'gex_conf_exonic_antisense_reads', 'gex_conf_exonic_dup_reads', 'gex_exonic_umis', 'gex_conf_intronic_unique_reads', 'gex_conf_intronic_antisense_reads', 'gex_conf_intronic_dup_reads', 'gex_intronic_umis', 'gex_conf_txomic_unique_reads', 'gex_umis_count', 'gex_genes_count', 'atac_raw_reads', 'atac_unmapped_reads', 'atac_lowmapq', 'atac_dup_reads', 'atac_chimeric_reads', 'atac_mitochondrial_reads', 'atac_fragments', 'atac_TSS_fragments', 'atac_peak_region_fragments', 'atac_peak_region_cutsites', 'percent.mt', 'nCount_ATAC', 'nFeature_ATAC', 'sex', 'age', 'mitoRatio', 'percent.ribo', 'riboRatio', 'percent.hb', 'log10GenesPerUMI', 'nucleosome_signal', 'nucleosome_percentile', 'TSS.enrichment', 'TSS.percentile', 'pct_reads_in_peaks', 'blacklist_fraction', 'brain', 'brain.bank', 'nCount_SCT', 'nFeature_SCT', 'SCT_snn_res.0.2', 'seurat_clusters', 'ATAC_snn_res.0.2', 'SCT.weight', 'ATAC.weight', 'wsnn_res.0.2', 'RNA_snn_res.0.2', 'pANN_0.25_0.09_1648', 'DF.classifications_0.25_0.09_1648', 'pANN_0.25_0.09_1940', 'DF.classifications_0.25_0.09_1940', 'pANN_0.25_0.09_2285', 'DF.classifications_0.25_0.09_2285', 'pANN_0.25_0.09_1241', 'DF.classifications_0.25_0.09_1241', 'SCT_snn_res.0.4', 'ATAC_snn_res.0.4', 'wsnn_res.0.4', 'm1c_labels_subclass', 'age.group', 'seurat_clusters_origBB', 'm1c_labels_subclass_origBB', 'anno_clus', 'anno_clus_origBB', 'anno_clus_origBB2', 'SCT.dream2BB.weight', 'ATAC.dream2BB.weight', 'seurat_clusters_dream.origBB', 'seurat_clusters_dreamorigBB', 'm1c_labels_subclass.dreamBB', 'anno_clus_dreamBB', 'anno_clus_dreamorigBB', 'OPC_progenitor1', 'OPC_precursor1', 'Oligo_precursor1', 'preOligo1', 'immOligo1', 'matOligo_non1', 'matOligo_mye1', 'TypeI_opc1', 'TypeII_opc1', 'TypeI_Olig1', 
'TypeII_Olig1', 'anno_clus_dreamorigBB_bioarx', 'anno_clus_dreamorigBB_v2', 'n_genes', 'leiden', 'alignment000_dpt_pseudotime', 'alignment001_dpt_pseudotime', 'alignment002_dpt_pseudotime', 'alignment003_dpt_pseudotime', 'alignment004_dpt_pseudotime', 'alignment005_dpt_pseudotime', 'alignment006_dpt_pseudotime', 'alignment007_dpt_pseudotime', 'alignment008_dpt_pseudotime', 'alignment009_dpt_pseudotime', 'alignment010_dpt_pseudotime', 'alignment011_dpt_pseudotime', 'alignment012_dpt_pseudotime', 'alignment013_dpt_pseudotime'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'n_cells', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'log1p', 'hvg', 'pca', 'neighbors', 'diffmap_evals', 'umap', 'leiden', 'paga', 'leiden_sizes', 'leiden_colors', 'anno_clus_dreamorigBB_v2_colors', 'cluster_centroid', 'capital', 'root_cell'
    obsm: 'X_pca', 'X_diffmap', 'X_umap'
    obsp: 'distances', 'connectivities'

ACC_DLPFC.adata2.raw.to_adata()

AnnData object with n_obs × n_vars = 14596 × 15601
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'gex_barcode', 'atac_barcode', 'is_cell', 'excluded_reason', 'gex_raw_reads', 'gex_mapped_reads', 'gex_conf_intergenic_reads', 'gex_conf_exonic_reads', 'gex_conf_intronic_reads', 'gex_conf_exonic_unique_reads', 'gex_conf_exonic_antisense_reads', 'gex_conf_exonic_dup_reads', 'gex_exonic_umis', 'gex_conf_intronic_unique_reads', 'gex_conf_intronic_antisense_reads', 'gex_conf_intronic_dup_reads', 'gex_intronic_umis', 'gex_conf_txomic_unique_reads', 'gex_umis_count', 'gex_genes_count', 'atac_raw_reads', 'atac_unmapped_reads', 'atac_lowmapq', 'atac_dup_reads', 'atac_chimeric_reads', 'atac_mitochondrial_reads', 'atac_fragments', 'atac_TSS_fragments', 'atac_peak_region_fragments', 'atac_peak_region_cutsites', 'percent.mt', 'nCount_ATAC', 'nFeature_ATAC', 'sex', 'age', 'mitoRatio', 'percent.ribo', 'riboRatio', 'percent.hb', 'log10GenesPerUMI', 'nucleosome_signal', 'nucleosome_percentile', 'TSS.enrichment', 'TSS.percentile', 'pct_reads_in_peaks', 'blacklist_fraction', 'brain', 'brain.bank', 'nCount_SCT', 'nFeature_SCT', 'SCT_snn_res.0.2', 'seurat_clusters', 'ATAC_snn_res.0.2', 'SCT.weight', 'ATAC.weight', 'wsnn_res.0.2', 'RNA_snn_res.0.2', 'pANN_0.25_0.09_1648', 'DF.classifications_0.25_0.09_1648', 'pANN_0.25_0.09_1940', 'DF.classifications_0.25_0.09_1940', 'pANN_0.25_0.09_2285', 'DF.classifications_0.25_0.09_2285', 'pANN_0.25_0.09_1241', 'DF.classifications_0.25_0.09_1241', 'SCT_snn_res.0.4', 'ATAC_snn_res.0.4', 'wsnn_res.0.4', 'm1c_labels_subclass', 'age.group', 'seurat_clusters_origBB', 'm1c_labels_subclass_origBB', 'anno_clus', 'anno_clus_origBB', 'anno_clus_origBB2', 'SCT.dream2BB.weight', 'ATAC.dream2BB.weight', 'seurat_clusters_dream.origBB', 'seurat_clusters_dreamorigBB', 'm1c_labels_subclass.dreamBB', 'anno_clus_dreamBB', 'anno_clus_dreamorigBB', 'OPC_progenitor1', 'OPC_precursor1', 'Oligo_precursor1', 'preOligo1', 'immOligo1', 'matOligo_non1', 'matOligo_mye1', 'TypeI_opc1', 'TypeII_opc1', 'TypeI_Olig1', 
'TypeII_Olig1', 'anno_clus_dreamorigBB_bioarx', 'anno_clus_dreamorigBB_v2', 'n_genes', 'leiden', 'alignment000_dpt_pseudotime', 'alignment001_dpt_pseudotime', 'alignment002_dpt_pseudotime', 'alignment003_dpt_pseudotime', 'alignment004_dpt_pseudotime', 'alignment005_dpt_pseudotime', 'alignment006_dpt_pseudotime', 'alignment007_dpt_pseudotime', 'alignment008_dpt_pseudotime', 'alignment009_dpt_pseudotime', 'alignment010_dpt_pseudotime', 'alignment011_dpt_pseudotime', 'alignment012_dpt_pseudotime', 'alignment013_dpt_pseudotime'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'n_cells', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'log1p', 'hvg', 'pca', 'neighbors', 'diffmap_evals', 'umap', 'leiden', 'paga', 'leiden_sizes', 'leiden_colors', 'anno_clus_dreamorigBB_v2_colors', 'cluster_centroid', 'capital', 'root_cell'
    obsm: 'X_pca', 'X_diffmap', 'X_umap'
    obsp: 'distances', 'connectivities'

Thank you for any help!

All the best, Tereza

Rsugihara01 commented 1 year ago

Dear @TerezaClarence

Thank you for your reply.

The pseudotime stored in ad1.obs["alignment001_dpt_pseudotime"] contains NaN values. That means something went wrong in the pseudotime calculation, which uses scanpy.tl.dpt().

Please check that the trajectory tree calculated by cp.tl.trajectory_tree looks reliable.
cp.tl.trajectory_tree always produces a trajectory tree, but the result is not always biologically correct, and that might have led scanpy.tl.dpt to return NaN. Running scanpy.tl.dpt on each dataset separately may reveal the cause of the error.
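To see whether the NaN values are concentrated in particular clusters, which might point to the part of the tree that is causing trouble, the fraction of non-finite pseudotime values can be tabulated per cluster. This is a minimal sketch using only NumPy. The helper name nonfinite_fraction_by_cluster and the toy values are assumptions for illustration.

```python
import numpy as np


def nonfinite_fraction_by_cluster(pseudotime, clusters):
    """Return {cluster label: fraction of cells with NaN or inf pseudotime}."""
    pseudotime = np.asarray(pseudotime, dtype=float)
    clusters = np.asarray(clusters).astype(str)
    bad = ~np.isfinite(pseudotime)  # True for NaN, +inf and -inf
    return {
        str(label): float(bad[clusters == label].mean())
        for label in np.unique(clusters)
    }


# Toy data: cluster "5" is entirely NaN, cluster "10" has one inf value.
toy_time = [np.nan, np.nan, 0.2, 0.6, np.inf, 0.9]
toy_clusters = ["5", "5", "8", "8", "10", "10"]
print(nonfinite_fraction_by_cluster(toy_time, toy_clusters))
```

On the data in this thread, the inputs would be along the lines of ad1.obs["alignment001_dpt_pseudotime"].to_numpy() and ad1.obs["leiden"].to_numpy().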

I apologize for the inconvenience: the mechanism in CAPITAL that checks for such errors was not working. We are updating CAPITAL to resolve these errors and hope to have it working soon.

Best, Rsugihara01

Rsugihara01 commented 1 year ago

Dear @TerezaClarence

I am sorry for the late reply.

I just updated CAPITAL and fixed some bugs in the process. I hope this update resolves your problem!

Best, Rsugihara01

TerezaClarence commented 1 year ago

Dear @Rsugihara01 ,

thank you so much for the update. I have updated CAPITAL and re-run the same analysis as posted earlier, but I get the same errors. Could you please tell me whether I should also update other packages, or what I can do to help fix the bugs? Would it be helpful to share a downsampled version of the data?

Thank you so much!

Best, Tereza

Rsugihara01 commented 1 year ago

Hi @TerezaClarence

Thank you for your quick reply!

I am so sorry that it is still producing the same errors. It would be really helpful if we could look at your data. Could you use something like Dropbox or Google Drive to share your downsampled dataset with us?

Thank you!

Best, Rsugihara01

ykat0 commented 1 year ago

Hi Tereza,

I am Reiichi's supervisor.

I think it is unsuitable to share the link in this public thread. Could you contact me by email via my website? https://www.med.osaka-u.ac.jp/pub/rna/ykato/en/profile.html

BW,

Yuki

rkelly712 commented 1 week ago

Hi,

I've run into the same issue and am wondering whether a solution was found. I've checked everything described above, but I can't seem to generate the plot (it works fine on the tutorial dataset).

Thanks in advance.



IndexError                                Traceback (most recent call last)
Cell In[100], line 2
      1 for alignment, gene in main_markers:
----> 2     cp.pl.gene_expression_trend(
      3         cdata, gene=gene, alignment=alignment, fontsize=16, ticksize=16, multi_genes=True, switch_psedotime=True,
      4         data1_name="control", data2_name="knockout", polyfit_dimension=3
      5     )

File /data/set/conda/envs/capital/lib/python3.9/site-packages/capital/pl/pl.py:351, in gene_expression_trend(aligned_data, gene, alignment, col_pseudotime, outliers, polyfit_dimension, switch_psedotime, multi_genes, data1_name, data2_name, data1_color, data2_color, data1_line_color, data2_line_color, ncols, widthspace, heightspace, fontsize, legend_fontsize, ticksize, dpi, show, save)
    348 else:
    349     pseudotime = data1[ordered_cells1, :].obs[col_pseudotime_tmp][[i for i, _ in path]].values
--> 351 data1_expression_level = expression1[[i for i, _ in path]]
    352 data2_expression_level = expression2[[j for _, j in path]]
    354 array = np.array(
    355     [pseudotime, data1_expression_level, data2_expression_level])

File ~/.local/lib/python3.9/site-packages/scipy/sparse/_index.py:47, in IndexMixin.__getitem__(self, key)
     46 def __getitem__(self, key):
---> 47     row, col = self._validate_indices(key)
     49     # Dispatch to specialized methods.
     50     if isinstance(row, INT_TYPES):

File ~/.local/lib/python3.9/site-packages/scipy/sparse/_index.py:159, in IndexMixin._validate_indices(self, key)
    157         row += M
    158 elif not isinstance(row, slice):
--> 159     row = self._asindices(row, M)
    161 if isintlike(col):
    162     col = int(col)

File ~/.local/lib/python3.9/site-packages/scipy/sparse/_index.py:191, in IndexMixin._asindices(self, idx, length)
    189 max_indx = x.max()
    190 if max_indx >= length:
--> 191     raise IndexError('index (%d) out of range' % max_indx)
    193 min_indx = x.min()
    194 if min_indx < 0:

IndexError: index (3336) out of range

Rsugihara01 commented 1 week ago

Hi @rkelly712,

Can you try the code below and check what is stored in ordered_cells1 and expression1?

alignment, gene = main_markers[0]
dtw_dic = cdata.alignmentdict[alignment]  # get results of dtw
ordered_cells1 = dtw_dic["multi_genes"]["ordered_cells1"] # get cells in the order of pseudotime
expression1 = cdata.adata1.raw.to_adata()[ordered_cells1, gene].X.T[0] # extract the gene expression

I thought I had added code to detect NaN or Inf values that might cause this error, but it may not be working. Sorry for the trouble; I'll do my best to resolve it.

Rsugihara

rkelly712 commented 1 week ago

Hi,

Thanks for the response. I seem to have resolved the issue. It was related to my original count matrix, which was a sparse matrix (see below). I'd initially transferred it over from Seurat using sceasy. When I converted the count matrix (adata.X) to an array (using .toarray()) and reran the code, it worked and the plots were generated.

As an aside, I was wondering what the best way is to visualise the pseudotime calculated within the CAPITAL object (e.g., pseudotime shading on a UMAP)?

Thanks again for your help.

My data

print(ordered_cells1)
['control_GGGTCTGCATCTTTCA-1' 'control_GCTACAAGTATGCTAC-1'
 'control_TCAATCTAGAGGTTTA-1' ... 'control_AGAGAGCCATGACTAC-1'
 'control_AGGAGGTGTAGAGTTA-1' 'control_ACTCCCACATAGATGA-1']
print(expression1)
  (0, 2340) 0.4959984618980781
  (0, 798)  0.6116028258921168
  (0, 1158) 1.4511892554442025
  (0, 1443) 0.30373649416165527
  (0, 1226) 0.34609063497425857
  (0, 2809) 0.4871480619933514
  (0, 1147) 0.5640007650984545
  (0, 2444) 0.7069003794929363
  (0, 1931) 0.8323076259519872
  (0, 1491) 0.9875451949771632
  (0, 1818) 0.8621991838468781
  (0, 109)  0.3232153812205226
  (0, 1500) 0.9642279384065428
  (0, 2499) 0.8484932949873287
  (0, 2066) 0.7104871728125176
  (0, 2043) 0.9070024178459825
  (0, 2429) 0.550274895772055
  (0, 2533) 0.5153461351773089
  (0, 1240) 0.5052094090355193
  (0, 2368) 1.4316806287998687
  (0, 1053) 0.5503370071136547
  (0, 722)  0.3456074010109199
  (0, 2023) 0.3740411910445645
  (0, 685)  0.2996916471933727
  (0, 2019) 0.669468018605347
  : :
  (0, 1281) 1.4289227475093744
  (0, 1557) 0.8641804941825776
  (0, 1588) 0.47031133425662525
  (0, 121)  0.9089884440550287
  (0, 1380) 0.7326068212414114
  (0, 1776) 0.5867928407271846
  (0, 1715) 0.4596904987904949
  (0, 1756) 0.5259641072774864
  (0, 505)  0.8139279666643631
  (0, 2112) 0.7753590767187829
  (0, 1966) 0.5477714065030911
  (0, 1533) 0.41071650267454274
  (0, 1668) 0.5583757561343384
  (0, 2002) 0.4733958832079673
  (0, 2048) 0.5841126578725687
  (0, 719)  0.8068890530804181
  (0, 1154) 1.0456094380085386
  (0, 1439) 0.6775218626568644
  (0, 786)  1.4002551341476428
  (0, 2231) 0.648450651936526
  (0, 1572) 0.38759981327303544
  (0, 1807) 1.3803241941333868
  (0, 1505) 0.8725681543321432
  (0, 945)  0.8584219611940296
  (0, 2379) 1.1460096712188628

type(expression1)
scipy.sparse._csr.csr_matrix # THIS SHOULD BE AN ARRAY

Tutorial data

print(ordered_cells1)
print(expression1)
print(type(expression1))
['Run5_205518993255332' 'Run5_195625773119846' 'Run4_227990530578269' ...
 'Run5_227493088247030' 'Run4_126707773105894' 'Run5_134540497440030']
[0.0314418 0.0463666 0.036620434 ... 0.55468273 0.61724263 0.55924 ]
anndata._core.views.ArrayView
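The difference between the two outputs above can be reproduced in isolation. The sketch below uses toy data and a hypothetical helper, to_dense_vector, neither of which comes from CAPITAL. It shows that `.X.T[0]` yields a 1-D vector for a dense matrix but a one-row 2-D matrix for a SciPy sparse matrix, so indexing it with a list of cell positions is interpreted as row indexing and fails with an IndexError.

```python
import numpy as np
from scipy import sparse


def to_dense_vector(column):
    """Flatten an (n_cells, 1) dense or sparse column into a 1-D NumPy array."""
    if sparse.issparse(column):
        column = column.toarray()
    return np.asarray(column).ravel()


# Toy expression column for one gene across four cells.
dense_col = np.array([[0.0], [1.5], [0.0], [2.0]])
sparse_col = sparse.csr_matrix(dense_col)
rows = [i for i, _ in [(0, 0), (1, 1), (3, 2)]]  # cell indices from a toy path

dense_expr = dense_col.T[0]    # 1-D array with one entry per cell
sparse_expr = sparse_col.T[0]  # still 2-D: a single row holding all cells

try:
    sparse_expr[rows]          # `rows` is read as row numbers of a 1-row matrix
    sparse_failed = False
except IndexError:
    sparse_failed = True

print(dense_expr[rows], sparse_expr.shape, sparse_failed)
print(to_dense_vector(sparse_col)[rows])
```

The workaround reported in this thread, converting adata.X with .toarray() before running the analysis, avoids the problem at the cost of holding the full dense matrix in memory, which may matter for large datasets.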

Rsugihara01 commented 1 week ago

Hi @rkelly712

I'm glad the code is working. Sorry for any trouble caused. I will update the code soon to support sparse matrices as well.

The calculated pseudotime is stored in cdata.adata1.obs for each alignment, like "alignment001_dpt_pseudotime". So, you can visualize the pseudotime with the following code:

adata_tmp = cdata.adata1.copy()
sc.pl.umap(adata_tmp, color=["alignment001_dpt_pseudotime"]) # change `alignmentXXX` to the one you want to plot
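If it is unclear which alignments have a pseudotime column, the available names can be listed first. This is a small sketch with a hypothetical helper, pseudotime_columns. The naming pattern alignmentNNN_dpt_pseudotime is taken from the obs columns printed earlier in this thread.

```python
import re


def pseudotime_columns(obs_columns):
    """Return names matching alignmentNNN_dpt_pseudotime, in sorted order."""
    pattern = re.compile(r"^alignment\d+_dpt_pseudotime$")
    return sorted(name for name in obs_columns if pattern.match(name))


# Toy column names (assumed for illustration).
toy_columns = [
    "leiden",
    "alignment001_dpt_pseudotime",
    "n_genes",
    "alignment000_dpt_pseudotime",
]
print(pseudotime_columns(toy_columns))
```

With a real object, passing cdata.adata1.obs.columns should give a list that can be handed to the color argument of sc.pl.umap to draw one panel per alignment.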

Does that work for you?

Sorry for the inconvenience; I hope this helps!

Rsugihara