Closed Nithishwer closed 2 months ago
hi @Nithishwer,
is it because you are trying to use mu.pl.embedding
instead of sc.pl.embedding
?
Hi @zktuong,
I'm sorry, I think I uploaded the wrong output. Unrelatedly, mu.pl.embedding and sc.pl.embedding seem to run into this error.
But my issue is converting the ddl object into an scirpy object with gex. The ddl.to_scirpy call itself runs into an error. the plotting errors are just downstream errors due to a faulty object when I pass to mudata=True.
Let me know what you think!
Ok so your error log is saying that the adata
that you are trying to pass doesn't have as many cells as the number of cells in your vdj
object.
so perhaps a quick solution could be to subset your vdj
object to contain the same cells as in your adata
?
Hi Kelvin,
I looked into the size of the adata and vdj objects and this is how it seems:
VDJ object:
Dandelion class object with n_obs = 11259 and n_contigs = 20627
data: 'sequence_id', 'sequence', 'rev_comp', 'productive', 'v_call', 'd_call', 'j_call', 'sequence_alignment', 'germline_alignment', 'junction', 'junction_aa', 'v_cigar', 'd_cigar', 'j_cigar', 'stop_codon', 'vj_in_frame', 'locus', 'c_call', 'junction_length', 'np1_length', 'np2_length', 'v_sequence_start', 'v_sequence_end', 'v_germline_start', 'v_germline_end', 'd_sequence_start', 'd_sequence_end', 'd_germline_start', 'd_germline_end', 'j_sequence_start', 'j_sequence_end', 'j_germline_start', 'j_germline_end', 'v_score', 'v_identity', 'v_support', 'd_score', 'd_identity', 'd_support', 'j_score', 'j_identity', 'j_support', 'fwr1', 'fwr2', 'fwr3', 'fwr4', 'cdr1', 'cdr2', 'cdr3', 'cell_id', 'consensus_count', 'umi_count', 'v_call_10x', 'j_call_10x', 'junction_10x', 'junction_10x_aa', 'j_support_igblastn', 'j_score_igblastn', 'j_call_igblastn', 'j_call_blastn', 'j_identity_blastn', 'j_alignment_length_blastn', 'j_number_of_mismatches_blastn', 'j_number_of_gap_openings_blastn', 'j_sequence_start_blastn', 'j_sequence_end_blastn', 'j_germline_start_blastn', 'j_germline_end_blastn', 'j_support_blastn', 'j_score_blastn', 'j_sequence_alignment_blastn', 'j_germline_alignment_blastn', 'd_support_igblastn', 'd_score_igblastn', 'd_call_igblastn', 'd_call_blastn', 'd_identity_blastn', 'd_alignment_length_blastn', 'd_number_of_mismatches_blastn', 'd_number_of_gap_openings_blastn', 'd_sequence_start_blastn', 'd_sequence_end_blastn', 'd_germline_start_blastn', 'd_germline_end_blastn', 'd_support_blastn', 'd_score_blastn', 'd_sequence_alignment_blastn', 'd_germline_alignment_blastn', 'd_source', 'junction_aa_length', 'fwr1_aa', 'fwr2_aa', 'fwr3_aa', 'fwr4_aa', 'cdr1_aa', 'cdr2_aa', 'cdr3_aa', 'sequence_alignment_aa', 'v_sequence_alignment_aa', 'd_sequence_alignment_aa', 'j_sequence_alignment_aa', 'j_call_multimappers', 'j_call_multiplicity', 'j_call_sequence_start_multimappers', 'j_call_sequence_end_multimappers', 'j_call_support_multimappers', 'rearrangement_status', 'j_source', 'clone_id'
metadata: 'clone_id', 'clone_id_by_size', 'locus_VDJ', 'locus_VJ', 'productive_VDJ', 'productive_VJ', 'v_call_VDJ', 'd_call_VDJ', 'j_call_VDJ', 'v_call_VJ', 'j_call_VJ', 'c_call_VDJ', 'c_call_VJ', 'junction_VDJ', 'junction_VJ', 'junction_aa_VDJ', 'junction_aa_VJ', 'v_call_abT_VDJ', 'd_call_abT_VDJ', 'j_call_abT_VDJ', 'v_call_abT_VJ', 'j_call_abT_VJ', 'c_call_abT_VDJ', 'c_call_abT_VJ', 'productive_abT_VDJ', 'productive_abT_VJ', 'umi_count_abT_VDJ', 'umi_count_abT_VJ', 'v_call_VDJ_main', 'v_call_VJ_main', 'd_call_VDJ_main', 'j_call_VDJ_main', 'j_call_VJ_main', 'c_call_VDJ_main', 'c_call_VJ_main', 'v_call_abT_VDJ_main', 'd_call_abT_VDJ_main', 'j_call_abT_VDJ_main', 'v_call_abT_VJ_main', 'j_call_abT_VJ_main', 'locus_status', 'chain_status', 'rearrangement_status_VDJ', 'rearrangement_status_VJ'
layout: layout for 7702 vertices, layout for 7702 vertices
graph: networkx graph of 7702 vertices, networkx graph of 7702 vertices
adata object:
AnnData object with n_obs × n_vars = 16405 × 15981
obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'Sample.Type', 'cell_type', 'source', 'RNA_snn_res.1', 'seurat_clusters', 'cell_type_overall', 'refined_annotations', 'filter_rna', 'has_contig', 'filter_contig_quality', 'filter_contig_VDJ', 'filter_contig_VJ', 'contig_QC_pass', 'filter_contig', 'locus_VDJ', 'locus_VJ', 'productive_VDJ', 'productive_VJ', 'v_call_VDJ', 'd_call_VDJ', 'j_call_VDJ', 'v_call_VJ', 'j_call_VJ', 'c_call_VDJ', 'c_call_VJ', 'junction_VDJ', 'junction_VJ', 'junction_aa_VDJ', 'junction_aa_VJ', 'v_call_abT_VDJ', 'd_call_abT_VDJ', 'j_call_abT_VDJ', 'v_call_abT_VJ', 'j_call_abT_VJ', 'c_call_abT_VDJ', 'c_call_abT_VJ', 'productive_abT_VDJ', 'productive_abT_VJ', 'umi_count_abT_VDJ', 'umi_count_abT_VJ', 'v_call_VDJ_main', 'v_call_VJ_main', 'd_call_VDJ_main', 'j_call_VDJ_main', 'j_call_VJ_main', 'c_call_VDJ_main', 'c_call_VJ_main', 'v_call_abT_VDJ_main', 'd_call_abT_VDJ_main', 'j_call_abT_VDJ_main', 'v_call_abT_VJ_main', 'j_call_abT_VJ_main', 'locus_status', 'chain_status', 'rearrangement_status_VDJ', 'rearrangement_status_VJ'
var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable'
uns: 'neighbors', 'refined_annotations_colors'
obsm: 'X_harmony', 'X_pca', 'X_umap'
varm: 'HARMONY', 'PCs'
obsp: 'connectivities', 'distances'
It seems like vdj object has lesser cells than the adata object. Also, all the cell barcodes present in the vdj object are present in the data object.
Can you elaborate what the numbers 11259 and 11346 indicate in the error message?
hmm that's weird. are there any duplicated cell barcodes by any chance?
There seems to be some duplicate barcodes in the data object
print(len(adata.obs.index),len(np.unique(adata.obs.index)))
returns:
16405 16302
I removed all the duplicates with:
adata = adata[~adata.obs.index.duplicated(keep='first')]
and it seems to work now.
ok. well, instead of removing them, i would try and make them unique in the future (e.g. https://anndata.readthedocs.io/en/latest/generated/anndata.AnnData.obs_names_make_unique.html). anyway. closing this for now.
I encountered an issue while converting TCR data from a Dandelion object to a Scirpy object for further analysis. Despite successfully performing all analyses with Dandelion, I received an error about object shape mismatches during the conversion.
Steps to Reproduce
mudata=False
and supplyinggex
to the conversion function.Behavior
Received an error indicating that some objects are not in the correct shape. The issue does not occur when setting
mudata=True
or omittinggex
in the conversion function.Minimal example
Any error message produced by the code above
OS information
Linux
Version information
dandelion==0.3.5 pandas==2.2.0 numpy==1.26.4 matplotlib==3.8.4 networkx==3.3 scipy==1.11.4 scanpy==1.10.0 anndata==0.10.6 umap==0.5.5 numpy==1.26.4 scipy==1.11.4 pandas==2.2.0 scikit-learn==1.4.1.post1 statsmodels==0.14.1 igraph==0.11.4 pynndescent==0.5.11
Additional context
No response