immunomind / immunarch

🧬 Immunarch: an R Package for Fast and Painless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires
https://immunarch.com
Apache License 2.0

paired data issue #191

Closed christoforos-dimitropoulos closed 9 months ago

christoforos-dimitropoulos commented 2 years ago

Hey everyone. I am trying to follow this tutorial (https://immunarch.com/articles/web_only/v21_singlecell.html) to load my 10x TCR data and track clonotypes across Seurat clusters (as shown at the very bottom of the tutorial). However, I am running into problems. Here is my code (seu_all is my Seurat object, and its idents are set to the clusters):

barcode_cluster <- Seurat::Idents(seu_all)
immdata_10x <- repLoad("~/sc_repertoire/scRepertoire_c57bl6j/bl6_tcr_data", .mode = "paired")

== Step 1/3: loading repertoire files... ==

Processing "~/sc_repertoire/scRepertoire_c57bl6j/bl6_tcr_data" ...
  -- [1/2] Parsing "/home/testuser11/sc_repertoire/scRepertoire_c57bl6j/bl6_tcr_data/all_contig_annotations.csv" -- 10x (filt.contigs)
  [!] Removed 16155 clonotypes with no nucleotide and amino acid CDR3 sequence.                                                                                                                        
  -- [2/2] Parsing "/home/testuser11/sc_repertoire/scRepertoire_c57bl6j/bl6_tcr_data/clonotypes.csv" -- unsupported format, skipping

== Step 2/3: checking metadata files and merging files... ==

Processing "~/sc_repertoire/scRepertoire_c57bl6j/bl6_tcr_data" ...
  -- Metadata file not found; creating a dummy metadata...

== Step 3/3: processing paired chain data... ==

Done!

Warning message:
The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,raw_clonotype_id,raw_consensus_id,exact_subclonotype_id 

scdata_cl <- select_clusters(immdata_10x$data$all_contig_annotations, barcode_cluster, "Cluster")

Error in select_clusters(immdata_10x$data[[1]], barcode_cluster, "Cluster") : 
  Please provide a list with both 'data' and 'meta' elements.

Any input would be appreciated, thanks a lot!

MVolobueva commented 2 years ago

Hi @christoforos-dimitropoulos!

My name is Maria Volobueva, and I am a developer of Immunarch. Thank you for using the package! Could you provide an example of the file you are trying to load, so that we can quickly identify the issue?

christoforos-dimitropoulos commented 2 years ago

As input, I give the folder containing the output of the 10x cellranger vdj pipeline.

MVolobueva commented 2 years ago

@christoforos-dimitropoulos,

Please compress the folder that you were using as input for repLoad (~/sc_repertoire/scRepertoire_c57bl6j/bl6_tcr_data in your case) and attach it to the message, so we can reproduce the issue and help you out. You can also send the data to our tech support at support@immunomind.io.

Best regards, Maria

mbartl13 commented 2 years ago

Hi, I am having a similar issue, though the headers in the attached file look the same to me.

> immdata <- repLoad("~/TCR/")

== Step 1/3: loading repertoire files... ==

Processing "~/TCR/" ...
  -- [1/7] Parsing "C:/Users/Maggie/Documents/TCR/d0_43F_PBMC_20211209_filtered_contig_annotations.csv" -- 10x (filt.contigs)
  -- [2/7] Parsing "C:/Users/Maggie/Documents/TCR/d11_13F_PBMC_20220106_filtered_contig_annotations.csv" -- 10x (filt.contigs)
  -- [3/7] Parsing "C:/Users/Maggie/Documents/TCR/d168_13F_LN_20220128_filtered_contig_annotations.csv" -- 10x (filt.contigs)
  -- [4/7] Parsing "C:/Users/Maggie/Documents/TCR/d168_13F_PBMC_20211209_filtered_contig_annotations.csv" -- 10x (filt.contigs)
  -- [5/7] Parsing "C:/Users/Maggie/Documents/TCR/d168_13F_PBMC_filtered_contig_annotations.csv" -- 10x (filt.contigs)
  -- [6/7] Parsing "C:/Users/Maggie/Documents/TCR/d176_43F_LN_20211209_filtered_contig_annotations.csv" -- 10x (filt.contigs)
  -- [7/7] Parsing "C:/Users/Maggie/Documents/TCR/d43_13F_PBMC_20211209_filtered_contig_annotations.csv" -- 10x (filt.contigs)

== Step 2/3: checking metadata files and merging files... ==

Processing "~/TCR/" ...
  -- Metadata file not found; creating a dummy metadata...

== Step 3/3: processing paired chain data... ==

Done!

Warning messages:
1: The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,raw_clonotype_id,raw_consensus_id,exact_subclonotype_id
2: The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,raw_clonotype_id,raw_consensus_id,exact_subclonotype_id
3: The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,raw_clonotype_id,raw_consensus_id,exact_subclonotype_id
4: The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,raw_clonotype_id,raw_consensus_id,exact_subclonotype_id
5: The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,raw_clonotype_id,raw_consensus_id,exact_subclonotype_id
6: The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,raw_clonotype_id,raw_consensus_id,exact_subclonotype_id
7: The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,raw_clonotype_id,raw_consensus_id,exact_subclonotype_id

d0_43F_PBMC_20211209_filtered_contig_annotations.csv

Alexander230 commented 2 years ago

Hi, @mbartl13!

Thank you very much for providing the data! It looks like the parser does not recognize this column-name format. I will work on adding support for it in a future version of Immunarch.

Best regards, Aleksandr

vadimnazarov commented 9 months ago

Closing this issue for now. It will be implemented in the next version of Immunarch.

More details on the next version of Immunarch are here: https://b-t.cr/t/immunarch-will-significantly-evolve-but-it-will-break-things-and-we-need-your-help/1123