Using 10x genomics data

Hi! I'm brand new to immunarch and am preparing to use it to analyze some cellranger vdj data from 10x Genomics. The "Loading 10x Genomics Data" vignette confused me a little: it first says "You should use the filtered contigs csv files because they contain barcode information.", but the step that loads the data into immunarch appears to load all of the csv files from the cellranger output, not just the filtered_contig file.

Would you recommend using all of the csv files or only one at a time?

When I followed the vignette and loaded all of the output csv files from some sample data I got this warning:

Warning messages:
1: The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,raw_clonotype_id,raw_consensus_id,exact_subclonotype_id
2: The following named parsers don't match the column names: clonotype_id,consensus_id,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,v_start,v_end,v_end_ref,j_start,j_start_ref,j_end,fwr1_start,fwr1_end,cdr1_start,cdr1_end,fwr2_start,fwr2_end,cdr2_start,cdr2_end,fwr3_start,fwr3_end,cdr3_start,cdr3_end,fwr4_start,fwr4_end
3: The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,raw_clonotype_id,raw_consensus_id,exact_subclonotype_id

Is this because I loaded all of the csv files even though their structures are different?
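
To show what I mean by "different structures", here is a minimal Python sketch (not immunarch code) that groups CSV files by their header row. The headers are abbreviated stand-ins I typed in for illustration, not my real files, and `group_by_header` is just a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict


def group_by_header(files):
    """Group file names by the tuple of column names in their first row.

    `files` maps a file name to an open text stream (hypothetical helper).
    """
    groups = defaultdict(list)
    for name, stream in files.items():
        header = tuple(next(csv.reader(stream)))  # read only the header row
        groups[header].append(name)
    return dict(groups)


# Abbreviated, made-up headers standing in for the cellranger vdj outputs.
files = {
    "filtered_contig_annotations.csv": io.StringIO("barcode,is_cell,contig_id,chain,cdr3\n"),
    "all_contig_annotations.csv": io.StringIO("barcode,is_cell,contig_id,chain,cdr3\n"),
    "consensus_annotations.csv": io.StringIO("clonotype_id,consensus_id,chain,cdr3\n"),
}

groups = group_by_header(files)
for header, names in groups.items():
    print(len(header), "columns:", names)
```

In this toy case the two contig files share one layout and the consensus file has another, which looks like the same split as between warnings 1 and 3 versus warning 2 above.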

Also, I was following your single-cell paired data vignette, and I'm a little confused about how to create the cluster-specific datasets. Your sample data (scdata) already contains information about the cell clusters, but how would you use this function if your immunarch input data is the output of cellranger multi? Is there a way to merge the filtered contig annotations csv with a Seurat object containing the clusters made from cellranger count?
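
To make the merge question concrete, this is roughly the join I have in mind, sketched in plain Python rather than with immunarch or Seurat functions. It assumes the cluster assignments have been exported to a two-column table (barcode, cluster) and that barcodes are formatted identically in both tables. The barcodes, sequences, and the `attach_clusters` helper are all made up for illustration.

```python
import csv
import io


def attach_clusters(contig_rows, barcode_to_cluster):
    """Return copies of the contig rows with a 'cluster' field added.

    Rows whose barcode has no cluster assignment are dropped.
    """
    merged = []
    for row in contig_rows:
        cluster = barcode_to_cluster.get(row["barcode"])
        if cluster is not None:
            merged.append({**row, "cluster": cluster})
    return merged


# Made-up stand-in for a filtered contig annotations table.
contigs_csv = io.StringIO(
    "barcode,chain,cdr3\n"
    "AAAC-1,TRA,CAVSDF\n"
    "AAAC-1,TRB,CASSLF\n"
    "AAAG-1,TRB,CASSQF\n"
    "AAAT-1,TRA,CAVRDF\n"
)
# Made-up stand-in for cluster labels exported from the clustering step.
clusters_csv = io.StringIO("barcode,cluster\nAAAC-1,0\nAAAG-1,2\n")

barcode_to_cluster = {r["barcode"]: r["cluster"] for r in csv.DictReader(clusters_csv)}
merged = attach_clusters(list(csv.DictReader(contigs_csv)), barcode_to_cluster)

for row in merged:
    print(row["barcode"], row["chain"], row["cluster"])
```

Is a barcode-keyed join like this the intended way to get cluster labels into the data, or does the package provide its own route?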

Also, what further analyses would you recommend for data from 10x genomics? There is a link for exploring the dataset on the loading tutorial, but it says "Page not found", and the other tutorials seem to be for data that is structured differently than the loaded 10x data.

Sorry for the seemingly simple questions, I am just brand new to this package and want to make sure I understand it properly.

immunomind / immunarch

Using 10x genomics data #256