massonix / HCATonsilData

Provide programmatic access to the tonsil cell atlas datasets

How to access processed TCR scVDJ-seq data for T cells that also have GEX/CITE-seq Seurat objects? #10

Open denvercal1234GitHub opened 1 year ago

denvercal1234GitHub commented 1 year ago

Hi there,

Congrats on the great work, and thanks for developing this package.

I am attempting to pull TCR-seq data for T cells that also have the GEX/CITE-seq Seurat objects.

QUESTION 1. I saw the output of cellranger vdj deposited at https://zenodo.org/record/6678331#.Y_iuOZPP00R within the CITE-seq folder. Would you mind clarifying what the subfolders (shown below, e.g., BCLLATLAS_XX, ifZOgenn_TpMNTvBa, mLuLpVxi_v0fLyotc, etc.) refer to? They are probably sequencing runs and gem IDs.

QUESTION 2. I noticed you used scirpy for the immune receptor repertoire analysis, and I could pull TCR data using the "barcode" field in the vdj_t folders where it matched the cell barcodes in the meta.data of the T cell Seurat objects. Still, would you mind letting me know whether you already have the processed sc-VDJ data for T cells as an R/Python object, i.e., for the T cells that have matching Seurat GEX/CITE-seq objects?

QUESTION 3. It looks like only the CITE-seq cells have TCR data, but only a Seurat object with imputed ADT was deposited. The cell barcodes are not the same between the Seurat GEX object and the TCR vdj output, so I could not match the cells between these 2 datasets. Do you have a CITE-seq Seurat object for CD8 T cells?
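For readers hitting the same barcode mismatch, here is a minimal Python sketch of one way to reconcile the two barcode sets. It rests on an assumption that is not confirmed anywhere in this thread: that the Seurat object stores barcodes as `<gem_id>_<barcode>`, while each cellranger vdj folder stores the bare barcode. The helper names and all example values are hypothetical.

```python
def prefixed_barcode(gem_id, barcode):
    """Build the barcode form ASSUMED to be used in the Seurat object."""
    return f"{gem_id}_{barcode}"


def match_barcodes(seurat_barcodes, vdj_barcodes_by_gem):
    """Return {seurat_barcode: (gem_id, raw_vdj_barcode)} for cells in both sets."""
    seurat = set(seurat_barcodes)
    matched = {}
    for gem_id, barcodes in vdj_barcodes_by_gem.items():
        for bc in barcodes:
            key = prefixed_barcode(gem_id, bc)
            if key in seurat:
                matched[key] = (gem_id, bc)
    return matched


# Hypothetical inputs: barcodes from the Seurat meta.data and from the
# vdj_t output of each GEM well (gem ids and barcodes are made up).
seurat_barcodes = ["gem_a_AAACCTGAGAAACCAT-1", "gem_b_AAACCTGAGAAACCAT-1"]
vdj_barcodes_by_gem = {
    "gem_a": ["AAACCTGAGAAACCAT-1", "TTTGTCATCTTTACGT-1"],
    "gem_c": ["AAACCTGAGAAACCAT-1"],
}

matched = match_barcodes(seurat_barcodes, vdj_barcodes_by_gem)
print(matched)
```

If the actual prefix or separator differs, only `prefixed_barcode` needs to change. Note that the same raw barcode can occur in several GEM wells, which is why the GEM well has to be part of the key.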

Thank you again very much for your help!

PS. I emailed you @massonix per issue #8

Screenshot 2023-02-24 at 14 35 47

Screenshot 2023-02-24 at 14 41 03

denvercal1234GitHub commented 1 year ago

Ramon has kindly provided spatial transcriptomic data.

Awaiting TCR data.

massonix commented 1 year ago

Hi again Quang!

Answering your questions:

QUESTION 1: As you correctly pointed out, BCLLATLAS_XX corresponds to a run of the 10X Chromium instrument. This is what you will find labeled as "subproject". Within each subproject, you have the output from multiple 10X channels. Using 10X terminology, each channel corresponds to a "GEM well" (GEM stands for Gel Bead-in-Emulsion). Each GEM well was assigned a specific "gem_id" following the best practices of this paper. Finally, each GEM well can lead to different Illumina libraries (library_ids). For instance, for each GEM well in the CITE-seq experiments, we have 4 Illumina libraries: GEX, ADT, TCR and BCR. To understand the relationship between subprojects, gem_ids, and library_ids, please have a look at the metadata file, which is in TonsilAtlasCellRangerOuts/CITE-seq/metadata/cellranger_multi_metadata.csv. To see which proteins we included in our CITE-seq panel, you can check the file TonsilAtlasCellRangerOuts/CITE-seq/metadata/cite_seq_feature_reference.csv.
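The subproject / gem_id / library_id hierarchy described above can be sketched in Python as follows. This is only an illustration: the column names (`subproject`, `gem_id`, `library_id`, `type`) and every value in the inline example are assumptions, so check them against the header of the real cellranger_multi_metadata.csv before use.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt standing in for cellranger_multi_metadata.csv.
# Column names and all values are assumed for illustration only.
EXAMPLE_CSV = """subproject,gem_id,library_id,type
BCLLATLAS_XX,gem_a,lib_1,GEX
BCLLATLAS_XX,gem_a,lib_2,ADT
BCLLATLAS_XX,gem_a,lib_3,TCR
BCLLATLAS_XX,gem_a,lib_4,BCR
BCLLATLAS_XX,gem_b,lib_5,GEX
"""


def libraries_by_gem_well(handle):
    """Map (subproject, gem_id) -> {library type: library_id}."""
    mapping = defaultdict(dict)
    for row in csv.DictReader(handle):
        key = (row["subproject"], row["gem_id"])
        mapping[key][row["type"]] = row["library_id"]
    return dict(mapping)


mapping = libraries_by_gem_well(io.StringIO(EXAMPLE_CSV))
for (subproject, gem_id), libs in mapping.items():
    print(subproject, gem_id, libs)
```

To run it on the real file, pass an open file handle instead of the `io.StringIO` wrapper and adjust the column names if they differ.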

QUESTION 2: Please find the output of our scirpy analysis here. In any case, I will prepare the Seurat objects with the scTCR and CITE-seq data for you, which already include this information.

QUESTION 3: As you correctly pointed out, the only datasets that have TCR data are the CITE-seq ones. I will prepare the objects for you and get back to you ASAP!

Thanks for your interest in our work and datasets, I hope they are useful for your analysis!

Best,

Ramon

denvercal1234GitHub commented 1 year ago

Thank you again @massonix regarding Questions 2 and 3! Looking forward to receiving it.

Q1a. Is there anything special about the BCLL-15-T, BCLL-8-T, and BCLL-9-T donors compared to the other donors (i.e., donors 0-3) in the CITE-seq or other experiments done for them? I ask because in the Seurat CITE-seq object, they have significantly more cells than the other donors!

Does Figure 1A of the manuscript show that 7 donors had CITE-seq and VDJ? This would agree with what I saw when I examined the Seurat object meta.data$donor_id.

Screenshot 2023-03-12 at 21 53 13 Screenshot 2023-03-12 at 21 59 45

Q1b. Yet, Table_S2 lists only BCLL-8-T, BCLL-9-T, and BCLL-15-T for CITE-seq. Which donors do "donor_0", 1, 2, and 3 in the Seurat CITE-seq object correspond to (i.e., which BCLL-XX-T: 2, 10, 11, 6, 12, 13, or 14)?

Screenshot 2023-03-12 at 21 56 31

Thank you again!!