zhangzlab / covid_balf

R script for the covid balf data analysis

confusion regarding 10x technology used for each sample #14

Closed Xiaojieqiu closed 3 years ago

Xiaojieqiu commented 3 years ago

Hey @zhangzlab, thanks for this nice work.

I had a hard time figuring out which 10x technology was used for the data you generated. In your paper, you stated:

Single-cell capturing and downstream library constructions were performed using Chromium Single Cell 3′ Reagent kits v.2 (10x Genomics; PN-120237, PN-120236 and PN-120262) for HC1 and HC2, kit v.3 (10x Genomics; PN-1000075, PN-1000073 and PN-120262) for HC3 and Chromium Single Cell V(D)J Reagent kits (10x Genomics; PN-1000006, PN-1000014, PN-1000020, PN-1000005) for all patients with COVID-19, according to the manufacturer’s protocol.

Based on this, all the healthy controls were measured with either 10x v2 or v3 scRNA-seq technology, while the patient samples were measured with V(D)J technology. However, judging from the GEO records (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE145926), it seems that 10x scRNA-seq was also applied to the patient samples. Could you please provide a file that maps each GSM ID to the 10x scRNA-seq technology (i.e. 10xv2 or 10xv3) you used?

This would be a great help to me, as I would like to apply my recently developed computational framework dynamo (https://github.com/aristoteleo/dynamo-release) to perform some vector-field-based analyses.

Thanks in advance!

Xiaojieqiu commented 3 years ago

| Run | sample | GEO_Accession (exp) | sample_new | experiment | 10x_scRNA_seq_tech |
|---|---|---|---|---|---|
| SRR11537946 | C51 | GSM4475048 | HC1 | scRNA-seq | 10xV3 |
| SRR11537947 | C52 | GSM4475049 | HC2 | scRNA-seq | 10xV3 |
| SRR11537948 | C100 | GSM4475050 | HC3 | scRNA-seq | 10xV3 |
| SRR11181955 | C142 | GSM4339770 | M2 | scRNA-seq | 10xV2 |
| SRR11181957 | C144 | GSM4339772 | M3 | scRNA-seq | 10xV2 |
| SRR11245351 | C141 | GSM4385990 | M1 | TCR-seq | None |
| SRR11245352 | C142 | GSM4385991 | M2 | TCR-seq | None |
| SRR11537949 | C148 | GSM4475051 | S4 | scRNA-seq | 10xV2 |
| SRR11537950 | C149 | GSM4475052 | S5 | scRNA-seq | 10xV2 |
| SRR11537951 | C152 | GSM4475053 | S6 | scRNA-seq | 10xV2 |
| SRR11537952 | C148 | GSM4475054 | S4 | TCR-seq | None |
| SRR11537953 | C149 | GSM4475055 | S5 | TCR-seq | None |
| SRR11537954 | C152 | GSM4475056 | S6 | TCR-seq | None |
| SRR11181954 | C141 | GSM4339769 | M1 | scRNA-seq | 10xV3 |
| SRR11181956 | C143 | GSM4339771 | S2 | scRNA-seq | 10xV2 |
| SRR11181958 | C145 | GSM4339773 | S1 | scRNA-seq | 10xV2 |
| SRR11181959 | C146 | GSM4339774 | S3 | scRNA-seq | 10xV2 |
| SRR11245353 | C143 | GSM4385992 | S2 | TCR-seq | None |
| SRR11245354 | C144 | GSM4385993 | M3 | TCR-seq | None |
| SRR11245355 | C145 | GSM4385994 | S1 | TCR-seq | None |
| SRR11245356 | C146 | GSM4385995 | S3 | TCR-seq | None |

I ended up manually running a script to determine the 10x technology used for each scRNA-seq run. The table above is the final result, which may be useful for others too.
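
A check of this kind can be sketched from Read 1 lengths. This is only an illustrative sketch, not the script used to build the table above. It assumes the standard 10x layout in which Read 1 holds a 16 bp cell barcode followed by the UMI (10 bp for v2, 12 bp for v3), and that Read 1 was not sequenced longer than required. The function name `guess_chemistry` and the `n_reads` parameter are hypothetical. As far as I know, 5' gene expression libraries of that generation also use a 16 bp barcode plus a 10 bp UMI, so read length alone cannot separate 3' v2 from 5' chemistry.

```python
import gzip
from collections import Counter
from itertools import islice

# Assumed Read 1 lengths: 16 bp cell barcode + UMI (10 bp for v2, 12 bp for v3).
# Caveat: a 26 bp Read 1 is also compatible with 5' chemistry, so "10xV2" here
# only means "16 bp barcode + 10 bp UMI".
R1_LENGTH_TO_CHEMISTRY = {26: "10xV2", 28: "10xV3"}


def guess_chemistry(r1_fastq_path, n_reads=10000):
    """Guess the 10x chemistry from the most common Read 1 length."""
    opener = gzip.open if str(r1_fastq_path).endswith(".gz") else open
    lengths = Counter()
    with opener(r1_fastq_path, "rt") as handle:
        # A FASTQ record has 4 lines; the sequence is the 2nd line of each record.
        for i, line in enumerate(islice(handle, 4 * n_reads)):
            if i % 4 == 1:
                lengths[len(line.strip())] += 1
    if not lengths:
        return "unknown"
    most_common_length = lengths.most_common(1)[0][0]
    return R1_LENGTH_TO_CHEMISTRY.get(most_common_length, "unknown")
```

Matching barcodes against the v2 and v3 whitelists shipped with Cell Ranger would be a more robust check than read length, especially when Read 1 was sequenced to full length.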

zhangzlab commented 3 years ago

Thanks for pointing out the inconsistency between the paper and GEO. The metadata in GEO was updated at two different times, so some of the information is incorrect. I'll correct the GEO record. Please refer to the following table.

| Donor | Method |
|---|---|
| HC1 | 10X 3' V2 |
| HC2 | 10X 3' V2 |
| HC3 | 10X 3' V3 |
| M1 | 10X v(d)j & 5' gene expression |
| M2 | 10X v(d)j & 5' gene expression |
| M3 | 10X v(d)j & 5' gene expression |
| S1 | 10X v(d)j & 5' gene expression |
| S2 | 10X v(d)j & 5' gene expression |
| S3 | 10X v(d)j & 5' gene expression |
| S4 | 10X v(d)j & 5' gene expression |
| S5 | 10X v(d)j & 5' gene expression |
| S6 | 10X v(d)j & 5' gene expression |

Xiaojieqiu commented 3 years ago

Hi @zhangzlab, thanks for the clarification! I would like to follow up with one more question: for samples M1-S6, did you also capture the transcriptome via 5' capture in addition to the V(D)J assay?

Thanks

zhangzlab commented 3 years ago

> Hi @zhangzlab, thanks for the clarification! I would like to follow up with one more question: for samples M1-S6, did you also capture the transcriptome via 5' capture in addition to the V(D)J assay?
>
> Thanks

Yes, that's right.
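
For readers who want the GSM-to-technology file requested at the top of this thread, the donor table above can be joined with the scRNA-seq GEO accessions listed earlier. The following is a minimal sketch that assumes those accessions are correct as posted; the names `DONOR_METHOD`, `GSM_TO_DONOR` and `build_mapping_csv` are hypothetical, and the method strings are copied from the donor table rather than taken from GEO.

```python
import csv
import io

PATIENT_METHOD = "10X v(d)j & 5' gene expression"

# Donor -> method, copied from the donor table in this thread.
DONOR_METHOD = {"HC1": "10X 3' V2", "HC2": "10X 3' V2", "HC3": "10X 3' V3"}
DONOR_METHOD.update(
    {donor: PATIENT_METHOD
     for donor in ["M1", "M2", "M3", "S1", "S2", "S3", "S4", "S5", "S6"]}
)

# scRNA-seq GSM accession -> donor, copied from the earlier table in this thread.
GSM_TO_DONOR = {
    "GSM4475048": "HC1", "GSM4475049": "HC2", "GSM4475050": "HC3",
    "GSM4339769": "M1", "GSM4339770": "M2", "GSM4339772": "M3",
    "GSM4339773": "S1", "GSM4339771": "S2", "GSM4339774": "S3",
    "GSM4475051": "S4", "GSM4475052": "S5", "GSM4475053": "S6",
}


def build_mapping_csv():
    """Return CSV text with one row per scRNA-seq GSM accession."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["GEO_Accession", "donor", "method"])
    for gsm, donor in sorted(GSM_TO_DONOR.items()):
        writer.writerow([gsm, donor, DONOR_METHOD[donor]])
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_mapping_csv(), end="")
```

The TCR-seq accessions are left out on purpose, since the question was about the gene expression libraries only.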

Xiaojieqiu commented 3 years ago

This is great. Thanks!