Open YuzhiSun opened 9 months ago
Hi Yuzhi,
The data for this paper is deposited on GEO, at this link: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE178707
If you hit (custom) under the download link, then you can select a subset of the files you want to download. Assuming you're interested in the T cell data matrices, I think the files you would want are:
GSM5396333_lane1*
- files for a 10x text-based feature matrix for lane 1 cells. This contains both RNA and ATAC information, but I'd recommend just keeping the RNA part since the ATAC peak calls might not line up with lane 2
GSM5396337_lane2*
- equivalent for lane 2 cells
GSM*_ADT_counts_lane*.csv.gz
- TF ADT read counts for each lane
GSM*_HTO_counts_lane*.csv.gz
- Hashing ADT read counts for each lane (we used for some normalization)
GSM5396336_CD4_Peak_matrix.rds.gz
- The peak matrix from our analysis combining lanes 1 + 2, where each barcode has lane1# or lane2# prepended to it

There are several other files available from GEO, and if you want to replicate something specifically from the paper a lot of our scripts are present in this repo. Our code_utils/download_data.py file will download all the raw data you need, but it may be more than you want for your specific use-case.
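If it helps, here is a rough Python sketch of how the file patterns listed above could be applied once you have a list of GEO supplementary file names. It is not taken from this repo: the helper names (`select_files`, `rna_row_indices`, `split_lane_barcode`) and every example file name, feature line, and barcode are made up for illustration. It also assumes the 10x features file uses the usual three tab-separated columns with "Gene Expression" marking RNA rows, which is worth checking against the actual download.

```python
from fnmatch import fnmatchcase

# Glob patterns copied from the file list in this reply.
T_CELL_PATTERNS = [
    "GSM5396333_lane1*",
    "GSM5396337_lane2*",
    "GSM*_ADT_counts_lane*.csv.gz",
    "GSM*_HTO_counts_lane*.csv.gz",
    "GSM5396336_CD4_Peak_matrix.rds.gz",
]


def select_files(names, patterns=T_CELL_PATTERNS):
    """Return the names matching at least one pattern, in input order."""
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


def rna_row_indices(feature_lines):
    """Return row indices whose third tab-separated column is 'Gene Expression'.

    Assumption: standard 10x features.tsv layout (id, name, feature type).
    """
    keep = []
    for i, line in enumerate(feature_lines):
        columns = line.rstrip("\n").split("\t")
        if len(columns) >= 3 and columns[2] == "Gene Expression":
            keep.append(i)
    return keep


def split_lane_barcode(barcode):
    """Split a peak-matrix barcode such as 'lane1#ACGT-1' into (lane, barcode)."""
    lane, sep, cell = barcode.partition("#")
    if not sep:
        raise ValueError(f"no lane prefix in {barcode!r}")
    return lane, cell


# Hypothetical inputs, for illustration only (not real GEO listings).
example_names = [
    "GSM5396333_lane1_matrix.mtx.gz",
    "GSM0000000_ADT_counts_lane1.csv.gz",
    "GSM0000000_unrelated_sample.txt.gz",
]
example_features = [
    "GENE_A\tGeneA\tGene Expression",
    "chr1:100-200\tchr1:100-200\tPeaks",
    "GENE_B\tGeneB\tGene Expression",
]

selected = select_files(example_names)
rna_rows = rna_row_indices(example_features)
lane, cell = split_lane_barcode("lane2#ACGTACGT-1")
```

The row indices from `rna_row_indices` could then be used to subset the rows of the matrix loaded from the matching .mtx file, and `split_lane_barcode` gives a way to match peak-matrix barcodes back to the per-lane barcodes.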
Hope that helps!
-Ben
Hi @bnprks, thank you very much for your contribution to multi-omics sequencing data. Could you tell me how to quickly obtain the three omics data matrices you measured? I would like to use them for downstream analysis.
Best regards, Yuzhi