Closed LeiHaoa closed 3 years ago
I'm not really sure what those *.subreads.bam files are. The fastq files we used to construct the truth set (i.e., the one on the ftp site) are on SRA (they may also be hosted elsewhere that I don't know of, but they're definitely on SRA): https://sites.google.com/view/seqc2/home/sequencing.
There are many bam files listed under SRA (SRP162370) (https://trace.ncbi.nlm.nih.gov/Traces/sra/?study=SRP162370); the link is right under the site you specified. I just do not know what these subreads.bam files are and whether I can use them for variant calling.
The fastq files can be downloaded successfully.
I'm not sure I understand correctly: I can download FD_T_1 and FD_N_1, which contain the raw fastq files, and the truth vcf file is on the ftp site?
Thanks!
SEQC-II produced a massive amount of data, and I'm not familiar with all of it. The subreads.bam files could be single-cell sequencing data, but I'm not sure.
Yes, FD_T_1 and FD_N_1 are one of the 21+ pairs of WGS data sets we used to construct the truth vcf file (on the ftp site).
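As a rough illustration of how tumor and normal runs could be matched up by name, here is a minimal sketch. It assumes every label follows the `<prefix>_T_<n>` / `<prefix>_N_<n>` pattern seen in FD_T_1 and FD_N_1; that pattern is inferred from those two names only, and the function name `pair_tumor_normal` is hypothetical. Labels that do not fit the pattern (such as the subreads.bam files) are simply skipped.

```python
import re
from collections import defaultdict

# Assumed label pattern: <prefix>_<T|N>_<replicate>, e.g. FD_T_1 / FD_N_1.
# This is inferred from the two labels in the thread, not from SEQC-II docs.
LABEL_RE = re.compile(r"^(?P<prefix>[A-Za-z0-9]+)_(?P<kind>[TN])_(?P<rep>\d+)$")


def pair_tumor_normal(labels):
    """Group run labels into (tumor, normal) pairs by prefix and replicate."""
    groups = defaultdict(dict)
    for label in labels:
        m = LABEL_RE.match(label)
        if m is None:
            continue  # skip labels that don't follow the assumed pattern
        key = (m["prefix"], int(m["rep"]))
        groups[key][m["kind"]] = label

    pairs = []
    for key in sorted(groups):
        group = groups[key]
        # Only report complete pairs; a lone tumor or normal is dropped.
        if "T" in group and "N" in group:
            pairs.append((group["T"], group["N"]))
    return pairs


if __name__ == "__main__":
    labels = ["FD_T_1", "FD_N_1", "m54027_171215_191235.subreads.bam"]
    print(pair_tumor_normal(labels))
```

Each returned tuple is (tumor, normal), which is the order most somatic callers expect their two inputs to be identified in; check the actual run labels on SRA before relying on the pattern.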
Thanks for your patient answer!!
Hi, I am very interested in the SEQC-II datasets. I found that https://trace.ncbi.nlm.nih.gov/Traces/sra/?study=SRP162370 contains many subreads.bam files, like m54027_171215_191235.subreads.bam, but I do not know how I should use these data. For example, if I want to run a caller on a tumor-normal pair from this dataset, I don't know which files correspond to each other. Also, does the vcf file at ftp://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/seqc/Somatic_Mutation_WG/release/latest/ represent the truth set for the above data?
Many thanks!