hall-lab / speedseq

A flexible framework for rapid genome analysis and interpretation
MIT License

tumor-normal workflow #129

Open ramcn opened 6 years ago

ramcn commented 6 years ago

Hello,

I am following the workflow described in Figure 6 of the speedseq paper [1]. Could you please review the commands below and confirm that the methodology and dataset I am using are correct?

  1. Download the tumor/normal sample pair from Illumina BaseSpace [2]. HCC1187C is the tumor sample and HCC1187BL is the normal sample.

  2. Align the tumor and normal samples using speedseq:

       speedseq align -o hcc.normal -t 56 -R "@RG\tID:TCGA-B6-A0I6-10A-01D-A128-09\tSM:TCGA-B6-A0I6-10A-01D-A128-09\tLB:lib1" human_g1k_v37.fasta HCC1187BL_S1_R1_001.fastq HCC1187BL_S1_R2_001.fastq
       speedseq align -o hcc.tumor -t 56 -R "@RG\tID:TCGA-B6-A0I6-10A-01D-A128-09\tSM:TCGA-B6-A0I6-10A-01D-A128-09\tLB:lib1" human_g1k_v37.fasta HCC1187C_S1_R1_001.fastq HCC1187C_S1_R2_001.fastq

  3. Create 2 vcf files, one of somatic variants and one of SV variants:

       speedseq somatic -o hcc -t 56 -w ceph18.b37.include.2014-01-15.bed -F 0.05 -q 1 human_g1k_v37.fasta hcc.normal.bam hcc.tumor.bam
       speedseq sv -o hcc -t 56 -x ceph18.b37.lumpy.exclude.2014-01-15.bed -B hcc.normal.bam,hcc.tumor.bam -D hcc.normal.discordants.bam,hcc.tumor.discordants.bam -S hcc.normal.splitters.bam,hcc.tumor.splitters.bam -R human_g1k_v37.fasta

  4. Annotate the 2 vcf files using snpEff:

       java -Xmx4G -jar snpEff/snpEff.jar -c snpEff/snpEff.config -t GRCh37.75 hcc.vcf > hcc_snpeff.vcf
       java -Xmx20G -jar snpEff/snpEff.jar -c snpEff/snpEff.config -t GRCh37.75 hcc.sv.vcf > hcc_sv_snpeff.vcf

  5. Merge the 2 vcf files:

       vcf-merge hcc_snpeff.vcf.gz hcc_sv_snpeff.vcf.gz | bgzip -c > tumor.vcf.gz

  6. Load and analyze through GEMINI:

       gemini load -v tumor.vcf -t snpEff --cores 32 tumor.db
       gemini set_somatic tumor.db
       gemini actionable_mutations tumor.db
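One detail in step 2 may be worth checking before the later steps: both align commands pass the same ID and SM values in the -R read-group string. The sketch below is only a sanity check on those strings, written on the assumption that the downstream somatic caller tells the two samples apart by the SM tag in the BAM header. The rg_sample function is a hypothetical helper for illustration, not part of speedseq.

```shell
#!/bin/sh
# Hypothetical helper (not part of speedseq): print the SM value of an @RG
# string whose fields are separated by a literal backslash-t, in the form
# passed to `speedseq align -R`.
rg_sample() {
    # Capture everything after "\tSM:" up to the next backslash or end of line.
    printf '%s\n' "$1" | sed -n 's/.*\\tSM:\([^\\]*\).*/\1/p'
}

# Read-group strings copied verbatim from the two align commands in step 2.
normal_rg='@RG\tID:TCGA-B6-A0I6-10A-01D-A128-09\tSM:TCGA-B6-A0I6-10A-01D-A128-09\tLB:lib1'
tumor_rg='@RG\tID:TCGA-B6-A0I6-10A-01D-A128-09\tSM:TCGA-B6-A0I6-10A-01D-A128-09\tLB:lib1'

# Compare the sample names extracted from each string.
if [ "$(rg_sample "$normal_rg")" = "$(rg_sample "$tumor_rg")" ]; then
    echo "normal and tumor share the same SM tag"
else
    echo "normal and tumor have distinct SM tags"
fi
```

If the two strings do report the same SM value, giving each sample its own ID and SM before aligning would be one way to rule this out as a source of problems in steps 3 and 6.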

References:
[1] https://www.biorxiv.org/content/biorxiv/early/2014/12/05/012179.full.pdf
[2] https://basespace.illumina.com/projects/38600562/samples