Clinical-Genomics / BALSAMIC

Bioinformatic Analysis pipeLine for SomAtic Mutations In Cancer
https://balsamic.readthedocs.io/
MIT License

Testing of Balsamic v8.2 on stage #818

Closed: ashwini06 closed this issue 2 years ago

ashwini06 commented 2 years ago

Background

Balsamic version 8.2.1 is the latest release. Before it is deployed to production, it needs to be tested on the validation cases across the different analysis types.

balsamic --version

balsamic, version 8.2.1
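
A minimal sketch of how the version check could be scripted, assuming the output always has the `balsamic, version X.Y.Z` shape shown above. The helper name `parse_balsamic_version` is made up for illustration and is not part of BALSAMIC.

```python
import re
from typing import Tuple


def parse_balsamic_version(output: str) -> Tuple[int, ...]:
    """Extract the numeric version from `balsamic --version` style output."""
    match = re.search(r"version\s+(\d+(?:\.\d+)*)", output)
    if match is None:
        raise ValueError(f"no version found in: {output!r}")
    return tuple(int(part) for part in match.group(1).split("."))


# The string below is the output quoted in this issue.
installed = parse_balsamic_version("balsamic, version 8.2.1")
print(installed)
```

Comparing tuples of integers avoids the string-comparison trap where "8.10.0" sorts before "8.2.1".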

Steps to reproduce

  1. Allocate stage resources for balsamic-stage via paxa
  2. Change Conda environment: conda activate S_BALSAMIC
  3. Install the latest version of balsamic: pip install --no-cache-dir -U git+https://github.com/Clinical-Genomics/BALSAMIC
  4. Check the version installed: balsamic --version
  5. Free up the resources for paxa
  6. Generate balsamic cache: balsamic init --outdir /home/proj/stage/cancer/balsamic_cache --cosmic-key ${COSMIC_KEY} --genome-version hg19 --account development -r
  7. Run the validation cases through CG command line: cg workflow balsamic start $caseid -r
  8. Check the status of the runs on trailblazer_stage
  9. When the analysis is finished, store the cases in housekeeper: cg workflow balsamic store $caseid
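
Steps 7 and 9 are repeated once per validation case, so they could be generated in a loop. The sketch below only builds and prints the commands (a dry run) and does not execute anything. The case IDs `case_a` and `case_b` are placeholders, and the helper names are hypothetical; the command strings themselves are the ones listed in the steps above.

```python
import shlex
from typing import List


def start_command(case_id: str) -> List[str]:
    """Command from step 7: start the BALSAMIC workflow for one case."""
    return ["cg", "workflow", "balsamic", "start", case_id, "-r"]


def store_command(case_id: str) -> List[str]:
    """Command from step 9: store a finished case in housekeeper."""
    return ["cg", "workflow", "balsamic", "store", case_id]


def render(command: List[str]) -> str:
    """Quote each argument so the line can be pasted into a shell."""
    return " ".join(shlex.quote(part) for part in command)


# Placeholder case IDs, not real validation cases.
validation_cases = ["case_a", "case_b"]

start_lines = [render(start_command(case)) for case in validation_cases]
store_lines = [render(store_command(case)) for case in validation_cases]

for line in start_lines + store_lines:
    print(line)
```

Keeping the commands as argument lists means they could later be passed to `subprocess.run` without going through a shell, but the store commands should only be run after the analyses have finished (step 8).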

Testing on Validation cases

Analysis runs finished successfully

Storing of results in housekeeper

ashwini06 commented 2 years ago

I will close this issue when the cg store part is also finished.

hassanfa commented 2 years ago

I accidentally opened and closed it again...

ashwini06 commented 2 years ago

Need to update the housekeeper tags in hermes. Create a hermes PR and update the tags here.

Update tags for PANEL TUMOR_ONLY CASE

[Screenshot: panel_tonly]

Update tags for PANEL TUMOR_NORMAL CASE

[Screenshot: panel_tn]

Update tags for PANEL UMI TUMOR_NORMAL CASE

[Screenshot 2021-11-12 at 17 01 20]

Update tags for PANEL UMI TUMOR_ONLY CASE

[Screenshot: umi_TN]

Update tags for WGS TUMOR_ONLY CASE

[Screenshot 2021-11-12 at 17 03 43]

Update tags for WGS TUMOR_NORMAL CASE

[Screenshot: wgs_TN]

CNVkit is run only on panel cases, so its tags should be removed from the common tags.

ashwini06 commented 2 years ago

Type of Deliverables in housekeeper:

Panel Cases

QC and reports

*_report.html
.json
_BALSAMIC_8.2.0_graph.pdf
multiqc_report.html
multiqc_data.json
fastp.json
fastp.html

Raw data files and Aligned files

concatenated_*.fp.fastq.gz
tumor.merged.bam
tumor.merged.bam.bai
tumor.merged.cram
tumor.merged.cram.crai

If TN analysis:

normal.merged.bam
normal.merged.bam.bai
normal.merged.cram
normal.merged.cram.crai

Variant-called vcf files

Somatic-callers SNV/INDELs

.vardict.all.filtered.vcf.gz
.vardict.all.filtered.vcf.gz.tbi
.vardict.all.filtered.pass.vcf.gz
.vardict.all.filtered.pass.vcf.gz.tbi
.vardict.all.vcf.gz_summary.html
.vardict.all.stats
.TNscope_umi.all.filtered.vcf.gz
.TNscope_umi.all.filtered.vcf.gz.tbi
.TNscope_umi.all.filtered.pass.vcf.gz
.TNscope_umi.all.filtered.pass.vcf.gz.tbi
.TNscope_umi.all.vcf.gz_summary.html
.TNscope_umi.all.stats
.tnhaplotyper.all.filtered.vcf.gz
.tnhaplotyper.all.filtered.vcf.gz.tbi
.tnhaplotyper.all.filtered.pass.vcf.gz
.tnhaplotyper.all.filtered.pass.vcf.gz.tbi
.tnhaplotyper.all.vcf.gz_summary.html
.tnhaplotyper.all.stats

Somatic-callers SV

.delly.all.filtered.pass.vcf.gz
.delly.all.filtered.pass.vcf.gz.tbi
.delly.all.vcf.gz
.delly.all.vcf.gz.tbi
.delly.all.vcf.gz_summary.html
.delly.all.stats
.manta.all.filtered.pass.vcf.gz
.manta.all.filtered.pass.vcf.gz.tbi
.manta.all.vcf.gz
.manta.all.vcf.gz.tbi
.manta.all.vcf.gz_summary.html
.manta.all.stats

Somatic-callers CNV

.cnvkit.all.vcf.gz
.cnvkit.all.vcf.gz.tbi
.cnvkit.all.filtered.pass.vcf.gz
.cnvkit.all.filtered.pass.vcf.gz.tbi
tumor.merged-scatter.pdf
tumor.merged-diagram.pdf
tumor.merged.cns
tumor.merged.cnr
gene_metrics
.gene_breaks
.cnvkit.all.vcf.gz_summary.html
.cnvkit.all.stats

Germline callers

SNV germline

.haplotypecaller.vcf.gz
.haplotypecaller.vcf.gz.tbi
.haplotypecaller.vcf.gz_summary.html
.germline.tumor.haplotypecaller.all.stats

.dnascope.vcf.gz
.dnascope.vcf.gz.tbi
.dnascope.vcf.gz_summary.html
.tumor.dnascope.all.stats

SV germline

.manta_germline.vcf.gz
.manta_germline.vcf.gz.tbi
.manta_germline.vcf.gz_summary.html
.manta_germline.all.stats

@ivadym: These are required panel outputs for storing in hk. I will update for WGS cases in a while.
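
A list like the one above can be checked mechanically before running cg store. The sketch below is only an illustration: `missing_deliverables` is a hypothetical helper, `PANEL_SUFFIXES` holds a small subset of the suffixes listed above, and a temporary directory with empty files stands in for a real analysis directory.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

# Small subset of the panel deliverables listed above (not the full list).
PANEL_SUFFIXES = [
    "multiqc_report.html",
    "tumor.merged.bam",
    "tumor.merged.bam.bai",
    ".vardict.all.filtered.pass.vcf.gz",
    ".vardict.all.filtered.pass.vcf.gz.tbi",
]


def missing_deliverables(analysis_dir: Path, suffixes: Iterable[str]) -> List[str]:
    """Return the suffixes that no file under analysis_dir ends with."""
    names = [path.name for path in analysis_dir.rglob("*") if path.is_file()]
    return [s for s in suffixes if not any(name.endswith(s) for name in names)]


# Toy directory layout; the file names are made up for the demonstration.
with tempfile.TemporaryDirectory() as tmp:
    analysis_dir = Path(tmp)
    (analysis_dir / "vcf").mkdir()
    for name in ("tumor.merged.bam", "tumor.merged.bam.bai"):
        (analysis_dir / name).touch()
    (analysis_dir / "vcf" / "toy.vardict.all.filtered.pass.vcf.gz").touch()
    missing = missing_deliverables(analysis_dir, PANEL_SUFFIXES)

print(missing)
```

Matching on the end of the file name keeps the check independent of the case ID prefix, and a `.vcf.gz` file is not mistaken for its `.vcf.gz.tbi` index.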

ashwini06 commented 2 years ago

WGS cases:

The "QC and reports" and "Raw data files and Aligned files" deliverables remain the same as for the panel cases.

Variant-called vcf files

Somatic SNV

.tnscope.all.vcf.gz, (HK)
.tnscope.all.vcf.gz.tbi, (HK)
.tnscope.all.filtered.pass.vcf.gz, (HK)
.tnscope.all.filtered.pass.vcf.gz.tbi, (HK)
.tnhaplotyper.all.vcf.gz, (HK)
.tnhaplotyper.all.vcf.gz.tbi, (HK)
.tnhaplotyper.all.filtered.vcf.gz,
.tnhaplotyper.all.filtered.vcf.gz.tbi,
.tnhaplotyper.all.filtered.pass.vcf.gz, (HK)
.tnhaplotyper.all.filtered.pass.vcf.gz.tbi, (HK)

Somatic SV

.manta.all.vcf.gz, (HK)
.manta.all.vcf.gz.tbi, (HK)
.manta.all.filtered.pass.vcf.gz, (HK, Scout)
.manta.all.filtered.pass.vcf.gz.tbi, (HK, Scout)
.delly.all.filtered.pass.vcf.gz, (HK)
.delly.all.filtered.pass.vcf.gz.tbi, (HK)
.delly.bcf, (HK)
.delly.bcf.csi, (HK)

Somatic CNVs

.ascat.all.vcf.gz, (HK)
.ascat.all.vcf.gz.tbi, (HK)
.ascat.all.filtered.pass.vcf.gz, (HK)
.ascat.all.filtered.pass.vcf.gz.tbi, (HK)

Germline SV

.normal.manta_germline.vcf.gz, (HK)
.normal.manta_germline.vcf.gz.tbi, (HK)
.tumor.manta_germline.vcf.gz, (HK, Scout)
.tumor.manta_germline.vcf.gz.tbi, (HK, Scout)

Germline SNV

.normal.dnascope.vcf.gz, (HK)
.normal.dnascope.vcf.gz.tbi, (HK)
.tumor.dnascope.vcf.gz, (HK)
.tumor.dnascope.vcf.gz.tbi, (HK) 

also include:

*.all.stats
*_summary.html

Additional new WGS plots to include

/home/proj/stage/cancer/cases/$wgs_caseid/analysis/vcf/*.ascat.ascatprofile.png (HK)
/home/proj/stage/cancer/cases/$wgs_caseid/analysis/vcf/*.ascat.ASPCF.png (HK)
/home/proj/stage/cancer/cases/$wgs_caseid/analysis/vcf/*.ascat.germline.png (HK)
/home/proj/stage/cancer/cases/$wgs_caseid/analysis/vcf/*.ascat.rawprofile.png (HK)
/home/proj/stage/cancer/cases/$wgs_caseid/analysis/vcf/*.ascat.sunrise.png (HK)
/home/proj/stage/cancer/cases/$wgs_caseid/analysis/vcf/*.ascat.tumor.png (HK)
/home/proj/stage/cancer/cases/$wgs_caseid/analysis/vcf/*.ascat.samplestatistics.txt (HK)
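
The (HK) and (HK, Scout) annotations in the WGS list can be turned into a lookup table, which makes it easy to see which files go to Scout and which have no destination yet. This is a rough sketch: `parse_destinations` is a hypothetical helper, and the sample text is a few lines in the same notation as the list above, including one with no annotation and one with no comma.

```python
import re
from typing import Dict, List

# A few lines in the notation used in this comment.
ANNOTATED = """\
.manta.all.vcf.gz, (HK)
.manta.all.filtered.pass.vcf.gz, (HK, Scout)
.tnhaplotyper.all.filtered.vcf.gz,
.delly.bcf.csi (HK)
"""

# File pattern, optional comma, optional "(dest1, dest2)" annotation.
LINE = re.compile(r"^(?P<name>\S+?),?\s*(?:\((?P<dest>[^)]*)\))?\s*$")


def parse_destinations(text: str) -> Dict[str, List[str]]:
    """Map each file pattern to its list of delivery destinations."""
    result: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LINE.match(line)
        if match is None:
            continue  # skip headings or free text
        dest = match.group("dest")
        result[match.group("name")] = [d.strip() for d in dest.split(",")] if dest else []
    return result


parsed = parse_destinations(ANNOTATED)
scout_files = sorted(name for name, dest in parsed.items() if "Scout" in dest)
untagged = sorted(name for name, dest in parsed.items() if not dest)
print(scout_files, untagged)
```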

@khurrammaqbool: Could you please check the above list of WGS deliverables and remove the files you think are not necessary?

khurrammaqbool commented 2 years ago

@ashwini06 we are mainly delivering files from the vep folder. I suggest we also include the pre-VEP files from the vcf folder. Those are unfiltered, and we may be asked to check them for variants that are missing from the filtered, VEP-annotated outputs.

@khurrammaqbool I am not sure how the unannotated .bcf file would be supported by the Scout delivery. As of now, we only deliver annotated VCFs. If a customer complains about missing variants, maybe you can go back to the .bcf file to check whether the variant was called by the variant caller in the first place, adjust the filters accordingly, and upload the new VCF file (with relaxed filters) to Scout? Tagging @hassanfa in case he can share his experience.
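
The "was it called at all, or filtered out later" check discussed here can be sketched as a set difference between the unfiltered and the filtered call sets. The example below is a sketch under assumptions: it reads gzip-compressed VCF text only (a .bcf would first have to be converted to VCF, for instance with bcftools view), the helper names are made up, and the two toy files contain invented records rather than BALSAMIC output.

```python
import gzip
import tempfile
from pathlib import Path
from typing import Iterable, Set, Tuple

Variant = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)


def variant_keys(vcf_gz: Path) -> Set[Variant]:
    """Collect one key per ALT allele from a gzip-compressed VCF."""
    keys: Set[Variant] = set()
    with gzip.open(vcf_gz, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
            for allele in alt.split(","):
                keys.add((chrom, int(pos), ref, allele))
    return keys


def write_toy_vcf(path: Path, records: Iterable[Variant]) -> None:
    """Write a tiny VCF with invented records, for the demonstration only."""
    with gzip.open(path, "wt") as handle:
        handle.write("##fileformat=VCFv4.2\n")
        handle.write("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n")
        for chrom, pos, ref, alt in records:
            handle.write(f"{chrom}\t{pos}\t.\t{ref}\t{alt}\t.\tPASS\t.\n")


with tempfile.TemporaryDirectory() as tmp:
    unfiltered_path = Path(tmp) / "toy.all.vcf.gz"
    filtered_path = Path(tmp) / "toy.all.filtered.pass.vcf.gz"
    write_toy_vcf(unfiltered_path, [("1", 100, "A", "T"), ("2", 200, "G", "C,T")])
    write_toy_vcf(filtered_path, [("1", 100, "A", "T")])
    # Variants the caller reported that did not survive filtering.
    dropped = variant_keys(unfiltered_path) - variant_keys(filtered_path)

print(sorted(dropped))
```

A variant that shows up in `dropped` was called and then removed by the filters; a variant absent from both sets was never called, so relaxing the filters would not recover it.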