AstraZeneca-NGS / VarDictJava

VarDict Java port
MIT License

Post processing Germline projects #35

Closed ssaif closed 8 years ago

ssaif commented 8 years ago

Hi Vlad,

I would like to run all but the 'Variants' step in post-processing for this project, /ngs/oncology/analysis/dev/Dev_0192_HiSeq4000_NormPlasma_WGS/bcbio/final, which has normal-only samples. I commented out the 'Variants' step in config/run_info.yaml, and it ran Seq2C and TargQC but not the full project report. Is there a way to do that? The script does not find any final/sample/*vardict.vcf files (there won't be any), and I think that is where it gets interrupted.

The bcbio-generated variants are in the datestamp/var/raw/*batch-vardict.vcf files, and I am running vcf2txt standalone with minimal filters to generate variants.txt, which will be delivered in txt format.

Thanks, Sakina

vladsavelyev commented 8 years ago

Hi Sakina, why did you create this issue in VarDictJava repository, not Reporting_Suite?

It should run Seq2C and TargQC even if no VCFs were found. I'll take a look once we get the systems up again. By the way, datestamp/var/raw/*batch-vardict.vcf is where post-processing looks for VCFs. But not for normals - it doesn't expect to see batch VCFs for normal samples.

Post-processing with default parameters is not suited for germline calls (you have to set up the filtering parameters carefully to make sure it keeps germline variants), but you also can't run vcf2txt standalone because the VCFs coming from bcbio do not carry the required annotations. Please let me know how you are running vcf2txt and whether anyone gave you recommendations on it.
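For anyone checking which of the two locations mentioned in this thread actually holds VCFs for a project, here is a minimal sketch. It assumes the layout as described in the comments above (final/sample/*vardict.vcf for per-sample files and datestamp/var/raw/*batch-vardict.vcf for batch files, with the datestamp directory sitting directly under final); the function name `find_vardict_vcfs` is hypothetical and is not part of Reporting_Suite or bcbio.

```python
from pathlib import Path


def find_vardict_vcfs(final_dir):
    """List VarDict VCFs under a bcbio 'final' directory.

    Assumed layout (taken from this thread, not from the pipeline code):
      final/<sample>/*vardict.vcf
      final/<datestamp>/var/raw/*batch-vardict.vcf

    Returns a tuple (per_sample, batch) of sorted lists of Path objects.
    """
    final = Path(final_dir)
    # Batch VCFs: three levels below final, under <datestamp>/var/raw.
    batch = sorted(final.glob("*/var/raw/*batch-vardict.vcf"))
    # Per-sample VCFs: directly inside each first-level subdirectory.
    # '*' does not cross directory separators, so batch files are not matched here.
    per_sample = sorted(final.glob("*/*vardict.vcf"))
    return per_sample, batch
```

Calling `find_vardict_vcfs(".../bcbio/final")` on a normal-only project would, under these assumptions, be expected to return an empty per-sample list, which matches the missing-file condition described in the original question.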

vladsavelyev commented 8 years ago

Just checked - the TargQC and Seq2C were generated.

Post-processing was interrupted at the end because it found only normals and could not generate NGS reports; I'll make it work with only normals.

ssaif commented 8 years ago

Linked (moved) this issue here, https://github.com/AstraZeneca-NGS/Reporting_Suite/issues/52