hartwigmedical / hmftools

Various algorithms for analysing genomics data
GNU General Public License v3.0

GRIPSS 2.3.5 in -germline mode overwrites somatic GRIPSS VCF #388

Closed by toddajohnson 1 year ago

toddajohnson commented 1 year ago

Following the README instructions for "germline mode", I ran GRIPSS 2.3.5 after having first created a somatic GRIPSS VCF in the same GRIPSS output directory, using these arguments: `-sample SAMPLE_N -reference SAMPLE_T -germline`

GRIPSS built the output file name from the reference ID (which in this run is really the tumor ID) and so overwrote the somatic VCF.

That seems to be based on a recent change to GRIPSS' `VcfWriter.java` at line 68: `String fileSampleId = config.GermlineMode && !config.ReferenceId.isEmpty() ? config.ReferenceId : config.SampleId;`
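A minimal sketch of why the two runs collide. It assumes the somatic run was invoked with `-sample SAMPLE_T -reference SAMPLE_N`, and that the output name is the file sample ID plus a fixed suffix. The `.gripss.vcf.gz` suffix, the `outputVcf` helper and the class name are illustrative only and are not taken from the GRIPSS source; only the ternary mirrors the line quoted above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class GripssOutputNaming {
    // Mirrors the expression quoted from VcfWriter.java line 68.
    // Parameter names stand in for config.GermlineMode, config.ReferenceId, config.SampleId.
    static String fileSampleId(boolean germlineMode, String referenceId, String sampleId) {
        return germlineMode && !referenceId.isEmpty() ? referenceId : sampleId;
    }

    // Hypothetical file-name pattern, used only to make the collision visible.
    static Path outputVcf(String outputDir, String fileSampleId) {
        return Paths.get(outputDir, fileSampleId + ".gripss.vcf.gz");
    }

    public static void main(String[] args) {
        String dir = "gripss_out";

        // Assumed somatic run: -sample SAMPLE_T -reference SAMPLE_N
        Path somatic = outputVcf(dir, fileSampleId(false, "SAMPLE_N", "SAMPLE_T"));

        // Germline run from this issue: -sample SAMPLE_N -reference SAMPLE_T -germline
        Path germline = outputVcf(dir, fileSampleId(true, "SAMPLE_T", "SAMPLE_N"));

        // Germline run without -reference: -sample SAMPLE_N -germline
        Path germlineNoRef = outputVcf(dir, fileSampleId(true, "", "SAMPLE_N"));

        System.out.println("somatic:          " + somatic);
        System.out.println("germline:         " + germline);
        System.out.println("germline, no ref: " + germlineNoRef);
        System.out.println("collision:        " + somatic.equals(germline));
    }
}
```

Under those assumptions both the somatic run and the germline run with `-reference SAMPLE_T` resolve to a `SAMPLE_T`-based file name in the same directory, while the germline run without `-reference` resolves to a `SAMPLE_N`-based name. Pointing the two runs at different output directories avoids the clash either way.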

The germline mode run's log showed: `genotype info: germline mode ref(0: RK001_R) tumor(1: RK001_T)`

But if I omit `-reference SAMPLE_T` in germline mode, the log shows `tumor genotype info(0: RK001_R)`, which does not mention that it was a germline mode run.

Is that all expected? I suppose I could just create a sub-directory for the germline GRIPSS output, but thought that I'd mention this in case some of that output is not what you were expecting.

charlesshale commented 1 year ago

We have Gripss somatic and Gripss germline write output to separate directories from Gridss to keep their VCFs separate.

Logging "tumor genotype" when only one sample is provided, even with the `-germline` argument set, is misleading, but it doesn't change the behaviour of Gripss. I will change it for the next release.

Supplying the tumor sampleId and BAM in germline mode doesn't affect the filtering, so perhaps keep supplying them in the meantime.