Clinical-Genomics-Lund / nextflow_wgs


GATK GQC always cast to int to prevent genmod from ignoring it #171

Closed alkc closed 6 months ago

alkc commented 6 months ago

Description

fixes #170

Small change to prescore_sv.pl that always casts GATK GQC values to ints.

Genmod expects an int and won't process GQC otherwise, which affects about 700 SVs per run where the caller penalty is not correctly set.

This update enables the caller penalty for those variants (about 700 per run on average).
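As a rough illustration of the cast described above (not the actual Perl change in prescore_sv.pl), the idea can be sketched in Python. The sketch assumes GQC is carried as a `GQC=<value>` entry in the VCF INFO column and that the cast truncates toward zero; both are assumptions, and the function name `cast_gqc_to_int` is hypothetical.

```python
def cast_gqc_to_int(info: str, key: str = "GQC") -> str:
    """Return an INFO string with the value of `key` truncated to an integer.

    Assumption: GQC lives in INFO as `GQC=<number>`. Other fields,
    flag fields without '=', and non-numeric values are left untouched.
    """
    fields = []
    for field in info.split(";"):
        name, sep, value = field.partition("=")
        if name == key and sep:
            try:
                # int(float(...)) truncates toward zero, e.g. "17.9" -> 17
                value = str(int(float(value)))
            except (ValueError, OverflowError):
                pass  # keep values such as "." or "inf" as they are
        fields.append(f"{name}{sep}{value}")
    return ";".join(fields)


if __name__ == "__main__":
    print(cast_gqc_to_int("SVTYPE=DEL;GQC=2345.0;END=100"))
```

Whether truncation or rounding is the right choice depends on how the rank model's thresholds are defined; the sketch only shows that the value ends up as an integer string that an int-typed score plugin can parse.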

Risk assessment:

Fixing this bug affects, on average, around 700 out of 2500 GATK-only SVs per run, judging by the final SV results.

An analysis of confirmed CNVs only called by GATK used in our 2021 validation of the SV rank model confirms that this update is unlikely to affect the rank scores of confirmed variants.

These variants have GQC values > 2000, which means that the caller penalty would have been 0 even in the absence of the bug this update fixes.

Type of change

Checklist:

Patch

Instructions for the reviewers

How to test the changes

Expected outcome

[Optional] Additional information

Review

Review performed by:

Testing performed by:

Post-merge

alkc commented 6 months ago

Thanks. I'll run the tests for a wgs single/trio and merge if everything looks ok. Should be able to pull it off before the end of the week.

alkc commented 6 months ago

I checked the intermediate files for a live run and the corresponding sample run through this branch.

No obvious errors; the GQC values are converted to ints as they should be. Running genmod score on the updated output no longer produces "GATK Found value None" errors for records with GQC defined.
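A check along these lines could be automated with a small script. The following is a minimal sketch, again assuming GQC is an INFO entry of the form `GQC=<value>`; the helper name `non_integer_gqc` is hypothetical and not part of the pipeline.

```python
import re

_INT_RE = re.compile(r"-?\d+")


def non_integer_gqc(vcf_lines, key="GQC"):
    """Return (chrom, pos, value) for records whose GQC is set but not an integer.

    Assumption: GQC is stored in the INFO column (8th VCF column).
    A value of "." is treated as "not defined" and skipped.
    """
    offenders = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF record
        for field in cols[7].split(";"):
            name, sep, value = field.partition("=")
            if name == key and sep and value != ".":
                if not _INT_RE.fullmatch(value):
                    offenders.append((cols[0], cols[1], value))
    return offenders


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.2",
        "1\t1000\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;GQC=2345",
        "1\t5000\t.\tN\t<DUP>\t.\tPASS\tSVTYPE=DUP;GQC=17.5",
    ]
    print(non_integer_gqc(example))
```

An empty result on the branch output, compared with a non-empty result on the live run, would match the observation above.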

alkc commented 6 months ago

Can confirm that the caller penalty for GATK works as it should now.