epi2me-labs / wf-human-variation

wrong STR thresholds in STR-report #163

Closed tamara-nano closed 1 month ago

tamara-nano commented 3 months ago

Operating System

Ubuntu 22.04

Other Linux

No response

Workflow Version

v2.0.0-g52e3698

Workflow Execution

Command line

EPI2ME Version

No response

CLI command run

No response

Workflow Execution - CLI Execution Profile

None

What happened?

Hello!

When running the STR workflow on EPI2ME, I noticed that the report shows wrong pathogenicity thresholds for some STR expansions. It looks as if the threshold is set at the base count rather than the repeat count. The data are from HG002.

For example, PABPN1: the normal maximum is 10 repeats and pathogenicity starts at 11 repeats. Here the threshold is set at base 11 rather than repeat 11; both alleles show only 6 repeats (figure: HG002, PABPN1). One allele also seems to have a wrong repeat count: in the mapping you can see 6/6 repeats for both alleles. This happened with a few repeats. The report also forces a repeat count when one allele is missing (figure: pabpn1-rep). Mapping of PABPN1 (figure: pabpn1-map).

RFC1: the normal range is 11-200 repeats and the pathogenic range is 400 to >2000 repeats (source: GeneReviews). The threshold is wrong here as well (figure: rfc1, HG002 with 12/115 repeats for the two alleles; the hg38 reference has 11 repeats).
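To make the suspected unit mismatch concrete, here is a minimal sketch using the PABPN1 numbers. All names are hypothetical and not taken from the workflow's code, and the 3-base motif length is an assumption (PABPN1 is a trinucleotide repeat). It only shows how a threshold given in repeats lands in the wrong place when it is placed on an axis measured in bases without conversion.

```python
# Illustrative only: hypothetical names, not the wf-human-variation plotting code.

def repeats_to_bases(repeat_count: int, motif_len: int) -> int:
    """Convert a repeat count into a position on a base-pair axis."""
    return repeat_count * motif_len

MOTIF_LEN = 3        # assumption: PABPN1 repeat unit is 3 bases long
PATHOGENIC_MIN = 11  # threshold in repeats, as stated above
ALLELE_REPEATS = 6   # repeats seen for both HG002 alleles

# Where a 6-repeat allele ends on an axis measured in bases.
allele_end = repeats_to_bases(ALLELE_REPEATS, MOTIF_LEN)

# Suspected behaviour: the repeat threshold is used directly as a base position.
threshold_as_drawn = PATHOGENIC_MIN

# Expected behaviour: the threshold is converted to bases first.
threshold_expected = repeats_to_bases(PATHOGENIC_MIN, MOTIF_LEN)

looks_pathogenic_as_drawn = allele_end >= threshold_as_drawn
looks_pathogenic_expected = allele_end >= threshold_expected

print(allele_end, threshold_as_drawn, threshold_expected)
print(looks_pathogenic_as_drawn, looks_pathogenic_expected)
```

Under these assumptions, a 6-repeat allele would extend past an unconverted threshold even though it is well inside the normal range once both values use the same unit.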

Please could you have a look into this and check the thresholds?

Otherwise, I find this pipeline very convenient, with a nice report output. Great work!

Best, Tamara

Relevant log output

-

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

No response

vlshesketh commented 3 months ago

Hi @tamara-nano, thank you for your positive feedback on the workflow! I will review the STR reporting script and get back to you shortly.

vlshesketh commented 2 months ago

Hi @tamara-nano, apologies for the delay. You are correct that the plotting script is using the wrong values for the normal maximum and pathogenic minimum in the repeat content plot. I will update the ticket when a fix is released.

vlshesketh commented 1 month ago

Hi @tamara-nano, this has now been fixed in v2.2.0.