quinlan-lab / STRling

Detect novel (and reference) STR expansions from short-read data
MIT License

Fpr #55

Closed brentp closed 4 years ago

brentp commented 4 years ago

This adds a strling pull_region command, which is useful for debugging a specific region because it also pulls the mates of all reads in that region, even when those mates map elsewhere in the genome.
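To illustrate the idea of pulling a region together with its mates, here is a minimal Python sketch. It is not STRling's implementation (STRling works on BAM/CRAM files), and the `Read` type, the `pull_region` function signature, and the toy coordinates are all assumptions made for this example. It assumes that both mates of a pair share a read name and that the reads fit in memory.

```python
from typing import List, NamedTuple


class Read(NamedTuple):
    """Hypothetical, simplified alignment record (not a STRling or BAM type)."""
    name: str   # query name, shared by both mates of a pair
    chrom: str
    start: int  # 0-based start
    end: int    # exclusive end


def pull_region(reads: List[Read], chrom: str, start: int, end: int) -> List[Read]:
    """Return reads overlapping the region plus every read sharing their names."""
    # Pass 1: collect the names of reads that overlap the requested region.
    names = {
        r.name
        for r in reads
        if r.chrom == chrom and r.start < end and r.end > start
    }
    # Pass 2: keep every record with one of those names, wherever it maps.
    return [r for r in reads if r.name in names]


# Toy data: pairA has one mate in the region and one on another chromosome.
reads = [
    Read("pairA", "12", 133294500, 133294650),
    Read("pairA", "4", 3076600, 3076750),
    Read("pairB", "1", 1000, 1150),
    Read("pairB", "1", 1300, 1450),
]
pulled = pull_region(reads, "12", 133294574, 133294648)
```

The second pass is what distinguishes this from a plain region query: a region fetch alone would return only the first pairA record and drop its mate on the other chromosome.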

It also reduces false positives by adjusting some parameters. There is still a single read left at the HG002 locus 12 133294574 133294648 G (GRCh37) that is incorrectly called as an STR-containing read.

hdashnow commented 4 years ago

Unfortunately this change causes us to miss most of the true positives. Investigating further.

brentp commented 4 years ago

You'll likely have to update the linear model.

hdashnow commented 4 years ago

Yes, the linear model will need to be updated once this is settled. But the linear model doesn't affect detection of the locus, only the size estimate, so it isn't causing the issue I described.

hdashnow commented 4 years ago

Now we're able to recover 12/13 known pathogenic loci that we were detecting previously.

hdashnow commented 4 years ago

Do you think this is ready to merge, @brentp?

brentp commented 4 years ago

LGTM