cancerit / cgpPindel

Cancer Genome Project Insertion/Deletion detection pipeline based around Pindel
http://cancerit.github.io/cgpPindel/
GNU Affero General Public License v3.0

speed up flagging with IntervalTree #78

Open keiranmraine opened 5 years ago

keiranmraine commented 5 years ago

Most of the flagging code does simple "hit" lookups in tabix files. This can be handled in exactly the same way as the input generation speed-up, using an interval tree.

This has an additional advantage, as the current code wraps each query in an eval, which is expensive:

https://github.com/cancerit/cgpPindel/blob/da79133d7849117eca4ca3a22896d6b1256643b3/perl/lib/Sanger/CGP/PindelPostProcessing/FragmentFilterRules.pm#L339-L349

It should be possible to hide this inside the reuse_unmatched_normals_tabix and reuse_repeats_tabix functions. The change needs to be applied in both FilterRules.pm and FragmentFilterRules.pm.
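
To illustrate the kind of in-memory "hit" lookup described above, here is a rough sketch in Python. It is not the cgpPindel code (which is Perl) and it is not a true interval tree: it uses a sorted array with a running maximum of end positions as a stand-in that answers the same yes/no overlap question. The names HitIndex and has_hit, the (chrom, start, end) record shape, and the closed-interval coordinates are all assumptions made for the example.

```python
from bisect import bisect_right
from collections import defaultdict


class HitIndex:
    """In-memory overlap index (hypothetical; a stand-in for an interval tree)."""

    def __init__(self, records):
        # records: iterable of (chrom, start, end); closed intervals are assumed.
        by_chrom = defaultdict(list)
        for chrom, start, end in records:
            by_chrom[chrom].append((start, end))

        self._starts = {}
        self._max_ends = {}
        for chrom, intervals in by_chrom.items():
            intervals.sort()
            starts, max_ends, running = [], [], None
            for start, end in intervals:
                # max_ends[i] is the largest end among intervals[0..i].
                running = end if running is None else max(running, end)
                starts.append(start)
                max_ends.append(running)
            self._starts[chrom] = starts
            self._max_ends[chrom] = max_ends

    def has_hit(self, chrom, start, end):
        """Return True if any stored interval overlaps [start, end]."""
        starts = self._starts.get(chrom)
        if not starts:
            return False
        # Count the intervals that begin at or before the query end ...
        i = bisect_right(starts, end)
        # ... and check whether any of them reaches the query start.
        return i > 0 and self._max_ends[chrom][i - 1] >= start


index = HitIndex([("1", 100, 200), ("1", 500, 650), ("2", 10, 20)])
print(index.has_hit("1", 150, 160))  # True
print(index.has_hit("1", 201, 499))  # False
print(index.has_hit("3", 1, 1000))   # False
```

The index is built once and each lookup is then a binary search in memory, so there is no per-query call that needs wrapping in an eval. A real interval tree would additionally return the overlapping records, which a plain hit test does not need.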