etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org
Other

Hard coded additional GC content filter in fix.py module #738

Closed tsivaarumugam closed 1 year ago

tsivaarumugam commented 2 years ago

Hello,

In cnvkit version 0.9.8 and above, there is a hard-coded GC content filter at line 125, in the "mask_bad_bins" function of the fix.py module.

I understand the reason behind the filter and have read the supporting publication as well, but this new condition filters out approximately 10% of our regions of interest.

Would it be possible to tie this filtering condition to the existing --no-gc command-line flag, and to move the hard-coded values into the params.py module alongside the other hard-coded values there, so that the user can toggle the filter on or off without modifying the code?

I am attaching screenshots for reference.

Kindly go through them and let me know.

[Screenshot: "mask_bad_bins" function in cnvkit version 0.9.6]

[Screenshot: "mask_bad_bins" function in cnvkit version 0.9.8]

Thanks

tetedange13 commented 2 years ago

Hi @tsivaarumugam,

I took a look, and I do not think it is possible to connect the hard-coded GC filter you mention to the --no-gc param, simply because the two do not act on the same thing.

But I can probably submit a PR adding a new EXTREME_GC_FRACTION param to params.py, as you suggested. You would then just have to set it to 1 to disable GC-masking of bad bins.
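A minimal sketch of how such a threshold could behave, not the actual cnvkit code. The function name `gc_extreme_mask`, the default value 0.7, and the symmetric lower bound `1 - EXTREME_GC_FRACTION` are all assumptions made for illustration; only the name EXTREME_GC_FRACTION and the "set it to 1 to disable" behavior come from this thread.

```python
# Hypothetical stand-in for a constant that would live in params.py.
# The name comes from this thread; the default value is an assumption.
EXTREME_GC_FRACTION = 0.7


def gc_extreme_mask(gc_values, extreme_gc_fraction=EXTREME_GC_FRACTION):
    """Flag bins whose GC fraction falls in either extreme tail.

    A bin is flagged when its GC fraction is above the threshold or
    below (1 - threshold). With a threshold of 1, the bounds become
    "> 1" and "< 0", which no valid GC fraction satisfies, so nothing
    is flagged and the filter is effectively disabled.
    """
    high = extreme_gc_fraction
    low = 1.0 - extreme_gc_fraction
    return [gc > high or gc < low for gc in gc_values]


if __name__ == "__main__":
    bins = [0.2, 0.5, 0.8]
    print(gc_extreme_mask(bins))        # assumed default threshold
    print(gc_extreme_mask(bins, 1.0))   # filter disabled
```

Keeping the value in one module-level constant means every step that applies the mask reads the same threshold, instead of relying on a flag being passed consistently on each command line.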

Hope this helps ! Have a nice day. Felix.

tsivaarumugam commented 2 years ago

Hi Felix,

Thanks for the reply. Yes, the "EXTREME_GC_FRACTION" parameter option sounds reasonable to me. Could you please make the necessary changes in params.py so that the hard-coded values become a tunable command-line parameter?

Thank you for the help and your valuable time.

Thanks & best regards, Siva T

etal commented 1 year ago

Hi Felix and Siva, thanks for your attention to this. I like the solution of specifying the thresholds in params.py to ensure they're used consistently across the analysis (#753), rather than specifying a command-line option each time (#752). I'll comment in those PRs.