cancerit / cgpCaVEManWrapper

Reference implementation of CGP workflow for CaVEMan SNV analysis
http://cancerit.github.io/cgpCaVEManWrapper/
GNU Affero General Public License v3.0

flagging step - implement scatter gather #33

Closed keiranmraine closed 6 years ago

keiranmraine commented 7 years ago

When run as a single multi-threaded command, the flagging phase is suboptimal because it uses only a single thread.

Having the caveman_flag subroutine internally split the VCF into even chunks and then run them in parallel, up to the specified thread count, would improve this with little impact on surrounding tools. This would not be addressable via the -i option, since when run in a farm compute manner it is not an issue to run the single-threaded process for ~60 minutes. It only matters when you are blocking many CPUs in a Docker container, or in one-shot command usage.
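To make the proposal concrete, here is a minimal sketch of the scatter-gather pattern, written in Python purely for illustration. It is not the wrapper's actual code. The names `split_even`, `flag_chunk` and `scatter_gather_flag` are hypothetical, and `flag_chunk` is only a stand-in for the real flagging step, which would more likely run as a separate process per chunk.

```python
from concurrent.futures import ThreadPoolExecutor


def split_even(records, n_chunks):
    """Split records into contiguous chunks whose sizes differ by at most one."""
    if not records:
        return []
    # Never create more chunks than there are records.
    n_chunks = max(1, min(n_chunks, len(records)))
    base, extra = divmod(len(records), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional record each.
        size = base + (1 if i < extra else 0)
        chunks.append(records[start:start + size])
        start += size
    return chunks


def flag_chunk(chunk):
    """Hypothetical placeholder for flagging one chunk of VCF records."""
    return [line + "\tFLAGGED" for line in chunk]


def scatter_gather_flag(vcf_lines, threads):
    """Scatter records across workers, flag them, and gather results in order."""
    # Header lines (starting with '#') are kept aside and emitted once.
    header = [line for line in vcf_lines if line.startswith("#")]
    records = [line for line in vcf_lines if not line.startswith("#")]
    chunks = split_even(records, threads)
    with ThreadPoolExecutor(max_workers=max(1, threads)) as pool:
        # pool.map returns results in input order, so the merged output
        # keeps the original record order.
        flagged = list(pool.map(flag_chunk, chunks))
    return header + [line for chunk in flagged for line in chunk]
```

Because the split and merge happen inside one function, callers would see the same input and output as the single-threaded version, which is why the impact on surrounding tools should be small.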

@drjsanger, do you agree this is a good idea?