PacificBiosciences / paraphase

HiFi-based caller for highly similar paralogous genes
BSD 3-Clause Clear License

Question about multi-threading #25

Closed gabeng closed 7 hours ago

gabeng commented 3 days ago

Hi,

I am wondering how to use the multi-threading option. With the command below, paraphase only seems to run on a single core. What does the -t parameter actually control?

paraphase \
        -t 31 \
        -p sample \
        --genome 38 \
        -o . \
        -r GRCh38.fa \
        -b sample.bam

xiao-chen-xc commented 3 days ago

Hi @gabeng, in your case you would request 31 cores: all the Paraphase-targeted regions will be divided into 31 groups, each of which is analyzed by one core. This should be much faster than using a single core.
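
The grouping described above can be sketched roughly as follows. This is not Paraphase's actual code, only a minimal illustration of dividing a list of regions into N groups and handing each group to one worker process. The region names, the round-robin split, and the `analyze_group` placeholder are all assumptions made for the example.

```python
from multiprocessing import Pool


def split_into_groups(regions, n_groups):
    """Deal regions round-robin into at most n_groups non-empty groups."""
    groups = [regions[i::n_groups] for i in range(n_groups)]
    return [group for group in groups if group]


def analyze_group(group):
    """Hypothetical stand-in for the per-region analysis of one group."""
    return [(region, "done") for region in group]


if __name__ == "__main__":
    # Hypothetical region names; a real run would use the caller's target list.
    regions = ["region_a", "region_b", "region_c", "region_d", "region_e"]
    groups = split_into_groups(regions, n_groups=3)

    # One worker process per group, so each group runs on one core.
    with Pool(processes=len(groups)) as pool:
        results = pool.map(analyze_group, groups)

    for group_result in results:
        print(group_result)
```

With a scheme like this, requesting more workers than there are regions leaves the extra workers idle, and each region is still processed by a single core.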

gabeng commented 2 days ago

I am working with targeted panels, and not all regions are uniformly covered in all samples. If one thread is started per region, it seems that a specific region takes hours to complete on a single core.
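
To illustrate why a single slow region can dominate the run time when regions are assigned to workers up front: the total wall time is set by the slowest group, not by the average. The sketch below uses made-up per-region times (in minutes) purely for illustration; they are not measurements from Paraphase.

```python
def wall_time(group_times):
    """Wall time when groups run in parallel: the slowest group decides."""
    return max(sum(times) for times in group_times)


def serial_time(group_times):
    """Time if every region ran one after another on a single core."""
    return sum(sum(times) for times in group_times)


# Hypothetical timings: one deeply covered region takes 180 minutes,
# the other three take 2 minutes each, one region per worker.
groups = [[180], [2], [2], [2]]

parallel = wall_time(groups)
serial = serial_time(groups)
print(f"parallel: {parallel} min, serial: {serial} min, "
      f"speedup: {serial / parallel:.2f}x")
```

In this made-up case, four workers finish barely faster than one, because three of them sit idle while the slow region completes.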

xiao-chen-xc commented 1 day ago

Yes, I think more work is needed to optimize Paraphase for targeted data. We are working on some improvements, and hopefully they will be included in the next release.