DiltheyLab / HLA-LA

Fast HLA type inference from whole-genome data
GNU General Public License v3.0

Slow tests #52

Open Stikus opened 3 years ago

Stikus commented 3 years ago

Hello, thanks for the great tool. We're trying to use it in our studies.

So far the tests described here work well, and our test results are very similar to your GitHub results. But the test takes approximately 40 minutes to pass (on NA12878.mini.cram, using a server with 32 threads and 128 GB RAM): about 10 minutes for remapping, about 4 minutes for HLA typing, and 25 minutes for an intermediate step that I don't fully understand.

Is this normal? Other HLA typers we use (for example xHLA) take 2-3 minutes to type a full WGS sample, which is why I'm asking. I understand the remapping time, but not the intermediate step.

Here is our result, Test.R1_bestguess_G.txt, in case it helps with debugging.

For some reason, the log shows "threads: 1" in all 136 blocks - it looks like the code line here controls it.

And one more question: according to this line, there is some functionality to use only Class I and 3 Class II HLA genes. How can we use this option from the HLA-LA.pl run command?

yubau1112 commented 3 years ago

I have the same issue as you: https://github.com/DiltheyLab/HLA-LA/issues/49

jfass commented 3 years ago

Same here. Though I have --maxthreads 24 specified, my log file displays this after the remapping step:

[...]
/opt/samtools/samtools sort -o /tmp/output/Capan1/remapped_with_a.bam /tmp/output/Capan1/remapped_with_a.bam.unsorted
[bam_sort_core] merging from 5 files and 1 in-memory blocks...
/opt/samtools/samtools index /tmp/output/Capan1/remapped_with_a.bam
threads: 1
Done
threads: 1
Done
threads: 1
Done
threads: 1
Done
[...]

... 94 of the "threads: 1\nDone" blocks. And I only see 1 thread being used during that phase. Is there anything to be done about that?
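
For anyone who wants to check how many of these blocks their own log contains, here is a minimal Python sketch. It assumes the log has been saved as plain text and that each block is a "threads: 1" line directly followed by a "Done" line, as in the excerpt above. The function name and the sample text are hypothetical and not part of HLA-LA.

```python
def count_single_thread_blocks(lines):
    """Count 'threads: 1' lines that are directly followed by a 'Done' line."""
    stripped = [line.strip() for line in lines]
    return sum(
        1
        for current, following in zip(stripped, stripped[1:])
        if current == "threads: 1" and following == "Done"
    )


# Sample text copied from the log excerpt above (four blocks shown there).
sample_log = """threads: 1
Done
threads: 1
Done
threads: 1
Done
threads: 1
Done
"""

block_count = count_single_thread_blocks(sample_log.splitlines())
print(block_count)
```

For a real run, the lines could be read from the saved log file instead of the sample string.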

serge2016 commented 3 years ago

Hello. Same here! Any chance of a speed-up in the near future?

yubau1112 commented 3 years ago

I tried the conda version too, but it is still very slow. It is the same with the non-conda version.

AlexanderDilthey commented 11 months ago

Hi,

The algorithm is not fully multithreaded - I would rather recommend starting multiple jobs in parallel (e.g. with 4 threads) instead of using all available CPUs on one process. The reported runtimes are in line with what I would expect.

Best wishes

Alex
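
The suggestion above (several smaller jobs side by side rather than one job with all CPUs) could be sketched roughly as follows in Python. This is only an illustration under some assumptions: HLA-LA.pl is on the PATH, the flags (--BAM, --graph, --sampleID, --maxThreads) and the graph name PRG_MHC_GRCh38_withIMGT match the usage in the HLA-LA README for your installed version, and the sample IDs and BAM paths are hypothetical placeholders. The helper names are made up for this sketch.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

THREADS_PER_JOB = 4  # "e.g. with 4 threads", as suggested above


def concurrent_jobs(total_cpus, threads_per_job=THREADS_PER_JOB):
    """How many jobs fit on the machine if each one gets threads_per_job CPUs."""
    return max(1, total_cpus // threads_per_job)


def build_command(bam_path, sample_id, threads=THREADS_PER_JOB):
    """Build one HLA-LA.pl call (flags are assumptions, check your version)."""
    return [
        "HLA-LA.pl",
        "--BAM", bam_path,
        "--graph", "PRG_MHC_GRCh38_withIMGT",
        "--sampleID", sample_id,
        "--maxThreads", str(threads),
    ]


def run_all(samples, total_cpus=None):
    """Run one HLA-LA.pl process per sample, a limited number at a time.

    samples maps sample ID -> BAM path; returns sample ID -> exit code.
    """
    total_cpus = total_cpus or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=concurrent_jobs(total_cpus)) as pool:
        futures = {
            sample_id: pool.submit(subprocess.run, build_command(bam, sample_id))
            for sample_id, bam in samples.items()
        }
        return {sample_id: f.result().returncode for sample_id, f in futures.items()}


# Hypothetical usage (paths and IDs are placeholders):
# exit_codes = run_all({"sampleA": "/data/sampleA.bam", "sampleB": "/data/sampleB.bam"})
```

Each job also needs its own memory, so on some machines the number of concurrent jobs may be limited by RAM rather than by CPU count. Note that this only helps throughput across many samples; it does not make a single test run faster.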

serge2016 commented 7 months ago

Hello! Any new ideas here? Or maybe somebody has written a script to make a test run faster?