Use std::sort instead of QSortInt

fuji8 commented 2 years ago

I profiled the T1050 with the parameters run by AlphaFold. (Only -cpu is changed). I used Score-P as the profiling tool and got the following results.

From this image, we can see that QSortInt in mergeHitsToQuery is taking a long time. With fast enough storage, my hhblits for this condition is about 2100sec, and QSortInt accounts for about 40%.

Instead of this QsortInt, use std::sort.

I ran hhblits installed by conda and using std::sort under the same conditions as before. In order to avoid I/O effects, I analyze the difference in execution time between the logs that contain this change, instead of the overall execution time. (From https://github.com/soedinglab/hh-suite/blob/ac765987bd0daceb093a41a8e2850887dad5835f/src/hhblits.cpp#L1028-L1030 to https://github.com/soedinglab/hh-suite/blob/b100bb05fe6b920a7cbe58256eef72954d9088bf/src/hhhmm.cpp#L2337)

	conda	use `std::sort`
iteration 1	1232(sec)	477(sec)
iteration 2	631(sec)	374(sec)
iteration 3	306(sec)	232(sec)

This reduced the execution time. I also ran it using the parallelization policy, but the results were not significantly different from std::sort.

This change is due to the different stability of sort, so the execution results may not truly match.

milot-mirdita commented 2 years ago

Cool, thank you!

We have implemented a similar fix in MMseqs2's version of the same code, but haven’t backported it: https://github.com/soedinglab/MMseqs2/blob/d89fcecf9911a99c45ed81c1c0e5054743debc64/src/alignment/MsaFilter.cpp#L212

Could you repeat the benchmark with a stable sort?

fuji8 commented 2 years ago

Thank you for the reply.

I changed the sort to stable_sort and ran it 3 times on hpc in the following environment.

cpu: 28 cores
RAM: 235GB

	1	2	3
iteration 1	488(sec)	484(sec)	484(sec)
iteration 2	374(sec)	371(sec)	376(sec)
iteration 3	223(sec)	225(sec)	218(sec)

Because of the large memory, the computational complexity is probably Nlog(N).

martin-steinegger commented 2 years ago

@fuji8 This looks great! Thank you for the PR. Would it be possible to avoid the lambda expression in the sort?

fuji8 commented 2 years ago

I apologize for the delay in response.

I rewrote the code to be almost equivalent without using the lambda expression. I ran it only once, just to be sure.	no lambda
iteration 1	500(sec)
iteration 2	385(sec)
iteration 3	232(sec)

soedinglab / hh-suite

Use std::sort instead of QSortInt #307