arosenfeld closed 8 years ago
Wow, that's surprising. Thanks v much Aaron.
At one point I used a MAX4 macro but removed it since it (in theory) doubly evaluated MAX3(x,y,z) in MAX2. I guess I didn't trust the compiler to spot the double evaluation. Now I feel foolish. Making the max4() function static inline may also have given the same speed-up. There's a lesson here about profiling before optimising.
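For anyone reading along, here is a rough sketch of the double-evaluation concern. The macro bodies below are assumptions written in the usual nested-ternary style, not the project's actual definitions, and `counted`, `evals_with_macro` and `evals_with_function` are hypothetical helpers that exist only to count argument evaluations. With plain variables as arguments an optimising compiler can usually merge the repeated comparisons, but arguments with side effects really are evaluated more than once.

```c
#include <stdio.h>

/* Hypothetical macro definitions, for illustration only. */
#define MAX2(x, y)       ((x) >= (y) ? (x) : (y))
#define MAX3(x, y, z)    MAX2((x), MAX2((y), (z)))
/* MAX3(...) appears twice in the expansion of MAX2, hence the worry. */
#define MAX4(w, x, y, z) MAX2((w), MAX3((x), (y), (z)))

/* Function alternative: each argument is evaluated exactly once. */
static inline int max4(int a, int b, int c, int d)
{
  int ab = a >= b ? a : b;
  int cd = c >= d ? c : d;
  return ab >= cd ? ab : cd;
}

/* Hypothetical helper: counts how often an argument expression runs. */
static int eval_count = 0;
static int counted(int v) { eval_count++; return v; }

/* Number of argument evaluations when going through the macro. */
static int evals_with_macro(void)
{
  eval_count = 0;
  int m = MAX4(counted(1), counted(2), counted(3), counted(4));
  (void)m;
  return eval_count;
}

/* Number of argument evaluations when going through the function. */
static int evals_with_function(void)
{
  eval_count = 0;
  int m = max4(counted(1), counted(2), counted(3), counted(4));
  (void)m;
  return eval_count;
}
```

Calling `evals_with_function()` returns 4, while `evals_with_macro()` returns more than 4 because the side-effecting arguments are re-run. Which form is actually faster in the alignment loop is a separate question that only profiling can answer.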
Merged.
The max4 function appears to take a large amount of time for a simple operation. With an input file containing 6,596 alignments, running:
time ./bin/needleman_wunsch --file test.fasta --wildcard N 1 --printfasta
Gives:
43.66s user 0.01s system 99% cpu 43.668 total
Gprof shows:
So a bit under half the total time is spent in that function. After this change the same command gives:
26.39s user 0.01s system 99% cpu 26.408 total