Closed marcelm closed 10 months ago
For the second attempt, which you say is close to a tie in speed: do you think `hits_per_ref.reserve(100);` and resizing play a role here? This line did not change between the two implementations. That is, with `hits_per_ref.reserve(100);`, the hash table may never require a resize because it has room for 100 unique references (or already over-allocates the vectors a bit), while a flat vector would allocate exactly 100 items and might need to be resized more often.
I haven’t measured it, but my gut feeling is that `reserve()` calls with such small numbers don’t matter. Also, the hash table reserves space only for 100 vectors; the individual vectors still need to be resized, so with the hash table there may even be more reallocation.
I had gotten the impression that using a hash table to store found hits (before merging them into NAMs) could be inefficient, so I tried to get rid of the hash table in two different ways.
This PR serves merely as documentation; I will close it immediately.
The code currently uses `robin_hood::unordered_map<unsigned int, std::vector<Hit>>` to store hits. The two alternatives were:

1. Replace the hash table with a `std::vector<std::vector<Hit>>`. This is more than 50% slower overall than using the hash table, presumably because initializing the vector of vectors is slow.
2. Store all hits in a single flat `std::vector<Hit>`, and then sort this vector by reference index within `merge_hits_into_nams`. This is just a couple of percent slower.