Closed marcelm closed 10 months ago
For the second attempt, which you say is close to a tie in speed: do you think `hits_per_ref.reserve(100);` and resizing play a role here? This line did not change between the two implementations. That is, with `hits_per_ref.reserve(100);`, the hash table may never require a resize because it has room for 100 unique references (or already over-allocates the vectors a bit), while a flat vector would allocate exactly 100 items and might need to be resized more often.
I haven’t measured it, but my gut feeling is that `reserve()` calls with such small numbers don’t matter. Also, the hash table reserves space only for 100 vectors; the individual vectors still need to be resized, so with the hash table there may even be more reallocation.
I had gotten the impression that using a hash table to store found hits (before merging them into NAMs) could be inefficient, so I tried to get rid of the hash table in two different ways.
This PR serves merely as documentation; I will close it immediately.
The code currently uses `robin_hood::unordered_map<unsigned int, std::vector<Hit>>` to store hits. The two alternatives were:

1. Replace the hash table with a `std::vector<std::vector<Hit>>`. This is more than 50% slower overall than using the hash table, presumably because initializing the vector of vectors is slow.
2. Store all hits in a single flat `std::vector<Hit>`, and then sort this vector by reference index within `merge_hits_into_nams`. This is just a couple of percent slower.