Closed Itolstoganov closed 1 month ago
Hi Ivan,
Thanks - and great that you found this bug!
I approve the PR, but I will let Marcel make the final call.
@marcel, note that auto count = get_count(find(get_hash(it))); is a bit of a redundant call, since it involves two searches. A faster way to do it would be to skip over all seeds with the same hash and increment the counters like this:
tot_seed_count += count;
tot_seed_count_sq += count * count;
However, this part of the code is only used for printing index statistics, so optimising it is not crucial.
Hence I approve.
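For reference, a minimal sketch of the single-pass idea, assuming the seeds are stored in a vector sorted by hash so that equal hashes are adjacent. The Seed and SeedStats structs and the seed_stats function are hypothetical names for illustration, not the actual index code. Note that the square is written as count * count, because ^ is bitwise XOR in C++.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an index entry; only the hash matters here.
struct Seed {
    uint64_t hash;
    uint32_t position;
};

// Hypothetical container for the accumulated statistics.
struct SeedStats {
    uint64_t distinct = 0;           // number of distinct hashes
    uint64_t tot_seed_count = 0;     // sum of per-hash counts
    uint64_t tot_seed_count_sq = 0;  // sum of squared per-hash counts
};

// Assumption: `seeds` is sorted by hash, so each hash forms one contiguous run.
SeedStats seed_stats(const std::vector<Seed>& seeds) {
    assert(std::is_sorted(seeds.begin(), seeds.end(),
                          [](const Seed& a, const Seed& b) { return a.hash < b.hash; }));
    SeedStats stats;
    std::size_t i = 0;
    while (i < seeds.size()) {
        // Advance j to the end of the run of seeds sharing seeds[i].hash.
        std::size_t j = i + 1;
        while (j < seeds.size() && seeds[j].hash == seeds[i].hash) {
            ++j;
        }
        const uint64_t count = j - i;
        stats.distinct += 1;
        stats.tot_seed_count += count;
        stats.tot_seed_count_sq += count * count;  // not count ^ 2, which is XOR
        i = j;  // skip the whole run, so no second search is needed
    }
    return stats;
}
```

Each run is visited once, so the whole pass is linear in the number of seeds and needs no lookup per entry.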
I’ll merge this so that it can be part of the next release. To be honest, I’ve never used print_diagnostics, so its inefficiency doesn’t affect me much, and I guess it’s rarely used in practice anyway.
Fixed several issues in the index statistics
get_count(size_t position) counts the seed abundance starting from the argument position, so get_count is now called with the position of the first occurrence.
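To illustrate why the starting position matters, here is a small sketch under the assumption that hashes are kept in a sorted vector. The functions count_from and first_occurrence are hypothetical stand-ins that mimic the described behaviour; they are not the project's actual get_count or find.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts entries equal to hashes[position], scanning forward from `position` only.
// Entries with the same hash that lie before `position` are not counted.
std::size_t count_from(const std::vector<uint64_t>& hashes, std::size_t position) {
    assert(position < hashes.size());
    std::size_t end = position;
    while (end < hashes.size() && hashes[end] == hashes[position]) {
        ++end;
    }
    return end - position;
}

// Returns the index of the first entry equal to `hash` in a sorted vector,
// or hashes.size() if the hash is absent.
std::size_t first_occurrence(const std::vector<uint64_t>& hashes, uint64_t hash) {
    auto it = std::lower_bound(hashes.begin(), hashes.end(), hash);
    if (it == hashes.end() || *it != hash) {
        return hashes.size();
    }
    return static_cast<std::size_t>(it - hashes.begin());
}
```

With hashes {3, 3, 7, 7, 7, 9}, counting from the first occurrence of 7 (index 2) gives the full abundance, while counting from index 3 misses one entry.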