I also have an idea for a follow-up (mainly intended for the core team):
I think we discussed at some point how we should make enumerating the bins accessible to the user. At that time it wasn't really clear to us how one could approach this.
I tried to implement a bit_index_view that returns the positions at which a bit is 1. It turned out that this produced DOUBLE the amount of asm compared to the current nested for loop (60 instructions instead of 30). Furthermore, the implementation needs at least 80 additional LOC just to make it an iterator. (I already knew that you pay more when flattening a nested for loop, but effectively doubling the runtime?!)
Therefore, I would suggest offering the following API function:
This is a bit unconventional in the sense that we pass in a callback, where normally we would prefer a view, but as I stated above a view won't be as performant. A callback behaves similarly to a view in that we iterate over all possible values on the fly, but with the drawback that the caller has no control over the iteration (stopping/starting at a certain element, etc.).
Another approach would be to return a std::vector<size_t>, but this is undesirable here because the call sits in the hot loop and any additional heap allocation should be avoided.
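To make the callback idea concrete, here is a minimal sketch of what such a function could look like. The name for_each_set_bit, the std::vector<uint64_t> storage and the 64-bit word layout are assumptions made for illustration and are not taken from SeqAn3. It also uses the GCC/Clang builtin __builtin_ctzll; with C++20, std::countr_zero from <bit> would be the portable equivalent.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not SeqAn3 API): calls `callback(position)` once for
// every bit that is set to 1, in ascending order of position.
// Assumes the bits are stored in 64-bit words, lowest bit first.
template <typename callback_t>
void for_each_set_bit(std::vector<uint64_t> const & words, callback_t && callback)
{
    // Outer loop: one 64-bit word at a time.
    for (size_t word_index = 0; word_index < words.size(); ++word_index)
    {
        uint64_t word = words[word_index];

        // Inner loop: visit only the set bits of this word.
        while (word != 0u)
        {
            // Index of the lowest set bit (GCC/Clang builtin; `word` is non-zero here).
            size_t const offset = static_cast<size_t>(__builtin_ctzll(word));
            callback(word_index * 64u + offset);
            word &= word - 1u; // clear the lowest set bit
        }
    }
}
```

A caller would pass a lambda, e.g. for_each_set_bit(words, [&](size_t bin) { ++counts[bin]; });, so nothing is allocated per call and the nested loop stays intact. As noted above, the caller cannot stop or resume the iteration in this form.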
Taken from https://github.com/seqan/seqan3/pull/2930