boostorg / histogram

Fast multi-dimensional generalized histogram with convenient interface for C++14
Boost Software License 1.0
313 stars 74 forks source link

Eytzinger search #376

Closed jhcarl0814 closed 1 year ago

jhcarl0814 commented 1 year ago

design decisions (very very questionable):

  1. vec_ is outside of sorted_array_and_binary_search and eytzinger_layout_and_eytzinger_binary_search to make it easier for outsiders to get vec_ without knowing the type of data_structure_ used
  2. so rebuilding the data_structure_ is needed whenever vec_ constructs or changes (forgetting to add copy constructor, move constructor, copy assignment operator and move assignment operator causes hard-to-track bugs), and
  3. sorted_array_and_binary_search and eytzinger_layout_and_eytzinger_binary_search each gets an extra constructor and an extra assign to make variable's move constructor and move assignment operator noexcept.
    • sorted_array_and_binary_search's does re-construction from vec_ and ignores rhs because it's shallow;
    • eytzinger_layout_and_eytzinger_binary_search's does std::move from rhs and ignores vec_ because it has state and the function needs to be noexcept.
HDembinski commented 1 year ago

Thanks for the patch. This changes a lot of things that are unrelated to the feature. Please split this patch and put the cosmetic changes (wrapping min/max etc) in another PR.

HDembinski commented 1 year ago

Don't make the data structure a template argument. It is fine to replace the standard implementation, because the speed is essentially the same and the additional memory overhead is small. The core logic of the eytzinger search should be implemented in a seperate struct or class, as it is now, so that the core algorithm can be tested in isolation (and potentially reused).

Perhaps the vector should be merged into our eytzinger class, although this may have unwanted side effects for the design.

jhcarl0814 commented 1 year ago

@HDembinski Thank you for simplifying all this! Now I can get rid of the four special members and the extra ctor and extra assign.

jhcarl0814 commented 1 year ago

The cosmetic changes (wrapping min/max etc) are split into https://github.com/boostorg/histogram/pull/377 .

HDembinski commented 1 year ago

vec_ is outside of sorted_array_and_binary_search and eytzinger_layout_and_eytzinger_binary_search to make it easier for outsiders to get vec_ without knowing the type of data_structure_ used

We don't allow direct access to the array, so it is fine if we internally change the data structure from vector_type to something else. However, users of variable must be able to look up a bin by index resonably efficiently. This happens through

auto bin(index_type idx) const noexcept { return interval_view<variable>(*this, idx); }

A generic iterator and begin() and end() are generated through mixin classes, which call this interface.

In most cases, users will iterate over the values in their original order, so if we need to make performance trade-offs, this is the case we should focus on. Ideally, we keep only the eytzinger-sorted array and not the original array.

jhcarl0814 commented 1 year ago

@HDembinski According to https://www.hyrumslaw.com/ the users of auto bin(index_type idx) might have already depended on it being O(1). If we remove the original vec_ and modify value_type value(real_index_type i) to use i to locate the bin in the eytzinger complete binary tree, the time complexity becomes O(lg(n)) and will break the previously not documented "interface".

HDembinski commented 1 year ago

@HDembinski According to https://www.hyrumslaw.com/ the users of auto bin(index_type idx) might have already depended on it being O(1). If we remove the original vec_ and modify value_type value(real_index_type i) to use i to locate the bin in the eytzinger complete binary tree, the time complexity becomes O(lg(n)) and will break the previously not documented "interface".

That's all true, however: 1) We do not consider that a breaking change in Boost. Breaking changes are those which make user code not compile anymore and we do not violate any explicit guarantee given in the documentation. The complexity of bin value look-up was not defined in the documentation, which means it is implementation defined and can change at any time. 2) For the small N that we deal with here, the difference between O(1) and O(log(N)) should not be dramatic. 3) In the typical use of a histogram, you need bin index look-up to be very fast because that's performed in a hot loop, but bin value look-up does not happen in a hot loop.

I am more concerned about the increase in memory from holding two arrays. We already need two arrays instead of one in the current implementation, to hold the eytzinger-sorted values and the indices. Adding yet another array makes the implementation unfavorable in my view. The performance increase of eytzinger is not so dramatic that tripling the amount of required memory seems worth it.

If the cons outweigh the pros, I would abandon this change.

HDembinski commented 1 year ago

@jhcarl0814 I have been thinking about this change and after all, the costs do seem to outweigh the benefits. Eytzinger users more storage, iteration over the axis will become slower, and the benefits are only substantial if you make a very large axis that we do not encounter in real life. In many cases, the axis should fit into the L1 cache, and then the Eytzinger layout should not provide any benefits.

jhcarl0814 commented 1 year ago

@HDembinski You are right.

I should have done more study before suggesting this thing. Sorry for taking up your time.

HDembinski commented 1 year ago

I should have done more study before suggesting this thing. Sorry for taking up your time.

No need to apologize. I could equally apologize to you, since I asked you to prepare a PR too early. We could have figured out these caveats without the PR. Thank you anyway, this was very interesting for me.