Closed frederic-mahe closed 1 year ago
For the fun of it, here is another version that is valid for the whole range of uint64 values (for the price of a branch, but performance is not paramount here):
#include <limits>
constexpr auto divide_into_bytes(unsigned long n_bits) -> unsigned long {
constexpr auto n_bits_in_a_byte = 8;
if (n_bits == 0) { return 0; }
return ((n_bits - 1) / n_bits_in_a_byte) + 1;
}
static_assert(divide_into_bytes(0) == 0, "Error");
static_assert(divide_into_bytes(1) == 1, "Error");
static_assert(divide_into_bytes(7) == 1, "Error");
static_assert(divide_into_bytes(8) == 1, "Error");
static_assert(divide_into_bytes(9) == 2, "Error");
static_assert(divide_into_bytes(16) == 2, "Error");
static_assert(divide_into_bytes(17) == 3, "Error");
static_assert(divide_into_bytes(8 * 10) == 10, "Error");
static_assert(divide_into_bytes(8 * 10 + 1) == 11, "Error");
static_assert(divide_into_bytes(std::numeric_limits<unsigned long>::max()) == 2'305'843'009'213'693'952, "Error");
// 2305843009213693952 = 2^64 / 2^3 = 2^61
Or even simpler:
constexpr auto divide_into_bytes(unsigned long n_bits) -> unsigned long {
constexpr auto n_bits_in_a_byte = 8;
return (n_bits + n_bits_in_a_byte - 1) / n_bits_in_a_byte;
}
yes, but it overflows when n_bits
is greater than ULONGMAX - 7
, hence the assert(n_bits <= std::numeric_limits<unsigned long>::max() - (n_bits_in_a_byte - 1));
in my first version.
Yes, you're right.
I'll try to fix it today.
I am still not sure we need a function, but if we do, we can turn it into a template and use it with different denominators:
#include <limits>
// correct for all values of numerator (0 to max())
// correct for all values of Denominator (1 to max())
// todo: use concepts to generalize for any unsigned int
template<unsigned long Denominator = 8>
constexpr unsigned long divide_int_round_up(unsigned long numerator) {
if (numerator == 0) { return 0; }
return ((numerator - 1) / Denominator) + 1;
}
static_assert(divide_int_round_up(0) == 0, "Error");
static_assert(divide_int_round_up(16 - 1) == 2, "Error");
static_assert(divide_int_round_up(16) == 2, "Error");
static_assert(divide_int_round_up(16 + 1) == 3, "Error");
static_assert(divide_int_round_up<8>(0) == 0, "Error");
static_assert(divide_int_round_up<8>(0) == 0, "Error");
static_assert(divide_int_round_up<8>(0) == 0, "Error");
static_assert(divide_int_round_up<8>(8 * 10) == 10, "Error");
static_assert(divide_int_round_up<8>(std::numeric_limits<unsigned long>::max()) == 2'305'843'009'213'693'952, "Error");
static_assert(divide_int_round_up<1>(std::numeric_limits<unsigned long>::max()) == std::numeric_limits<unsigned long>::max(), "Error");
static_assert(divide_int_round_up<64>(0) == 0, "Error");
static_assert(divide_int_round_up<64>(64 * 10) == 10, "Error");
static_assert(divide_int_round_up<64>(64 * 10 + 1) == 11, "Error");
static_assert(divide_int_round_up<64>(64 * 10 - 1) == 10, "Error");
Here is a C++11 version:
template<unsigned long Denominator = 8>
unsigned long divide_int_round_up(unsigned long numerator) {
static_assert(Denominator != 0, "Denominator can't be null");
if (numerator == 0) { return 0; }
return ((numerator - 1) / Denominator) + 1;
}
that can be tested on compiler explorer+%2B+1%3B%0A%7D%0A%0Aint+main+()+%7B%0A++++return+divide_int_round_up%3C64%3E(16)%3B%0A%7D'),l:'5',n:'0',o:'C%2B%2B+source+%231',t:'0')),k:50,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((g:!((h:compiler,i:(compiler:g122,deviceViewOpen:'1',filters:(b:'0',binary:'1',commentOnly:'0',demangle:'0',directives:'0',execute:'1',intel:'0',libraryCode:'0',trim:'1'),flagsViewOpen:'1',fontScale:14,fontUsePx:'0',j:1,lang:c%2B%2B,libs:!(),options:'-O3+-std%3Dc%2B%2B11+-Wall+-Wextra+-g+-Wshadow+-Wnon-virtual-dtor+-Wold-style-cast+-Wcast-align+-Wunused+-Woverloaded-virtual+-Wpedantic+-Wconversion+-Wsign-conversion+-Wmisleading-indentation+-Wduplicated-cond+-Wduplicated-branches+-Wlogical-op+-Wnull-dereference+-Wuseless-cast+-Wdouble-promotion+-Wformat%3D2',selection:(endColumn:12,endLineNumber:3,positionColumn:12,positionLineNumber:3,selectionStartColumn:12,selectionStartLineNumber:3,startColumn:12,startLineNumber:3),source:1),l:'5',n:'0',o:'+x86-64+gcc+12.2+(Editor+%231)',t:'0')),k:33.333333333333336,l:'4',m:36.809815950920246,n:'0',o:'',s:0,t:'0'),(g:!((h:output,i:(compilerName:'x86-64+gcc+12.2',editorid:1,fontScale:14,fontUsePx:'0',j:1,wrap:'1'),l:'5',n:'0',o:'Output+of+x86-64+gcc+12.2+(Compiler+%231)',t:'0')),header:(),l:'4',m:38.190184049079754,n:'0',o:'',s:0,t:'0'),(g:!((h:executor,i:(argsPanelShown:'1',compilationPanelShown:'0',compiler:g122,compilerOutShown:'0',execArgs:'',execStdin:'',fontScale:14,fontUsePx:'0',j:1,lang:c%2B%2B,libs:!(),options:'',source:1,stdinPanelShown:'1',tree:'1',wrap:'1'),l:'5',n:'0',o:'Executor+x86-64+gcc+12.2+(C%2B%2B,+Editor+%231)',t:'0')),header:(),l:'4',m:25,n:'0',o:'',s:0,t:'0')),k:50,l:'3',n:'0',o:'',t:'0')),l:'2',m:100,n:'0',o:'',t:'0')),version:4)
Fixed in dev now.
The exact size of the Bloom filter is not important as long as we do not access memory outside it. The Bloom filter is just to make the program a bit faster by using some extra memory. If the filter is a bit smaller than intended it has only a negligible effect on the speed. The logic and the results are not affected.
Thanks @torognes ! I'll launch the tests tonight. Once they are done, we can make a new release.
Here is a draft changelog for release v3.1.3:
d
> 1 (bug had no impact on clustering results),readability-braces-around-statements
),(please feel free to modify it as you see fit)
Sounds good. I hope the tests run fine!
hello @torognes !
Test results are good, memory usage is back to normal. I've added a series of tests to check that swarm's results stay strictly identical (same hash value) from one release to another, and when switching compiler. I was already doing something similar, but now it is automatic and systematic.
Please merge from dev and make a new release, at your convenience. Here is a draft changelog for release v3.1.3:
d
> 1 (bug had no impact on clustering results),readability-braces-around-statements
),(please feel free to modify it as you see fit)
Ok, good! Version 3.1.3 has been released.
Thanks @torognes !
I have merged all changes into both master and dev.
In this call to
bloomflex_init()
:https://github.com/torognes/swarm/blob/e47c80ef5a68f93750a0eab0b48c0764ed0a189b/src/algod1.cc#L1335
The first parameter (
m / sizeof(uint64_t)
) is of typeuint64_t
and represents a number of bytes.m
is a number of bits, and then must by divided by 8 to get a number of bytes. However,m
is:bits
can take any value between 2 and 64;microvariants
is a constant value of 7;nucleotides_in_small_otus
can take any value starting from 1. Therefore,m
can be any value, starting from2 x 7 = 14
.Since
m
will be used bybloomflex_init()
to allocatem / sizeof(uint64_t)
bytes, then isn't that an issue to use an integer division?For instance, If
m
is 31 bits, then a 3-byte bitmap (31 / 8
) will be allocated, where in fact 4 bytes would be necessary to hold the 31bits.A possible fix:
or in C++14: