Closed takoyaki65 closed 1 year ago
Dear takoyaki65, thank you for reporting this issue! I think you are correct, and this code should also update hitLeft
. We will try to address this in the next few weeks (it requires rerunning the regression tests).
-Martin
Dear @hirzel, thank you for your response and for looking into this issue. I appreciate your help! However, to make sure that my understanding is correct, I would like to try testing and verifying the fix myself. Thank you again for your support! -Mizokami
I added the *hitLeft |= sibling->leftSpine();
line and tested the code again, and all the tests passed (although this is expected, as the only change was to repair the aggregation value more often). I will continue to investigate if I can find any test cases where the previous code would produce incorrect aggregation results.
tk65:~/documents/sliding-window-aggregators/cpp$ ./build-and-test.sh g++ --version > bin/log.txt 2>&1 g++ -std=gnu++17 -DHAS_UNCAUGHT_EXCEPTIONS=1 -O3 -DNDEBUG -static src/benchmark_driver.cc -o bin/benchmark_driver g++ -std=gnu++17 -DHAS_UNCAUGHT_EXCEPTIONS=1 -O3 -DNDEBUG -static -DCOLLECT_STATS src/benchmark_driver.cc -o bin/benchmark_driver_stats g++ -std=gnu++17 -DHAS_UNCAUGHT_EXCEPTIONS=1 -O3 -DNDEBUG -static src/ooo_benchmark_driver.cc -o bin/ooo_benchmark_driver g++ -std=gnu++17 -DHAS_UNCAUGHT_EXCEPTIONS=1 -O3 -DNDEBUG -static src/data_benchmark.cc -o bin/data_benchmark g++ -std=gnu++17 -DHAS_UNCAUGHT_EXCEPTIONS=1 -ggdb src/test.cc -o bin/test g++ -std=gnu++17 -DHAS_UNCAUGHT_EXCEPTIONS=1 -O3 -DNDEBUG -static src/dynamic_benchmark_driver.cc -o bin/dynamic_benchmark_driver g++ -std=gnu++17 -DHAS_UNCAUGHT_EXCEPTIONS=1 -O3 -DNDEBUG -static src/bulk_evict_benchmark_driver.cc -o bin/bulk_evict_benchmark_driver g++ -std=gnu++17 -DHAS_UNCAUGHT_EXCEPTIONS=1 -O3 -DNDEBUG -static src/bulk_evict_insert_benchmark_driver.cc -o bin/bulk_evict_insert_benchmark_driver cd bin; zip -D benchmark_bin.zip * updating: benchmark_driver (deflated 67%) updating: benchmark_driver_stats (deflated 67%) updating: bulk_evict_benchmark_driver (deflated 65%) updating: bulk_evict_insert_benchmark_driver (deflated 64%) updating: data_benchmark (deflated 65%) updating: dynamic_benchmark_driver (deflated 65%) updating: log.txt (deflated 20%) updating: ooo_benchmark_driver (deflated 65%) updating: test (deflated 78%) sum passed mean passed stddev passed collect passed sawtooth:mincount passed sawtooth:collect passed thirds:mincount passed thirds:collect passed mincount passed argmax passed max passed timestamped-FIFO mincount passed timestamped-FIFO sum passed range query sum passed non-FIFO collect wrt. naive walk passed non-FIFO bloom wrt. naive walk passed non-FIFO with set passed non-FIFO sum with map passed
Thanks for testing this! Would you like to do a PR for it?
I apologize for your inconvenience. I am currently preoccupied with personal matters, such as preparing for my graduate school entrance interviews, seminar presentations, and final exams, so it seems unlikely that I can immediately fulfill your request. However, I am planning to create a counterexample case and include it in a pull request that I aim to send by mid-July.
Sounds good! No rush.
Hello, I have some free time now and I'll start working on the pull request.
I tested btree::Aggregate
with Sum<int>
aggregation function(from AggregationFunctions.hpp
) and min_arity=2
(minumum child node of non-leaf node is 2).
typedef uint64_t timestamp;
const int min_arity = 2;
int main() {
auto fiba_btree =
btree::MakeAggregate<Sum<int>, timestamp, min_arity, btree::finger>()(0);
I tested it by simple example. First, I insert (timestamp, value) = (1, 1), (2, 2), (3, 3), (4, 4)
records into btree::Aggregate
instance.
const int end = 4;
// insert (timestamp, value) = (1, 1), (2, 2), ..., (end, end) into fiba_btree
for (int i = 1; i <= end; ++i) {
std::cout << "insert (" << i << ", " << i << ")...." << std::endl;
fiba_btree.insert(i, i);
std::cout << fiba_btree << std::endl;
}
After that, I evict most recent record in btree::Aggregate
one by one, and check whether aggregation value is calculated correctly.
// aggregation_result: 1 + 2 + ... + end
int aggregation_result = (end + 1) * end / 2;
for (int i = end; i >= 1; --i) {
// check if the aggregation result is correct
int result = fiba_btree.query();
if (result != aggregation_result) {
std::cout << "aggregation result is wrong: " << aggregation_result
<< "(expected) " << result << "(actual)." << std::endl;
} else {
std::cout << "[";
for (int j = 1; j <= i; ++j)
std::cout << j << (j == i ? "" : ", ");
std::cout << "] is aggregated correctly" << std::endl;
}
std::cout << "evict " << i << "/" << i << "... " << std::endl;
fiba_btree.evict(i);
std::cout << fiba_btree << std::endl;
aggregation_result -= i;
}
I executed this test actually. and The result is bellow:
insert (1, 1)....
(1/1)/ 1 root leaf
insert (2, 2)....
(1/1 2/2)/ 3 root leaf
insert (3, 3)....
(1/1 2/2 3/3)/ 6 root leaf
insert (4, 4)....
(
(1/1 2/2)/ 3 leaf left-spine
3/3
(4/4)/ 4 leaf right-spine
)/ 3 root
[1, 2, 3, 4] is aggregated correctly
evict 4/4...
(
(1/1)/ 3 leaf left-spine
2/2
(3/3)/ 3 leaf right-spine
)/ 2 root
FiBA.hpp:1693: FAILED(partial aggregation value is not correct)
(1/1)/ 3 leaf left-spine
terminate called after throwing an instance of 'int'
Aborted
After each modification operation, tree structure is printed. (...)/3
mean one node with aggregation value 3
, 1/2
mean a record with timestamp 1 and value 2,
We can see that after evicting a (4, 4)
record, the rightmost leaf node becomes underflow and rebalance is occurred. But partial aggregation result of leftmost leaf node (that is leftspine) is not fixed.
This error is caused because btree::aggregate
does not repair borrowed siblings aggregation value. So I add 1 line in
Node rebalanceAfterEvict(Node node, bool hitLeft, bool hitRight, Node* toRepair=NULL)` (on line 832 of FiBA.hpp)
After that, I tried this test one more and confirmed that `btree::aggregate`` passed this test correctly.
tk65@DESKTOP-A32HI1Q:~/documents/forked/sliding-window-aggregators/cpp/src$ ./a.out
insert (1, 1)....
(1/1)/ 1 root leaf
insert (2, 2)....
(1/1 2/2)/ 3 root leaf
insert (3, 3)....
(1/1 2/2 3/3)/ 6 root leaf
insert (4, 4)....
(
(1/1 2/2)/ 3 leaf left-spine
3/3
(4/4)/ 4 leaf right-spine
)/ 3 root
[1, 2, 3, 4] is aggregated correctly
evict 4/4...
(
(1/1)/ 1 leaf left-spine
2/2
(3/3)/ 3 leaf right-spine
)/ 2 root
(
(1/1)/ 1 leaf left-spine
2/2
(3/3)/ 3 leaf right-spine
)/ 2 root
[1, 2, 3] is aggregated correctly
evict 3/3...
(1/1 2/2)/ 3 root leaf
(1/1 2/2)/ 3 root leaf
[1, 2] is aggregated correctly
evict 2/2...
(1/1)/ 1 root leaf
(1/1)/ 1 root leaf
[1] is aggregated correctly
evict 1/1...
()/ 0 root leaf
()/ 0 root leaf
# of combine calls: 40
# of nodes repaired: 13
# of repair calls involving a spine: 3
# of repair calls: 8
avg root degree when repairs were made: 2.375
I don't know what I should add a test in test.cc
, because out-of-order insertion/eviction test is implemented and checked by void test_non_fifo_set(uint64_t iterations, uint64_t window_size)
in cpp/src/test.cc
.
So, I will only add *hitLeft |= sibling->leftSpine();
in Node* rebalanceAfterEvict()
(in line 842 on cpp/src/FIBA.hpp
) to my pull request.
I had initially committed directly to the master branch and attempted to create a pull request from there. However, I have now cancelled that pull request and instead made the commits on a separate branch derived from master, from which I will make the new pull request.
Hello,
I was reading the paper "Optimal and General Out-of-Order Sliding-Window Aggregation" and took a look at the FiBA implementation code available on GitHub. While inspecting the code, I came across the function
Node* rebalanceAfterEvict(Node* node, bool* hitLeft, bool* hitRight, Node* toRepair=NULL)
on line 832 of FiBA.hpp.https://github.com/IBM/sliding-window-aggregators/blob/f62d3cf88fbed83957f27d690060a7547936db4f/cpp/src/FiBA.hpp#L841
I noticed that on that line, only the
hitRight
information is being updated, while thehitLeft
information is not being updated. This seems to be a problem, because when rebalancing a node that contains a tuple like(st, 20, t)
(which is stored in the right child of the root node), the sibling node would become the left spine, and therefore its aggregated value would also need to be updated.Could you please explain why only
hitRight
is being updated, and nothitLeft
? Am I missing something here?Thank you in advance for your help!