begeekmyfriend / bplustree

A minimal but extreme fast B+ tree indexing structure demo for billions of key-value storage
MIT License
1.86k stars 315 forks source link

how to do range search? help wanted #9

Closed xiaobogaga closed 5 years ago

xiaobogaga commented 5 years ago

there is a bplus_tree_get_range method there and it turns out this method only returns the first element of bplustree belong the range. However, this is not so great considering range query.

So how to iterator the results of range query, the interfaces which have been provided currently is not enough. How can do that? help wanted::::

begeekmyfriend commented 5 years ago

I did not consider this range search interface closely. But the implementation is quite simple. Since the keys strored in leaf nodes have been sorted, once you get the first element of the leaf node, you can traverse the next leaves until it touches the upside of the range given. The leaves traversed are what you need to query. Therefore it is an O(n) query algorithm.

xiaobogaga commented 5 years ago

Thanks for your provided solution. I found it has been shown how to do range search. the code following in bplus_tree_get_range commened with 'here is the iteration' is the iteration method.

long bplus_tree_get_range(struct bplus_tree *tree, key_t key1, key_t key2)
{
        long start = -1;
        key_t min = key1 <= key2 ? key1 : key2;
        key_t max = min == key1 ? key2 : key1;

        struct bplus_node *node = node_seek(tree, tree->root);
        while (node != NULL) {
                int i = key_binary_search(node, min);
                if (is_leaf(node)) {
                        if (i < 0) {
                                i = -i - 1;
                                if (i >= node->children) {
                                        node = node_seek(tree, node->next);
                                }
                        }
                       // here is the iteration.
                        while (node != NULL && key(node)[i] <= max) {
                                start = data(node)[i];
                                if (++i >= node->children) {
                                        node = node_seek(tree, node->next);
                                        i = 0;
                                }
                        }
                        break;
                } else {
                        if (i >= 0) {
                                node = node_seek(tree, sub(node)[i + 1]);
                        } else  {
                                i = -i - 1;
                                node = node_seek(tree, sub(node)[i]);
                        }
                }
        }

        return start;
}

I will close this issue.