caetanosauer / zero

Fork of the Shore-MT storage manager used by the research project Instant Recovery
Other
29 stars 10 forks source link

Assertion failure when using btree_populate_records(). #11

Closed llersch closed 9 years ago

llersch commented 9 years ago

When using btree_populate_records(...) for tests, there is an assertion failure at btree_page_h.cpp:1625, the method is: suggest_fence_for_split(...).

The assertion that fails is at the end of the method:

w_assert0(sep_key != NULL);

Inside suggest_fence_for_split(...), there are the following lines:

    // first, pick the center point according to the skewness of past insertions to this page.
    slotid_t center_point = (nrecs() / 2); // usually just in the middle
    if (is_insertion_skewed_right()) {
        // last 5 inserts were on right-most, so let split on right-skewed point (90%)
        center_point = (nrecs() * 9 / 10);
    } else if (is_insertion_skewed_left()) {
        // last 5 inserts were on left-most, so let split on left-skewed point (10%)
        center_point = (nrecs() * 1 / 10);
    }

btree_populate_records(...) inserts new key-value pairs in descending order. For some reason, at some point the lines above are executed with nrecs() = 4 and is_insertion_skewed_left() = true. This causes center_point = 0

With center_point = 0, the execution never goes through the for-loop in the following code of suggest_fence_for_split():

// second, consider boundaries around the center point to pick the shortest separator key
    slotid_t start_point = (center_point - (nrecs() / 10) > 0) ? center_point - (nrecs() / 10) : 1;
    slotid_t end_point = (center_point + (nrecs() / 10) + 1 <= nrecs()) ? center_point + (nrecs() / 10) + 1 : nrecs();
    size_t sep_length = SM_PAGESIZE;
    const char *sep_key = NULL;
    right_begins_from = -1;
    for (slotid_t boundary = start_point; boundary < end_point; ++boundary) {
        if (is_leaf()) {
            // if we are splitting a leaf page, we are effectively designing a new separator key
            // which will be pushed up to parent. actually, this "mid" fence key is re-used
            // when we adopt a separator key later.
            size_t len1, len2;
            const char* k1 = _leaf_key_noprefix (boundary - 1, len1);
            const char* k2 = _leaf_key_noprefix (boundary, len2);
            // we apply suffix truncation here. We want a short separator key such that
            //   k1 < newkey <= k2.
            // For example, let k1=FEAR, k2=FFBC. new key can be
            //   FF, FEB, FFBC but not FFC or FEAR.
            // the newkey should send entries exclusively smaller than it to left,
            // inclusively larger than it to right. (remember, low fence key is inclusive (>=))

            // take common leading bytes +1 from right.
            // in above example, "FF". (common leading bytes=1)
            size_t common_bytes = w_keystr_t::common_leading_bytes((const unsigned char *) k1, len1, (const unsigned char *) k2, len2);
            w_assert1(common_bytes < len2); // otherwise the two keys are the same.
            // Note, we assume unique indexes, so the two keys are always different
            if (common_bytes + 1 < sep_length
                || (common_bytes + 1 == sep_length && boundary == center_point)) { // if tie, give a credit to center_point
                right_begins_from = boundary;
                sep_length = common_bytes + 1;
                sep_key = k2;
            }
        } else {
            // for interior node, just return the existing key (we can shorten it though).
            size_t len;
            const char *k = _node_key_noprefix (boundary, len);
            if (len < sep_length
                || (len == sep_length && boundary == center_point)) { // if tie, give a credit to center_point
                right_begins_from = boundary;
                sep_length = len;
                sep_key = k;
            }
        }
    }

In other words, sep_key value is initialized as NULL and never changed, causing the assertion to failure right after this for-loop.

Work-around: by changing btree_populate_records(...) to insert records in ascending order, this assertion failure does not occur.

caetanosauer commented 9 years ago

Hi Lucas,

could you try changing it to:

    // first, pick the center point according to the skewness of past insertions to this page.
    slotid_t center_point = (nrecs() / 2); // usually just in the middle
    if (nrecs() > 10) {
        if (is_insertion_skewed_right()) {
            // last 5 inserts were on right-most, so let split on right-skewed point (90%)
            center_point = (nrecs() * 9 / 10);
        } else if (is_insertion_skewed_left()) {
            // last 5 inserts were on left-most, so let split on left-skewed point (10%)
            center_point = (nrecs() * 1 / 10);
        }
    }

I.e., only consider skew if there are more than 10 records, since the skew points are picked in multiples of 10 (10% and 90%).