FoundationDB / fdb-record-layer

A record-oriented store built on FoundationDB
Apache License 2.0

Mismatched boundary keys in FDBRecordStoreIndexTest::boundaryPrimaryKeys #483

Open alecgrieser opened 5 years ago

alecgrieser commented 5 years ago

At least one PR unrelated to the boundary keys work has failed due to an error in this test. The error encountered was:

com.apple.foundationdb.record.provider.foundationdb.FDBRecordStoreIndexTest > boundaryPrimaryKeys() FAILED
    org.opentest4j.AssertionFailedError: expected: <[((-975),(-897)), ((-897),(-819)), ((-819),(-741)), ((-741),(-663)), ((-663),(-585)), ((-585),(-507)), ((-507),(-429)), ((-429),(-351)), ((-351),(-273)), ((-273),(-195)), ((-195),(-117)), ((-117),(-39)), ((-39),(39)), ((39),(78)), ((78),(156)), ((156),(273)), ((273),(390)), ((390),(507)), ((507),(624)), ((624),(741)), ((741),(897)), ((897),(936))]> but was: <[((-975),(-819)), ((-819),(-663)), ((-663),(-507)), ((-507),(-351)), ((-351),(-195)), ((-195),(-39)), ((-39),(78)), ((78),(273)), ((273),(507)), ((507),(741)), ((741),(897)), ((897),(936))]>
        at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:52)
        at org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
        at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:186)
        at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:181)
        at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:486)
        at com.apple.foundationdb.record.provider.foundationdb.FDBRecordStoreIndexTest.checkSplitIndexBuildRange(FDBRecordStoreIndexTest.java:2216)
        at com.apple.foundationdb.record.provider.foundationdb.FDBRecordStoreIndexTest.boundaryPrimaryKeys(FDBRecordStoreIndexTest.java:2143)

The two lists are:

[((-975),(-897)), ((-897),(-819)), ((-819),(-741)), ((-741),(-663)), ((-663),(-585)), ((-585),(-507)), ((-507),(-429)), ((-429),(-351)), ((-351),(-273)), ((-273),(-195)), ((-195),(-117)), ((-117),(-39)), ((-39),(39)), ((39),(78)), ((78),(156)), ((156),(273)), ((273),(390)), ((390),(507)), ((507),(624)), ((624),(741)), ((741),(897)), ((897),(936))]
[((-975),(-819)), ((-819),(-663)), ((-663),(-507)), ((-507),(-351)), ((-351),(-195)), ((-195),(-39)), ((-39),(78)), ((78),(273)), ((273),(507)), ((507),(741)), ((741),(897)), ((897),(936))]

The exact failing test case is here:

https://github.com/FoundationDB/fdb-record-layer/blob/decb059c2e0d7b55c0f922a341ecd270ef334056/fdb-record-layer-core/src/test/java/com/apple/foundationdb/record/provider/foundationdb/FDBRecordStoreIndexTest.java#L2143

It's hard to say exactly what happened here, but I think the most likely scenario is that some amount of data movement, including shard splitting, occurred between the time the expected list was created and the time the final list was created. In particular, one will notice that every two splits in the first list are combined into a single split in the second list, except for the last few. If I'm reading the test correctly, the boundaryPrimaryKeysSize variable does not include the endpoints, so boundaryPrimaryKeysSize + 1 is the number of distinct shards in the range, i.e., the number of ways that the index build range will be split. (In this run, the expected list has 22 ranges, so boundaryPrimaryKeysSize is 21 and the maximum number of splits, maxSplit below, is 22.)

Now, suppose that the (741) to (897) shard is split, say at (800). Then look at the logic that splits the key range into boundaries:

https://github.com/FoundationDB/fdb-record-layer/blob/decb059c2e0d7b55c0f922a341ecd270ef334056/fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexer.java#L927-L938

Note that boundaries here does include the endpoints, so boundaries.size() - 1 is the number of unique shards in the key range. With the extra split point at (800), note that boundaries.size() - 1 == boundaryPrimaryKeysSize + 2 + 1 - 1 == (boundaryPrimaryKeysSize + 1) + 1 == maxSplit + 1. (The 2 comes from the endpoints and the 1 comes from the additional boundary key.) Then from line 927, stepSize == -Math.floorDiv(-(maxSplit + 1), maxSplit) == -(-2) == 2. This would then explain why each range in the retrieved list encompasses two of the original shards, except for the ((741),(897)) range (where the post-split shards ((741),(800)) and ((800),(897)) are combined back together) and the last range, which runs from the last shard boundary key to the end in both cases.
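To sanity-check that arithmetic, here is a minimal, runnable sketch of the splitting loop (a simplification using integer boundaries; the splitRanges helper and the class are mine, not the actual OnlineIndexer code). With 23 boundaries and maxSplit == 22 the step size is 1 and we get 22 ranges, but one extra boundary bumps the step size to 2 and the range count drops to 12, matching the actual list above:

```java
import java.util.ArrayList;
import java.util.List;

public class BoundarySplitSketch {
    // Simplified version of the loop linked above: boundaries includes both
    // endpoints, and the key range is cut into at most maxSplit contiguous
    // ranges by stepping through the boundary list.
    static List<String> splitRanges(List<Integer> boundaries, int maxSplit) {
        // -Math.floorDiv(-n, d) is ceiling division for positive n and d.
        int stepSize = -Math.floorDiv(-(boundaries.size() - 1), maxSplit);
        List<String> ranges = new ArrayList<>();
        for (int i = 0; i < boundaries.size() - 1; i += stepSize) {
            int end = Math.min(i + stepSize, boundaries.size() - 1);
            ranges.add("(" + boundaries.get(i) + "," + boundaries.get(end) + ")");
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 21 boundary keys plus the two endpoints: 23 boundaries in all, so
        // maxSplit == boundaryPrimaryKeysSize + 1 == 22, as in the test run.
        List<Integer> boundaries = new ArrayList<>();
        for (int k = 0; k <= 22; k++) {
            boundaries.add(k);
        }
        System.out.println(splitRanges(boundaries, 22).size()); // prints 22

        // One background shard split adds a boundary: stepSize becomes
        // -Math.floorDiv(-23, 22) == 2 and the range count halves to 12.
        boundaries.add(23);
        System.out.println(splitRanges(boundaries, 22).size()); // prints 12
    }
}
```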

This appears to be separate from #466, though that also provides an example of a way that this test could be flaky. I think there was a suggestion somewhere (perhaps living outside of this repo, as I couldn't find the appropriate issue) of mocking the boundary key API, both to avoid spending as much time as this test does loading data and to exercise weird code paths: what if there are no boundary keys, duplicate boundary keys, exactly one boundary, etc. I think mocking might be the most straightforward way to protect against this kind of race condition. It also seems unlikely to me that there is an actual problem with the logic in the OnlineIndexer to create these ranges, i.e., I think this is a testing-only bug.
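As a rough sketch of what such a mock could look like (the BoundarySource interface and its wiring are hypothetical; the real boundary keys come from FoundationDB's locality API, which the indexer would need some injectable seam like this to bypass):

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import com.apple.foundationdb.tuple.Tuple;

public class BoundaryKeyMockSketch {
    // Hypothetical seam around the boundary-key lookup so tests can control
    // the returned boundaries instead of reading them from a live cluster.
    interface BoundarySource {
        List<Tuple> getBoundaryKeys(Tuple begin, Tuple end);
    }

    public static void main(String[] args) {
        BoundarySource source = mock(BoundarySource.class);

        // Deterministic boundaries: no background data movement can change
        // these between the "expected" and "actual" reads.
        when(source.getBoundaryKeys(Tuple.from(-975L), Tuple.from(936L)))
                .thenReturn(Arrays.asList(Tuple.from(-39L), Tuple.from(507L)));

        // The weird code paths mentioned above become easy to set up:
        // no boundary keys at all ...
        when(source.getBoundaryKeys(Tuple.from(0L), Tuple.from(1L)))
                .thenReturn(Collections.emptyList());
        // ... or duplicate boundary keys.
        when(source.getBoundaryKeys(Tuple.from(1L), Tuple.from(2L)))
                .thenReturn(Arrays.asList(Tuple.from(1L), Tuple.from(1L)));

        // The stubbed source now returns the fixed lists above.
        System.out.println(source.getBoundaryKeys(Tuple.from(-975L), Tuple.from(936L)));
        System.out.println(source.getBoundaryKeys(Tuple.from(0L), Tuple.from(1L)));
    }
}
```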

alecgrieser commented 5 years ago

Oh, and #306 appears to be a PR that failed because of this test.