Open keijokapp opened 1 month ago
Thanks! It looks like I can't merge it because required statuses haven't finished for some reason. @jzhou77 can you advise please?
Thanks! It looks like I can't merge it because required statuses haven't finished for some reason. @jzhou77 can you advise please?
I closed and reopened this PR to kick the CI.
Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
Looks like code format issues. Please use clang-format
version 15 to fix the issue.
Would it be okay to squash the commits together? Would be less noise.
Yup, go ahead
On Thu, May 30, 2024 at 1:11 PM Keijo Kapp @.***> wrote:
Would it be okay to squash the commits together? Would be less noise.
— Reply to this email directly, view it on GitHub https://github.com/apple/foundationdb/pull/11424#issuecomment-2140789726, or unsubscribe https://github.com/notifications/unsubscribe-auth/AARO7B3CD5E3AFSRH657GO3ZE6BWXAVCNFSM6AAAAABIETZHNOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBQG44DSNZSGY . You are receiving this because you commented.Message ID: @.***>
N/A
N/A
Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
N/A
N/A
N/A
N/A
N/A
Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
N/A
Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
Amended with the following patch. Otherwise that key would be outside the overall range, making this specific key unreadable.
diff --git a/fdbclient/ReadYourWrites.actor.cpp b/fdbclient/ReadYourWrites.actor.cpp
index 070c3456f..25ffe8cf6 100644
--- a/fdbclient/ReadYourWrites.actor.cpp
+++ b/fdbclient/ReadYourWrites.actor.cpp
@@ -2273,7 +2273,7 @@ void ReadYourWritesTransaction::atomicOp(const KeyRef& key, const ValueRef& oper
// k is the unversionstamped key provided by the user. If we've filled in a minimum bound
// for the versionstamp, we need to make sure that's reflected when we insert it into the
// WriteMap below.
- transformVersionstampKey(k, tr.getCachedReadVersion().orDefault(0), 0);
+ transformVersionstampKey(k, tr.getCachedReadVersion().map([](Version v) { return v + 1; }).orDefault(0), 0);
}
if (operationType == MutationRef::SetVersionstampedValue) {
Nice catch. Did a simulation test find it? The simulation tests that failed in the CI run didn't seem directly related, and my not-very-confident understanding is that this should only show up as a key unexpectedly unreadable within the same transaction, so I'm kind of confused.
Found it by manual white-box testing while trying to solve an another (slightly related) issue.
Since I failed to find a solution to that other problem, it might be good to outline it here.
Applying an arbitrary default value to some state (ie tr.getCachedReadVersion().orDefault(0)
) makes me naturally cautious. If the special case of applying the default value isn't balanced out by some special case handling later, then it's likely a bug (or an abstraction leak). In this case, there isn't any special handling later to fix those 0
values. The abstraction leak is that the behavior is different depending on whether or not the read version is known the time of setting the version stamped key. (Example below.)
I'm not too familiar with the code base. When the read version becomes known (TransactionState::readVersionFuture
gets resolved), the implementation should go over SetVersionstampedKey
mutations and adjust the unreadable ranges accordingly in the WriteMap
. But retrieving a read version is abstracted away from ReadYourWritesTransaction
. I guess callling atomicOp(SetVersionstampedKey)
could set a flag on ReadYourWritesTransaction
when the ranges need to be adjusted so the next read call can do that.
Or alternatively, maybe entries without a known key range should not be inserted to WriteMap
in the first place. It honestly goes over my head. Like what happens if the same version stamped key is set before knowing the read version and after knowning a read version? Will there be two entries in the WriteMap - one with 0
and another with current read version?
Example:
// ok
test('setting a version stamped key before knowing the read version', async () => {
await db.doTn(async tn => {
await tn.getReadVersion();
tn.setVersionstampedKeyRaw(Buffer.alloc(14), '');
await tn.get(Buffer.alloc(14));
});
});
// fails with "Read or wrote an unreadable key"
test('setting a version stamped key without knowing the read version', async () => {
await db.doTn(async tn => {
tn.setVersionstampedKeyRaw(Buffer.alloc(14), '');
await tn.get(Buffer.alloc(14));
});
});
The abstraction leak is that the behavior is different depending on whether or not the read version is known the time of setting the version stamped key.
You're totally right. The semantics of this feature are quite weak - looks like the property tested in FuzzApiCorrectnessWorkload is just "it's always legal for a read to throw error_code_accessed_unreadable".
I think what we need is a (more) precise reference model for RYWTransaction behavior, and lighter-weight fuzz testing using that model. Currently the tests bring up a simulated fdb cluster, which limits fuzzing throughput. This would be a significant amount of work, so I don't want to block this PR on that.
The de facto workaround for this I believe is to insert entities with a versionstamped key in their own small transaction, and then interact with it in a followup transaction. Not very pretty.
Actually, I think this might be close to what I was picturing: https://github.com/apple/foundationdb/blob/3c4a585fb0ee62d17eba2284242c470167c544d6/fdbserver/workloads/Unreadable.actor.cpp#L354.
N/A
N/A
N/A
Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
N/A
N/A
Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
As explained in this forum post, the range that gets marked as unreadable should start from the next possible read version, not the current one. That's because the final version stamp can't possibly start with the current read version. The
+1
should be added to the read version if it's known.The test script (JS):
Code-Reviewer Section
The general pull request guidelines can be found here.
Please check each of the following things and check all boxes before accepting a PR.
For Release-Branches
If this PR is made against a release-branch, please also check the following:
release-branch
ormain
if this is the youngest branch)