Open mgravell opened 5 months ago
As you asked in the other thread, as soon as the cache transitions to disk, a call to e.g. RMWAsync, does not return a completed ValueTask, but a pending one, that's why I added the ForcedDisk tests, e.g. if you put a breakpoint into the Awaited
for RMWAsync method and execute the test SlidingExpirationForceDiskAsync
you see, that it's getting triggered, this basically means that the Async and Sync methods are not equivalent, as the entries which are forced to disk will not return synchronously, but asynchronously. A bit of looking through the tsavorite code, the reason appears to be WaitForFlushOrIOCompletionAsync
which kinda makes sense that this is async.
with this branch, I can see numbers simply in the tests; normal tests (non force-disk), typical:
Hit: 3
Miss: 2 (of which 0 were expired)
Sync: 7
Async: 0
Copy-Update: 0
Other: 2
Fault: 0
force disk (sync):
Hit: 42
Miss: 2 (of which 1 were expired)
Sync: 61
Async: 0
Copy-Update: 18
Other: 17
Fault: 0
force disk (async):
Hit: 42
Miss: 2 (of which 1 were expired)
Sync: 45
Async: 16
Copy-Update: 16
Other: 17
Fault: 0
so I think it is working correctly with the code in this branch...
Maybe I'm missing something here. I'm not arguing that the sync approach is not working, I'm arguing the point that the async api does async work on the disk transition and the sync api does the "same" work sync and if you have a purely async storage mechanism (Faster has Azure blob storage Backend pkg), it would now be sync with all the usual downsides. This might get worse, if tsavorite ever gets to use the new RandomAccess Apis.
To really understand the differences, tsavorite would need their own Event Counters in the individual async/sync paths for iocompletion.
I do not claim much knowledge about anything over the boundary; I thought that was the entire point of IsPending
, so: as long as we don't wait synchronously, we're golden. Have I misunderstood? context: https://github.com/microsoft/FASTER/issues/907#issuecomment-2028893336
I don't claim to understand much of it either, what I currently see: The different cases:
The first 4 cases make sense from the async codepath.
The Sync Code paths behave differently:
[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal Status ContextRead<Input, Output, Context, TsavoriteSession>(ref Key key, ref Input input, ref Output output, Context context, TsavoriteSession tsavoriteSession, long serialNo)
where TsavoriteSession : ITsavoriteSession<Key, Value, Input, Output, Context>
{
var pcontext = new PendingContext<Input, Output, Context>(tsavoriteSession.Ctx.ReadCopyOptions);
OperationStatus internalStatus;
var keyHash = comparer.GetHashCode64(ref key);
do
internalStatus = InternalRead(ref key, keyHash, ref input, ref output, context, serialNo, ref pcontext, tsavoriteSession);
while (HandleImmediateRetryStatus(internalStatus, tsavoriteSession, ref pcontext));
var status = HandleOperationStatus(tsavoriteSession.Ctx, ref pcontext, internalStatus);
Debug.Assert(serialNo >= tsavoriteSession.Ctx.serialNum, "Operation serial numbers must be non-decreasing");
tsavoriteSession.Ctx.serialNum = serialNo;
return status;
}
That while loop leads to some fancy Thread.Yield() logic in HandleImmediateRetryStatus
, which I think is basically a self-made state-machine to wait (via Yield()) until the external IO operation is complete.
The async code path internally calls the sync methods too, which at some point lead to the same async IO logic.
Conclusion: Not sure, I surely don't understand the code enough, but from my POV, it looks like your approach is good enough, it just bypasses that async wrapping logic. But honestly only @badrishc or @TedHartMS can answer that in any fashion. As @badrishc said you should go that route, I'd follow that and blame him if it does not scale properly.
It sure simplifies the code paths alot.
You should definitely prefer the sync + if IsPending route. We've benchmarked this to be clearly superior to using the async APIs directly. The async was designed with some prefix consistency requirements that made it unnecessarily complicated. We plan to simplify this in Tsavorite. For now, I say go with this PR. Thanks.
Experiment; rather than using
WhateverAsync(...)
:Whatever(...)
Status.IsPending
: iffalse
, use sync completion pathasync Awaited
wrapper that doesawait session.CompletePendingWithOutputsAsync(token: token);
then fetches the inner status etcThe hope is that this side-steps the allocations from https://github.com/microsoft/FASTER/issues/907