Closed mdrakiburrahman closed 3 weeks ago
The methods don't have to be synchronous. Technically, C is always synchronous. However, any method that takes a callback function can be wrapped with a `TaskCompletionSource` to make it asynchronous in C#. tokio only matters if the underlying Rust method is async. If that were the case, then, like the Bridge code, it would require a runtime or create one upon the method call.
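To make the callback point concrete, here is a minimal Rust-only sketch of the same shape: a blocking C-ABI function that reports its result through a callback, plus a wrapper that turns the callback into something the caller can wait on later. The channel plays the role that `TaskCompletionSource` would play on the C# side. `read_row_count`, `complete`, and `read_row_count_async` are hypothetical names for illustration and are not part of delta-dotnet or delta-kernel-rs.

```rust
use std::ffi::c_void;
use std::sync::mpsc;
use std::thread;

/// C-style completion callback: receives the caller's context pointer and a result.
type Callback = extern "C" fn(ctx: *mut c_void, value: i64);

/// Hypothetical FFI entry point (an assumption, not a real delta-dotnet export).
/// It is synchronous: it does its work, then invokes the callback before returning.
extern "C" fn read_row_count(ctx: *mut c_void, on_done: Callback) {
    let rows = 42; // stand-in for the real work
    on_done(ctx, rows);
}

/// Callback that recovers the channel sender from the context pointer and
/// completes it, much like calling SetResult on a TaskCompletionSource.
extern "C" fn complete(ctx: *mut c_void, value: i64) {
    // SAFETY: `ctx` was created by Box::into_raw in `read_row_count_async`
    // and is consumed exactly once here.
    let tx = unsafe { Box::from_raw(ctx as *mut mpsc::Sender<i64>) };
    let _ = tx.send(value);
}

/// Wraps the blocking call so the caller gets a handle to wait on later.
fn read_row_count_async() -> mpsc::Receiver<i64> {
    let (tx, rx) = mpsc::channel();
    // Raw pointers are not Send, so carry the address across the thread as usize.
    let ctx = Box::into_raw(Box::new(tx)) as usize;
    thread::spawn(move || read_row_count(ctx as *mut c_void, complete));
    rx
}

fn main() {
    let pending = read_row_count_async();
    // The caller is free to do other work here before waiting on the result.
    let rows = pending.recv().expect("callback never fired");
    println!("rows = {rows}"); // prints "rows = 42"
}
```

No tokio runtime is involved, since the underlying function is not async. The blocking call simply runs on another thread and signals completion through the callback.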
Makes sense @mightyshazam - I had the incorrect understanding that this works through and through due to the Tokio wrapper. But I'm guessing the idea is that the cancellation token would be respected by that `TaskCompletionSource`, which would abandon whatever C FFI call is taking too long.
Also, do you prefer I get this TCS wrapper done before merging this PR, so we change the `ITable` interface to be completely async?
Or should we keep the synchronous methods as the PR currently has, and expose additional async methods?
We can change the methods to async later. As long as we don't increment the version of the package, it won't push a new one.
@mightyshazam - yeap good point, sounds good, I added this issue to track this:
https://github.com/delta-incubator/delta-dotnet/issues/91
Will get it done in a separate smaller PR
Why this change is needed
This PR adds Delta Kernel FFI based read support.
Closes issue: https://github.com/delta-incubator/delta-dotnet/issues/82
How
Delta Kernel integration
- Adds `delta-kernel-rs` as a pinned submodule. Uses the same structure as `Bridge` to generate the FFI + Rust Binary
- Interfaces the user facing entrypoints and sets up some simple model relationships
  - `DeltaEngine` as `IEngine` and `DeltaTable` as `ITable`
  - `Kernel` to override `Bridge` and fall back when implementation is missing
  - `Bridge` `Runtime`/`Table` as base class for `Kernel`, overrides with the subset of read methods `Kernel` exposes
- Implements Kernel FFI InterOp - most of which is Pointer management
- Adds an `Arrow.Table` and `DataFrame` method to expose the Kernel scanned `RecordBatches**` as a queryable
- Adds write concurrency and read concurrency tests - to find any memory management problems and test resiliency etc.
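For readers unfamiliar with what "pointer management" across an FFI boundary involves, the sketch below shows the usual opaque-handle lifecycle from the Rust side: allocate, hand out a raw pointer, use it, and free it exactly once. It is an illustration under stated assumptions only. `ScanState` and the `scan_state_*` functions are hypothetical and do not correspond to actual delta-kernel-rs exports, and a real export would also carry a no-mangle attribute so the symbol is visible to the caller.

```rust
/// Hypothetical state owned by the Rust side (not a delta-kernel-rs type).
#[repr(C)]
pub struct ScanState {
    rows_read: u64,
}

/// Allocates the state on the heap and hands ownership to the caller
/// as a raw pointer. The caller must eventually pass it to `scan_state_free`.
pub extern "C" fn scan_state_new() -> *mut ScanState {
    Box::into_raw(Box::new(ScanState { rows_read: 0 }))
}

/// Borrows the state through the handle; a null handle is treated as a no-op.
pub extern "C" fn scan_state_add_rows(handle: *mut ScanState, rows: u64) -> u64 {
    // SAFETY: the caller must pass either null or a live pointer
    // obtained from `scan_state_new` that has not been freed yet.
    match unsafe { handle.as_mut() } {
        Some(state) => {
            state.rows_read += rows;
            state.rows_read
        }
        None => 0,
    }
}

/// Takes ownership back and drops the state. Must be called at most once per handle.
pub extern "C" fn scan_state_free(handle: *mut ScanState) {
    if !handle.is_null() {
        // SAFETY: the pointer came from Box::into_raw in `scan_state_new`.
        drop(unsafe { Box::from_raw(handle) });
    }
}

fn main() {
    let handle = scan_state_new();
    scan_state_add_rows(handle, 10);
    let total = scan_state_add_rows(handle, 5);
    println!("rows read = {total}"); // prints "rows read = 15"
    scan_state_free(handle);
}
```

On the managed side, each such handle has to be released exactly once and never used afterwards, which is the kind of mistake concurrency stress tests tend to surface.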
Misc
- `bootstrap-dev-env.sh` idempotent script to quickly get a dev env up and running that can run Unit Tests + Example project (e.g. using a throwaway WSL box)
Test
- Add a unit test that tests read/write concurrency - 27 concurrent writers, 50 readers.
- Stress tested the new read/write concurrency unit test across 2800+ loops on Windows + Linux overnight:
  - Windows
  - Linux
- Ran cloud write example project (Azure Storage Account)
- Tested NuGet package pipeline and all targets with `cross` - sample run