Closed Qqwy closed 1 year ago
Yeah, I think we can do that! I'll try it in a few minutes
I'm also open to contributing a PR if you want; let me know what you'd prefer 😊
Hi @Qqwy, I made the corresponding changes in #2. Let me know if you have any other suggestions or comments.
All the tests we have at the moment pass locally on my machine. I'll set up some CI workflows later!
PR #2 has been merged, but we can most likely remove the `released` flag.
PR #3 removed the `released` flag and addressed the inconsistent error-handling behaviour in adbc. It should be ready to go. :D
I've tested the code from PR #3 by improving the example on Explorer a little.
See the qqwy-adbc4 branch on my fork of Explorer. And here are the changes vs. the `jv-adbc` branch.
There are two variants of `df_experiment`: one that borrows the stream (cleanup then happens when the Erlang GC wants to), and one that steals the stream (cleanup then happens when the Rust code is done with it).
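To make the "steal" variant concrete, here is a minimal sketch in C of how a consumer could take ownership of a stream, assuming the move semantics of the Arrow C data interface (bitwise copy, then mark the source as released without calling its callback). The struct layout follows the Arrow C stream interface; `stream_move`, `demo_release`, and `demo_release_calls` are hypothetical names used only for illustration and are not part of adbc or Explorer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct ArrowSchema;
struct ArrowArray;

/* Layout as defined by the Arrow C stream interface. */
struct ArrowArrayStream {
  int (*get_schema)(struct ArrowArrayStream *, struct ArrowSchema *out);
  int (*get_next)(struct ArrowArrayStream *, struct ArrowArray *out);
  const char *(*get_last_error)(struct ArrowArrayStream *);
  void (*release)(struct ArrowArrayStream *);
  void *private_data;
};

/* Stand-in producer callback (hypothetical): counts calls and marks
 * the stream as released by clearing the callback. */
int demo_release_calls = 0;

void demo_release(struct ArrowArrayStream *stream) {
  demo_release_calls++;
  stream->release = NULL;
}

/* Hypothetical helper: move the stream from `src` (for example the
 * storage inside the NIF resource) into consumer-owned `dst`. */
void stream_move(struct ArrowArrayStream *src, struct ArrowArrayStream *dst) {
  assert(src != NULL && dst != NULL);
  assert(src->release != NULL); /* src must still be live */
  memcpy(dst, src, sizeof(*dst));
  /* Mark the source as released WITHOUT calling the callback;
   * the stream now belongs to dst. */
  src->release = NULL;
}
```

After the move, the original storage looks released to anyone who inspects it later, so only the new owner ends up calling `release`.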
It works great! I think we can merge PR #3 and close this issue :blush:
Awesome, thank you :)
Thank you @Qqwy! And if I understood correctly, does that mean we can choose how to use (or consume) the stream purely in Explorer's Rust code, and we won't have any memory management issues in the NIF whether we borrow it or steal it in Rust?
Yes, exactly! 👍
That's amazing! Thank you :) I'm going to close this issue now.
The current internal structure of the resource returned from a call to `Adbc.Statement.execute_query(statement)` is as follows: (And for the other resources it is similar.)
The content of the `NifRes<T>` is behind a pointer (so it stores a `*T`) and is managed by `enif_alloc`/`enif_free`. But this extra layer of indirection might be unnecessary. I believe a `NifRes<T>` could just store the `T` directly.
This will simplify memory management considerably, since when the resource is passed to a NIF implemented in another project/language, it can take ownership of the data without requiring a call to `enif_free` for cleanup.
Cleanup
With this change, we still have full freedom of whether we want to clean up the stream 'eagerly after first use' or 'whenever the GC wants to':
Whenever the Erlang GC wants to
When the above steps are followed without extra changes inside either the adbc code or the consumer code, this is what happens.
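As a rough sketch of the GC-driven path, a resource destructor could look like the following, assuming the resource stores the `ArrowArrayStream` inline as proposed in this issue. `StreamResource`, `stream_resource_dtor`, and `demo_release` are hypothetical names, not adbc's actual code; in a real NIF the destructor would be registered with the resource type and would also receive an `ErlNifEnv *` argument, which is left out here so the example compiles without `erl_nif.h`.

```c
#include <assert.h>
#include <stddef.h>

struct ArrowSchema;
struct ArrowArray;

/* Layout as defined by the Arrow C stream interface. */
struct ArrowArrayStream {
  int (*get_schema)(struct ArrowArrayStream *, struct ArrowSchema *out);
  int (*get_next)(struct ArrowArrayStream *, struct ArrowArray *out);
  const char *(*get_last_error)(struct ArrowArrayStream *);
  void (*release)(struct ArrowArrayStream *);
  void *private_data;
};

/* Hypothetical resource layout: the stream is stored directly in the
 * resource memory, with no extra pointer to a separate allocation. */
struct StreamResource {
  struct ArrowArrayStream stream;
};

/* Stand-in producer callback (hypothetical): counts calls and marks
 * the stream as released by clearing the callback. */
int demo_release_calls = 0;

void demo_release(struct ArrowArrayStream *stream) {
  demo_release_calls++;
  stream->release = NULL;
}

/* Sketch of the destructor body that runs when the Erlang GC collects
 * the resource. It releases the stream only if nobody has done so yet. */
void stream_resource_dtor(void *obj) {
  assert(obj != NULL);
  struct StreamResource *res = (struct StreamResource *)obj;
  if (res->stream.release != NULL) {
    res->stream.release(&res->stream);
  }
}
```

The `NULL` check is what lets the same destructor work for both cleanup strategies: if a consumer already released (or moved) the stream, the destructor has nothing left to do.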
Eagerly after first use
From the consumer side (such as the Rust code inside Explorer), you can simply call the `release` callback that is part of the `ArrowArrayStream` struct once you are done with the stream. The producer will set the release callback to `nullptr` to indicate that it was cleaned up and cannot be used again (as per the Arrow spec).
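The eager path might look roughly like this from the consumer's point of view. This is a sketch only: `use_once_then_release` and `demo_release` are hypothetical names, the batch-reading step is elided, and the struct layout is the one from the Arrow C stream interface.

```c
#include <assert.h>
#include <stddef.h>

struct ArrowSchema;
struct ArrowArray;

/* Layout as defined by the Arrow C stream interface. */
struct ArrowArrayStream {
  int (*get_schema)(struct ArrowArrayStream *, struct ArrowSchema *out);
  int (*get_next)(struct ArrowArrayStream *, struct ArrowArray *out);
  const char *(*get_last_error)(struct ArrowArrayStream *);
  void (*release)(struct ArrowArrayStream *);
  void *private_data;
};

/* Stand-in producer callback (hypothetical): counts calls and marks
 * the stream as released by clearing the callback. */
int demo_release_calls = 0;

void demo_release(struct ArrowArrayStream *stream) {
  demo_release_calls++;
  stream->release = NULL;
}

/* Hypothetical consumer: use the stream once, then release it right away.
 * Returns 0 on success, -1 if the stream was already released. */
int use_once_then_release(struct ArrowArrayStream *stream) {
  assert(stream != NULL);
  if (stream->release == NULL) {
    return -1; /* already released: must not be used again */
  }
  /* ... read all batches via stream->get_next here ... */
  stream->release(stream); /* eager cleanup; producer clears `release` */
  return 0;
}
```

A second attempt to use the same stream is rejected because the cleared `release` pointer is the signal that the stream is gone.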