Closed benjaminwinger closed 8 months ago
It seems like the issue was that beginWriteTransaction was called during initialization so when it's called the second time it silently fails, and then hangs when the inserts are done.
Regarding the silent failure of Connection::beginWriteTransaction
, partly it's just that Connection::beginWriteTransaction
doesn't actually return a query result, so you can't check that it's successful or see the error message (without querying BEGIN WRITE TRANSACTION
manually), but I think it's also part of a broader issue with the C++ API and how it handles results in general. Using [[nodiscard]]
would probably help, as well as maybe a QueryResult::assertSuccess
method that would turn error messages into exceptions, and in this case I think we would want beginWriteTransaction
to just directly call assertSuccess
rather than producing a result.
The hang on the other hand is contrary to what the error message says here.
With the double call fixed, doing the inserts all as one transaction works identically to how it's currently working (where I guess it's doing it all as one transaction and auto comitting/discarding when the connection goes out of scope).
Doing 4096 explicit (or implicit) transactions is still failing with the buffer manager exception, but it does work if broken up into chunks so that fewer transactions are done (I tested with chunks of 10, so 410 transactions).
I'm linking this issue to this other issue that relates to returning error statuses in the C API: https://github.com/kuzudb/kuzu/issues/2528. These look important to me and should be prioritized as they relate to the usability of our APIs.
Just to document our discussion here.
The problem is because we didn't propagate query result to transaction related APIs. For example, in the following API, we discarded the return result from query
, thus beginReadOnlyTransaction
is not able to report errors to users.
void Connection::beginReadOnlyTransaction() {
query("BEGIN TRANSACTION READ ONLY");
}
The solution is that we should remove following interfaces, beginWriteTransaction
, beginReadOnlyTransaction
, commit
, rollback
from cpp APIs, and let users go through equivalent Cypher statements directly.
When testing the integer bitpacking, I added a test which inserts many nodes at once (
NodeInsertionDeletionTests.InsertManyNodesTest
). This is rather slow (insertingNODE_GROUP_SIZE
values took 75 seconds), so I tried to speed it up by doing it all as one transaction rather than one automatic transaction for each insert, but that resulted in a hang, or at least it was taking much longer than 75 seconds, even for smaller sizes like the 4096 values I ended up using (which still takes a few seconds).Note that this was reproducible on the master branch prior to my bitpacking changes in #2004.
In addition, inserting values in small chunks would cause a
Buffer manager exception: Failed to claim a frame
error. This would occur even with just one value (I also tested on chunks of 2, 5 and 10) inserted between callingConnection::beginWriteTransaction
andConnection::commit
(after many inserts at least, the other test inNodeInsertionDeletionTests
proves that it at least works the first few times).Tests that reproduce both of these issues can be found in this commit.