scalar-labs / scalar-jepsen

Jepsen tests for ScalarDB and ScalarDL

Change timing to close transaction objects when reconnecting to cluster #76

Closed: brfrn169 closed this 2 years ago

brfrn169 commented 2 years ago

Currently, we are hitting an issue where a NullPointerException occurs while reconnecting to a cluster. The current reconnection logic first closes the transaction object (setting it to null), then tries to reconnect to the cluster (creating a new transaction object), and finally sets the new object as the transaction object. In a crash situation, however, the reconnection takes longer, and the clients keep invoking the workload with the null transaction object in the meantime. I think that is why the NullPointerException keeps occurring until the reconnection is done.

In this PR, I change when transaction objects are closed during reconnection to a cluster. Specifically, the old object is now closed right before the new one is set, instead of at the start of the reconnection. I think this change mitigates the issue. Please take a look!
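
To make the ordering difference concrete, here is a minimal, language-neutral sketch in Python. It is not the code from this PR: `Transaction`, `Client`, `reconnect_close_first`, `reconnect_close_late`, and `probing_connect` are hypothetical names invented for illustration, and the slow reconnection is simulated by a connect callback that records what a concurrent worker would see in the client's `transaction` field while the connection is still being established.

```python
class Transaction:
    """Hypothetical stand-in for a transaction object held by a test client."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class Client:
    """Hypothetical client that owns one transaction object at a time."""

    def __init__(self, connect):
        self._connect = connect
        self.transaction = connect(self)

    def reconnect_close_first(self):
        # Ordering described as the current behaviour: close and null out
        # the old object, then reconnect, then set the new object.
        self.transaction.close()
        self.transaction = None
        self.transaction = self._connect(self)  # slow during a crash

    def reconnect_close_late(self):
        # Ordering described as the fix: reconnect first, and close the
        # old object only right before the new one is set.
        fresh = self._connect(self)  # slow during a crash
        self.transaction.close()
        self.transaction = fresh


def probing_connect(seen):
    """Build a connect callback that records the client's transaction field
    at the moment the (simulated) slow reconnection is in progress."""

    def connect(client):
        seen.append(getattr(client, "transaction", "unset"))
        return Transaction()

    return connect


# Close-first ordering: a worker running during the reconnection sees None.
seen_old = []
old_style = Client(probing_connect(seen_old))
old_style.reconnect_close_first()

# Close-late ordering: a worker running during the reconnection still sees
# the previous transaction object, which is closed only afterwards.
seen_new = []
new_style = Client(probing_connect(seen_new))
new_style.reconnect_close_late()

print(seen_old[-1] is None)   # field was null during the reconnection
print(seen_new[-1] is None)   # field was still populated
```

In this sketch the close-late ordering never exposes a null field, so a worker that reads the field mid-reconnection gets an object rather than a null reference. Operations on the old object may still fail while the cluster is down, which is consistent with the description calling the change a mitigation.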