Closed mbutrovich closed 6 years ago
It's looking like this issue started manifesting after #1415 was merged 2 days ago. That commit is effectively current master, and if I go back to the commit before #1415 I can't get it to reproduce.
I don't fully understand how #1415 would introduce this since there were no changes to the GC system, but it's ~5000 lines of changes that I didn't review. It's possible that's now exposing incorrect behavior elsewhere in the system, but I have no root cause yet.
A workaround for the double-free is fairly easy to implement in ephemeral_pool.h, but I'd rather understand why ResetTuple is getting called on the same tuple multiple times since that sounds very wrong. Unfortunately, I don't have a succinct test to reproduce it yet and have to resort to running TPC-C.
@ksaito7
I check it.
I found this cause. Even if a query updates data in a column with no constraint, my code updates it as if its column has primary key constraint. I didn't figure out why GC manager made that error in this situation actually. But when I fixed that, then that error disappeared.
Wow, nice job finding it so quickly!
If you open up a new PR for your refactor I’ll be sure to run it through the benchmarks as a sanity check.
@mbutrovich Thanks, but I've asked @lmwnshn for the test.
I've run it and it looks good, more details in the PR.
@lmwnshn Thanks a lot!
I've been working on performance stuff today, and the faster I get the system to run the less stable it is. So I decided to chase down the segfaults in TPC-C. Here are the results of one segfault hunt.
It's a double-free in GCManager's CheckAndReclaimVarlenColumns(). I've not looked at this code closely before as part of our other GC fixes, but will dive into it in the next couple of days. AddressSanitizer log is attached. asan.txt config here: tpcc_config.txt