lwhay / asterixdb

Automatically exported from code.google.com/p/asterixdb

LSN of LSM components is overwritten #776

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
If you look at LIFOMetaDataFrame.initBuffer(), you will see that we 
erroneously set the free space offset as follows: 
buf.putInt(freeSpaceOff, lsnOff + 4);

Instead it should have been:
buf.putInt(freeSpaceOff, lsnOff + 8);

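The LSN is of type long, and thus the free space pointer should be 8 bytes 
after the lsn offset. With lsnOff + 4, the free space starts in the middle of 
the LSN, so anything written there overwrites the second half of the LSN.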
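
To illustrate the overlap, here is a minimal sketch using a plain java.nio.ByteBuffer. The class name, the offsets (FREE_SPACE_OFF, LSN_OFF) and the page size are made up for the example and are not the actual LIFOMetaDataFrame layout; only the "+ 4" versus "+ 8" difference is taken from the report above.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a metadata frame; not the real LIFOMetaDataFrame.
final class LsnOverlapDemo {
    // Assumed offsets, chosen only for illustration.
    static final int FREE_SPACE_OFF = 0;
    static final int LSN_OFF = 16;

    /**
     * Initializes the free space pointer to LSN_OFF + lsnWidth, stores the LSN,
     * writes one int at the position the free space pointer designates,
     * and returns the LSN as read back from the buffer.
     */
    static long lsnAfterWrite(int lsnWidth, long lsn) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.putInt(FREE_SPACE_OFF, LSN_OFF + lsnWidth);
        buf.putLong(LSN_OFF, lsn);

        // Simulate the first write into the "free" area of the page.
        int freeSpace = buf.getInt(FREE_SPACE_OFF);
        buf.putInt(freeSpace, 0xCAFEBABE);

        return buf.getLong(LSN_OFF);
    }

    public static void main(String[] args) {
        long lsn = 42L;
        // Width 4 mimics the buggy "lsnOff + 4"; width 8 mimics the fix.
        System.out.println("width 4 keeps LSN: " + (lsnAfterWrite(4, lsn) == lsn));
        System.out.println("width 8 keeps LSN: " + (lsnAfterWrite(Long.BYTES, lsn) == lsn));
    }
}
```

Using Long.BYTES instead of a literal 8 ties the pointer to the declared type of the LSN, which might make this kind of mismatch harder to introduce.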

I fixed this in my local branch, but it makes me wonder why the current test 
cases didn't catch this bug.
We need at least one test case that covers it.

Original issue reported on code.google.com by salsuba...@gmail.com on 21 May 2014 at 10:36

GoogleCodeExporter commented 9 years ago

Original comment by drese...@uci.edu on 17 Oct 2014 at 7:06

GoogleCodeExporter commented 9 years ago
Writing a test case for this makes only limited sense to me. What can we test 
that is different from how the offsets are defined in the code? If the test 
only duplicates the offset definitions and their accesses, it won't catch 
anything.

I tried a few different ways to give semantics to the buffer positions, so 
that you can call get(PAGELSN) instead of getInt(lsnOff), which can result in 
errors if, e.g., the LSN is not an int. That in itself already required some 
reflection. Making it work for all frames in a way that prevents code 
duplication (of that meta part) and allows for easy usage is something for 
either AOP (Aspect-Oriented Programming) or compile-time code manipulation 
(JSR-269). Both of these overshoot the problem by miles.
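
As a rough illustration of what "giving semantics to buffer positions" could look like for a single frame, here is a sketch in which each field declares its width and the offsets are derived from the declaration order. All names (MetaFieldSketch, Field, headerEnd) and the two-field layout are hypothetical, and this does not address the cross-frame reuse problem described above.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; field set and names are assumptions, not AsterixDB code.
final class MetaFieldSketch {
    enum Field {
        FREE_SPACE(Integer.BYTES),
        LSN(Long.BYTES);

        final int width;

        Field(int width) {
            this.width = width;
        }

        /** Offset of this field: the sum of the widths of all fields declared before it. */
        int offset() {
            int off = 0;
            for (Field f : values()) {
                if (f == this) {
                    return off;
                }
                off += f.width;
            }
            throw new AssertionError("unreachable");
        }
    }

    /** First byte after the last declared field, i.e. the initial free space pointer. */
    static int headerEnd() {
        int end = 0;
        for (Field f : Field.values()) {
            end += f.width;
        }
        return end;
    }

    /** Writes the initial free space pointer, derived from the field widths. */
    static void initBuffer(ByteBuffer buf) {
        buf.putInt(Field.FREE_SPACE.offset(), headerEnd());
    }

    static void setLsn(ByteBuffer buf, long lsn) {
        buf.putLong(Field.LSN.offset(), lsn);
    }

    static long getLsn(ByteBuffer buf) {
        return buf.getLong(Field.LSN.offset());
    }
}
```

Because the free space pointer is computed from the same widths that the accessors rely on, changing the LSN from int to long would only require touching the Field declaration in this sketch.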

Unless anyone has an idea of how to catch offset miscalculations, I vote to 
close this as wontfix.

Original comment by mdrese...@googlemail.com on 6 Nov 2014 at 12:41

GoogleCodeExporter commented 9 years ago
After discussing this with Young-Seok, we found that this is not so much about 
making sure that frame offsets are calculated correctly, but more about testing 
the components that actually use the LSN. Apparently, no test caught that the 
LSN got corrupted. Thus, we need deterministic tests for the recovery of the 
components.

Original comment by mdrese...@googlemail.com on 6 Nov 2014 at 1:02