Add support for mixing steal/force + lsn-free pages

Currently, applying steal/force logging (which skips REDO entries) to
LSN-free pages leads to corruption at recovery.

The issue is that the force-write can update a reused page.  This page may
have been written to in the past by other log entries, which REDO might
replay (since the page is LSN-free).  Since the force-write does not log
the REDOs, replaying old entries corrupts the only copy of the data on disk.
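
The failure can be reproduced in a toy model. This is only an illustrative
sketch, not Stasis code: `disk`, `log`, `logged_write`, `force_write` and
`naive_redo` are hypothetical names, and the "disk" is just a dict. It
assumes redo replays every entry for an LSN-free page, since there is no
page LSN to compare against.

```python
# Toy model: "disk" maps page number -> contents; the log holds REDO entries only.
disk = {}
log = []  # list of (lsn, page, value)


def logged_write(page, value):
    """Ordinary update: append a REDO entry, then apply it to the page."""
    log.append((len(log) + 1, page, value))
    disk[page] = value


def force_write(page, value):
    """Steal/force update: goes straight to disk and logs no REDO entry."""
    disk[page] = value


def naive_redo():
    """LSN-free pages store no LSN, so redo cannot tell that an entry is
    stale and replays all of them."""
    for _lsn, page, value in log:
        disk[page] = value


logged_write(7, "old contents")   # page 7 is used by an earlier operation
force_write(7, "forced data")     # page 7 is reused; no REDO entry is logged
naive_redo()                      # recovery replays the stale entry
print(disk[7])                    # prints "old contents": forced data is lost
```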

The solution is to have the analysis phase detect this, and mark the
affected pages "known clean" so that REDO skips them.  In ARIES, this would
be an optimization; with LSN-free + force it is needed for correctness.
Note that the list of known-clean pages can be unbounded in size.  Two
mechanisms keep its memory use bounded:

 - first, we can use a range tree to store the known-clean pages; this
   means that each LSN-free force region will use at most a few bytes of
   memory at recovery.
 - second, when the range tree exhausts RAM, we can evict the highest range
   in the tree.  Redo will then skip all entries above the eviction point.
   We then mark the range that REDO covered as known-clean, and repeat
   analysis + redo until the entire log is recovered.  This adds log
   passes, but may actually improve redo performance, as it acts as a
   blocked nested loop join.
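
A minimal sketch of the known-clean set, under several assumptions: it uses
two sorted lists with `bisect` in place of a real range tree, ranges are
half-open page intervals, and "above the eviction point" is read as "page
number at or above the start of the evicted range".  `KnownCleanRanges`
and `should_redo` are hypothetical names, not part of Stasis.

```python
import bisect


class KnownCleanRanges:
    """Disjoint, sorted, half-open page ranges [start, end)."""

    def __init__(self):
        self.starts = []
        self.ends = []

    def add(self, start, end):
        """Insert [start, end), merging with ranges it overlaps or touches."""
        i = bisect.bisect_left(self.ends, start)    # first range with end >= start
        j = bisect.bisect_right(self.starts, end)   # first range with start > end
        if i < j:
            start = min(start, self.starts[i])
            end = max(end, self.ends[j - 1])
        self.starts[i:j] = [start]
        self.ends[i:j] = [end]

    def __contains__(self, page):
        i = bisect.bisect_right(self.starts, page) - 1
        return i >= 0 and page < self.ends[i]

    def __len__(self):
        return len(self.starts)

    def evict_highest(self):
        """Drop the highest range to free memory; return it as (start, end)."""
        return self.starts.pop(), self.ends.pop()


def should_redo(page, clean, evict_point):
    """Replay only entries below the eviction point that are not known clean."""
    return page < evict_point and page not in clean


clean = KnownCleanRanges()
clean.add(10, 20)
clean.add(20, 30)      # touches the previous region, so the two merge
clean.add(100, 200)
print(len(clean), 15 in clean, 30 in clean)   # 2 True False

evict_point, _ = clean.evict_highest()        # pretend RAM is exhausted
print(should_redo(150, clean, evict_point))   # False: left for a later pass
print(should_redo(50, clean, evict_point))    # True
print(should_redo(15, clean, evict_point))    # False: known clean
```

Because adjacent force regions merge, memory grows with the number of
distinct regions rather than the number of pages.  A later pass would
rebuild the set for the pages above `evict_point` and run redo again.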

Also, Stasis' redo phase is currently not parallelizable.  Fixing this bug
will add support for concurrent redo.

Original issue reported on code.google.com by sears.ru...@gmail.com on 17 Jun 2009 at 5:37
