ilovesoup / hyracks

Automatically exported from code.google.com/p/hyracks
Apache License 2.0

Enhanced page pinning in BufferCache #48

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I am using the inverted index implementation quite heavily, and I have found 
debugging BufferCache issues quite challenging.  This has led me to two 
related suggestions for improving BufferCache.

1. Instrument BufferCache to track the number of pinned pages.

2. Create a 'soft pin' and a 'hard pin', such that a soft-pinned page can be 
victimized (with a logged warning) if no other victim is found.  This may lead 
to thrashing, but may be preferable for 'production jobs'.  A 'hard pin' would 
thus be encouraged only for debugging.
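
To make suggestion 1 concrete, here is a minimal sketch of per-page pin counting. The class `PinTracker` and its methods are hypothetical names invented for illustration and are not part of Hyracks; the sketch assumes the cache calls `onPin`/`onUnpin` around its own pin and unpin paths.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Hyracks class): tracks outstanding pins per page id.
class PinTracker {
    // Page id -> number of pins not yet matched by an unpin.
    private final Map<Long, Integer> pinsByPage = new HashMap<>();

    // Record one more pin on the given page.
    synchronized void onPin(long pageId) {
        pinsByPage.merge(pageId, 1, Integer::sum);
    }

    // Record an unpin; an unpin with no matching pin is reported loudly.
    synchronized void onUnpin(long pageId) {
        Integer count = pinsByPage.get(pageId);
        if (count == null) {
            throw new IllegalStateException("unpin without matching pin: page " + pageId);
        }
        if (count == 1) {
            pinsByPage.remove(pageId);
        } else {
            pinsByPage.put(pageId, count - 1);
        }
    }

    // Number of distinct pages that currently hold at least one pin.
    synchronized int pinnedPages() {
        return pinsByPage.size();
    }

    // Outstanding pins on a single page (0 if the page is not pinned).
    synchronized int pinCount(long pageId) {
        return pinsByPage.getOrDefault(pageId, 0);
    }
}
```

A leak would then show up as a `pinnedPages()` value that keeps growing after the work that pinned those pages has finished.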

Original issue reported on code.google.com by nbales on 15 Nov 2011 at 11:28

GoogleCodeExporter commented 9 years ago
On point 1:  Absolutely.  In debug mode (if we had such a mode) it would be 
nice to know who's pinning what, even.

On point 2:  I think "soft pin" is an oxymoron - "pin" means "I'm now going to 
be accessing the bits in this page frame as in-memory data" - right?  How could 
a thread tolerate having its memory ripped out from underneath it w/o incurring 
a fatal error...?  I think the right answer here is what you proposed in a 
different issue, namely, if pin fails, due to there not being any more unpinned 
room at the inn, that should lead to a loudly thrown exception (one that should 
not be possible under correct operation, as only a small portion of the large 
buffer pool should be pinned - no transaction should ever pin more than a 
couple of frames, and we should not admit more transactions than what can be 
simultaneously supported with that number of pinned pages).  Clearly there is 
some thread not unpinning things as it should...?
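
A rough sketch of that "fail loudly" behavior, under stated assumptions: `TinyBufferPool` and `NoUnpinnedFrameException` are hypothetical names, the pool does no I/O, and the victim choice is simply the first unpinned frame. It is not the BufferCache implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical exception type for this sketch.
class NoUnpinnedFrameException extends RuntimeException {
    NoUnpinnedFrameException(String message) {
        super(message);
    }
}

// Hypothetical fixed-size pool: pin() throws instead of evicting a pinned page.
class TinyBufferPool {
    private static final long EMPTY = -1L;
    private final long[] pageInFrame;   // page id held by each frame, or EMPTY
    private final int[] pinCount;       // outstanding pins per frame
    private final Map<Long, Integer> frameOfPage = new HashMap<>();

    TinyBufferPool(int numFrames) {
        pageInFrame = new long[numFrames];
        Arrays.fill(pageInFrame, EMPTY);
        pinCount = new int[numFrames];
    }

    // Pins the page and returns its frame index.
    synchronized int pin(long pageId) {
        Integer cached = frameOfPage.get(pageId);
        int frame;
        if (cached != null) {
            frame = cached;
        } else {
            frame = findUnpinnedFrame();
            if (frame < 0) {
                throw new NoUnpinnedFrameException("cannot pin page " + pageId
                        + ": all " + pinCount.length + " frames are pinned");
            }
            if (pageInFrame[frame] != EMPTY) {
                frameOfPage.remove(pageInFrame[frame]); // evict the unpinned victim
            }
            pageInFrame[frame] = pageId;
            frameOfPage.put(pageId, frame);
        }
        pinCount[frame]++;
        return frame;
    }

    synchronized void unpin(long pageId) {
        Integer frame = frameOfPage.get(pageId);
        if (frame == null || pinCount[frame] == 0) {
            throw new IllegalStateException("unpin without matching pin: page " + pageId);
        }
        pinCount[frame]--;
    }

    // Returns the first frame with no outstanding pins, or -1 if every frame is pinned.
    private int findUnpinnedFrame() {
        for (int i = 0; i < pinCount.length; i++) {
            if (pinCount[i] == 0) {
                return i;
            }
        }
        return -1;
    }
}
```

With two frames, pinning pages 1 and 2 and then page 3 raises the exception; after `unpin(1)`, pinning page 3 succeeds and reuses the freed frame.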

Original comment by dtab...@gmail.com on 15 Nov 2011 at 11:15

GoogleCodeExporter commented 9 years ago
>> "pin" means "I'm now going to be accessing the bits in this page frame as in-memory data" - right?

Oh, sorry about the misunderstanding.  I encountered code in the inverted index 
that uses pin somewhat differently -- possibly to improve prefetch behavior -- 
and thus misunderstood its purpose.  Forget #2. =)

Original comment by nbales on 16 Nov 2011 at 2:46

GoogleCodeExporter commented 9 years ago
There may be some confusion about "eagerly" pinning all pages of an inverted 
list. In many ways, the inverted index is not as efficient as it could be (as 
I'm sure Nathan is painfully aware :-)). The plan is to finish the 
end-to-end implementations of indexed fuzzy lookups/joins and conjunctive 
queries - and then possibly optimize the inverted index for performance.

Anyway, there actually is a DebugBufferCache that does pin/latch counting. It 
is available in the hyracks_dev_next branch or my hyracks_btree_updates_next 
branch.
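
For readers without those branches checked out, the general idea of a counting wrapper can be sketched as below. This is not the actual DebugBufferCache code: the `PageCache` interface and `CountingPageCache` class are hypothetical stand-ins, and the real Hyracks interfaces may differ.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical minimal cache interface for this sketch; NOT the Hyracks API.
interface PageCache {
    void pin(long pageId);
    void unpin(long pageId);
    void latch(long pageId);
    void unlatch(long pageId);
}

// Decorator that forwards every call to a real cache and counts the calls.
class CountingPageCache implements PageCache {
    private final PageCache delegate;
    private final AtomicLong pins = new AtomicLong();
    private final AtomicLong unpins = new AtomicLong();
    private final AtomicLong latches = new AtomicLong();
    private final AtomicLong unlatches = new AtomicLong();

    CountingPageCache(PageCache delegate) {
        this.delegate = delegate;
    }

    @Override
    public void pin(long pageId) {
        delegate.pin(pageId);
        pins.incrementAndGet();   // counted only if the delegate did not throw
    }

    @Override
    public void unpin(long pageId) {
        delegate.unpin(pageId);
        unpins.incrementAndGet();
    }

    @Override
    public void latch(long pageId) {
        delegate.latch(pageId);
        latches.incrementAndGet();
    }

    @Override
    public void unlatch(long pageId) {
        delegate.unlatch(pageId);
        unlatches.incrementAndGet();
    }

    // Pins that have not yet been matched by an unpin.
    long outstandingPins() {
        return pins.get() - unpins.get();
    }

    // Latches that have not yet been released.
    long outstandingLatches() {
        return latches.get() - unlatches.get();
    }
}
```

At the end of a test, nonzero `outstandingPins()` or `outstandingLatches()` would point to a missing unpin or unlatch somewhere in the code under test.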

I'm marking this issue as fixed, since we won't be taking care of #2 and #1 is 
already implemented.

Original comment by alexande...@gmail.com on 16 Nov 2011 at 4:36

GoogleCodeExporter commented 9 years ago
Thank you for pointing out the DebugBufferCache, Alex.  This will be very 
useful for my debugging.  

I have already pushed the inverted index beyond a size that performs well, so I 
am already in the process of adding some higher performance features.  I'll 
send you a description of what I'm doing later in the week for comments.  Maybe 
we can make my features generally applicable.  

Original comment by nbales on 16 Nov 2011 at 4:47