linearregression / hypertable

Automatically exported from code.google.com/p/hypertable
GNU General Public License v2.0

Support long-lived scanners #158

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
According to Josh, he ran into a situation where he was joining a massive
table with a very small table and the scanner on the small table was idle
for 30 minutes.

We should probably support re-startable scanners.

Original issue reported on code.google.com by nuggetwh...@gmail.com on 4 Sep 2008 at 3:58

GoogleCodeExporter commented 9 years ago
One problem with this is that as soon as the scanner releases the CellCache, it
loses snapshot isolation. Someone could come along and delete a bunch of cells,
and when the scanner later starts back up, it won't see the cells that were
deleted.

Maybe there could be a switch passed into create_scanner indicating if it can be
restarted.

Original comment by nuggetwh...@gmail.com on 4 Sep 2008 at 4:14

GoogleCodeExporter commented 9 years ago
If the scanner is flagged "restartable", then it should also probably return
whole-rows only.  That way at least row-level snapshot isolation is achieved.

Original comment by nuggetwh...@gmail.com on 4 Sep 2008 at 4:45
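
The restartable, whole-row idea from the two comments above could look roughly like the sketch below. It is not Hypertable code: `RestartableScanner`, `ScannerExpired`, `open_scan`, and `make_flaky_scan` are hypothetical names, and the in-memory backend only simulates a scanner that times out once. The wrapper remembers the last row it fully returned and, when the underlying scanner expires, reopens the scan just after that row, so a partially read row is discarded and re-read rather than returned in pieces.

```python
from itertools import groupby


class ScannerExpired(Exception):
    """Hypothetical error raised when the underlying scanner has timed out."""


class RestartableScanner:
    """Yields whole rows and reopens the scan after the last complete row."""

    def __init__(self, open_scan, max_restarts=3):
        # open_scan(start_after_row) -> iterator of (row, column, value),
        # sorted by row. This signature is an assumption for the sketch.
        self._open_scan = open_scan
        self._max_restarts = max_restarts
        self._last_row = None

    def rows(self):
        restarts = 0
        while True:
            try:
                cells = self._open_scan(self._last_row)
                for row, group in groupby(cells, key=lambda cell: cell[0]):
                    # Buffer the entire row before returning any of it.
                    row_cells = [(col, val) for _, col, val in group]
                    self._last_row = row
                    yield row, row_cells
                return
            except ScannerExpired:
                # Any partially buffered row is dropped and re-read on restart.
                restarts += 1
                if restarts > self._max_restarts:
                    raise


def make_flaky_scan(table, fail_after_cells):
    """Simulated backend: expires once after `fail_after_cells` cells."""
    state = {"failed": False}

    def open_scan(start_after):
        count = 0
        for row in sorted(table):
            if start_after is not None and row <= start_after:
                continue
            for col in sorted(table[row]):
                if not state["failed"] and count == fail_after_cells:
                    state["failed"] = True
                    raise ScannerExpired()
                count += 1
                yield row, col, table[row][col]

    return open_scan


table = {"r1": {"a": 1, "b": 2}, "r2": {"a": 3, "b": 4}}
scanner = RestartableScanner(make_flaky_scan(table, fail_after_cells=3))
result = list(scanner.rows())
print(result)
```

In this run the simulated scanner expires halfway through row `r2`; the wrapper restarts after `r1` and returns `r2` in full. Each restart reads the table as it is at that moment, so isolation holds per row but not across the whole scan, which matches the trade-off described above.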

GoogleCodeExporter commented 9 years ago
In our application, we need to filter out the unique keys of the table, which
hits the same problem as this issue. For now I have changed the TTL of the
scanner as a temporary workaround. However, I see a lot of invalid scanner ID
errors in ThriftBroker.log.

......
scanner=17038656 scanner=140583406852832 scanner=140583407497712 scanner=19379648
scanner=18189456 scanner=140583407320944 scanner=17962944 scanner=19497312
scanner=19985696 scanner=140583406776560 scanner=140583406862624 scanner=18189184
scanner=18195952 scanner=18189984 scanner=19444656 scanner=18391056
scanner=18390608 scanner=19923376 scanner=140583407320288 scanner=18191520
scanner=19988496 scanner=19448224 scanner=140583408402528 scanner=20445920
scanner=19447968 scanner=19444096 scanner=18193376 scanner=17651424
scanner=17298272 scanner=17145664 scanner=18392256 scanner=17504512
scanner=20445568 scanner=17963936 scanner=20313568 scanner=18192816
scanner=18191648 scanner=21270032 scanner=140583406862144 scanner=17130432
scanner=17167200 scanner=140583410592816 scanner=17160720 scanner=21270336
scanner=20251648 scanner=17130560 scanner=21267216 scanner=17129872
scanner=140583408603728 scanner=17130304 scanner=140583407499904
scanner=140583410606768 scanner=17171104 scanner=17126608 scanner=21267744
1267512901 ERROR ThriftBroker : (/home/zhanghongxun/download/src/hypertable/src/cc/Hypertable/Lib/IntervalScanner.cc:204) fetch scanblock : RANGE SERVER invalid scanner id - scanner ID 6
1267512901 ERROR ThriftBroker : next_cells_as_arrays (/home/zhanghongxun/download/src/hypertable/src/cc/ThriftBroker/ThriftBroker.cc:352): virtual void Hypertable::ThriftBroker::ServerHandler::next_cells_as_arrays(Hypertable::ThriftBroker::ThriftCellsAsArrays&, Hypertable::ThriftGen::Scanner): Hypertable::Exception:  - RANGE SERVER invalid scanner id.

By the way, there are a lot of function calls such as get_cells_as_array().

Original comment by zhxg...@gmail.com on 8 Mar 2010 at 5:30

GoogleCodeExporter commented 9 years ago
One way to solve this might be to allow applications to specify "scanner
groups" so that all scanners in a group expire at the same time. A scanner in
the group can be marked for expiry while it is still active, but it is only
purged once all scanners in its group have been marked for expiry.

-Sanjit

Original comment by sjha...@gmail.com on 11 Mar 2010 at 6:37
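
The "scanner groups" proposal above can be illustrated with a small bookkeeping sketch. Nothing here is an existing Hypertable API: `ScannerGroupRegistry` and its methods are hypothetical, and the sketch assumes the server marks a scanner for expiry when its TTL runs out. A marked scanner stays usable until every member of its group has been marked, at which point the whole group is purged together.

```python
class ScannerGroupRegistry:
    """Tracks scanner groups; purges a group only when all members are marked."""

    def __init__(self):
        self._group_of = {}  # scanner id -> group name
        self._members = {}   # group name -> set of scanner ids
        self._marked = {}    # group name -> set of ids marked for expiry

    def register(self, scanner_id, group):
        self._group_of[scanner_id] = group
        self._members.setdefault(group, set()).add(scanner_id)
        self._marked.setdefault(group, set())

    def is_usable(self, scanner_id):
        # A scanner is usable until its whole group has been purged.
        return scanner_id in self._group_of

    def mark_expired(self, scanner_id):
        """Mark one scanner for expiry; return the ids purged as a result."""
        group = self._group_of[scanner_id]
        self._marked[group].add(scanner_id)
        if self._marked[group] != self._members[group]:
            return []  # other members are still live, so keep everything
        purged = sorted(self._members.pop(group))
        del self._marked[group]
        for sid in purged:
            del self._group_of[sid]
        return purged


registry = ScannerGroupRegistry()
registry.register(1, "join")  # e.g. the idle scanner on the small table
registry.register(2, "join")  # e.g. the busy scanner on the large table

first = registry.mark_expired(1)
still_usable = registry.is_usable(1)
second = registry.mark_expired(2)
print(first, still_usable, second)
```

In the join scenario from the original report, the idle scanner on the small table would be marked first but would remain valid until the scanner on the large table was also marked.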

GoogleCodeExporter commented 9 years ago

Original comment by nuggetwh...@gmail.com on 11 Apr 2010 at 3:57

GoogleCodeExporter commented 9 years ago

Original comment by nuggetwh...@gmail.com on 14 Jan 2012 at 8:32