linearregression / hypertable

Automatically exported from code.google.com/p/hypertable
GNU General Public License v2.0

RangeServer crash during compaction #646

Status: Open
GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Fixed the merging algorithm and pushed out to Ayima.  After a few hours, the 
RangeServer crashed with the following stack trace:

Thread 1 (process 30232):
#0  0x000000000050244c in Hypertable::AccessGroup::unstage_compaction ()
#1  0x0000000000599b63 in Hypertable::Range::compact ()
#2  0x00000000004e950e in Hypertable::MaintenanceQueue::Worker::operator() ()
#3  0x00007f9504ff1545 in thread_proxy () from /opt/hypertable/current/lib/libboost_thread.so.1.44.0
#4  0x00007f95040eefc7 in start_thread () from /lib/libpthread.so.0
#5  0x00007f950322664d in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

The only other active thread was this one:

Thread 56 (process 30233):
#0  0x0000000000588e5b in std::__push_heap<__gnu_cxx::__normal_iterator<Hypertable::MergeScanner::ScannerState*, std::vector<Hypertable::MergeScanner::ScannerState, std::allocator<Hypertable::MergeScanner::ScannerState> > >, long, Hypertable::MergeScanner::ScannerState, Hypertable::MergeScanner::LtScannerState> ()
#1  0x000000000058915f in std::__adjust_heap<__gnu_cxx::__normal_iterator<Hypertable::MergeScanner::ScannerState*, std::vector<Hypertable::MergeScanner::ScannerState, std::allocator<Hypertable::MergeScanner::ScannerState> > >, long, Hypertable::MergeScanner::ScannerState, Hypertable::MergeScanner::LtScannerState> ()
#2  0x0000000000584259 in Hypertable::MergeScanner::forward ()
#3  0x0000000000507e62 in Hypertable::AccessGroup::run_compaction ()
#4  0x0000000000599aba in Hypertable::Range::compact ()
#5  0x00000000004e950e in Hypertable::MaintenanceQueue::Worker::operator() ()
#6  0x00007f9504ff1545 in thread_proxy () from /opt/hypertable/current/lib/libboost_thread.so.1.44.0
#7  0x00007f95040eefc7 in start_thread () from /lib/libpthread.so.0
#8  0x00007f950322664d in clone () from /lib/libc.so.6
#9  0x0000000000000000 in ?? ()
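For context, the Thread 56 frames (std::__push_heap / std::__adjust_heap inside MergeScanner::forward) indicate a heap of per-scanner states that is re-heapified each time the merge advances. A minimal sketch of that k-way merge pattern follows; the names (ScannerState, LtScannerState, merge_runs) are illustrative stand-ins, not Hypertable's actual definitions:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// One sorted input run plus a cursor into it.
struct ScannerState {
    const std::vector<std::string>* cells;  // one sorted input run
    size_t pos;                             // next cell in that run
    const std::string& key() const { return (*cells)[pos]; }
};

// std::*_heap builds a max-heap, so invert the comparison to
// always pop the scanner holding the smallest key.
struct LtScannerState {
    bool operator()(const ScannerState& a, const ScannerState& b) const {
        return a.key() > b.key();
    }
};

std::vector<std::string>
merge_runs(const std::vector<std::vector<std::string>>& runs) {
    std::vector<ScannerState> heap;
    for (const auto& r : runs)
        if (!r.empty())
            heap.push_back(ScannerState{&r, 0});
    std::make_heap(heap.begin(), heap.end(), LtScannerState());

    std::vector<std::string> out;
    while (!heap.empty()) {
        // Pop the scanner with the smallest current key.
        std::pop_heap(heap.begin(), heap.end(), LtScannerState());
        ScannerState s = heap.back();
        heap.pop_back();
        out.push_back(s.key());
        // Advance that scanner and, if not exhausted, re-push it.
        if (++s.pos < s.cells->size()) {
            heap.push_back(s);
            std::push_heap(heap.begin(), heap.end(), LtScannerState());
        }
    }
    return out;
}
```

A crash inside the heap maintenance calls typically points at a comparator reading a ScannerState whose underlying scanner was already invalidated, which is consistent with the merge fix mentioned above being involved.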

Original issue reported on code.google.com by nuggetwh...@gmail.com on 12 Jul 2011 at 10:54

GoogleCodeExporter commented 9 years ago
Tail of RangeServer log:

1310510094 INFO Hypertable.RangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:557) Starting Merging Compaction of 1/18[pl.digart.wolis/zoom/2779310/daydream.html:http..pl.digart.zabaniak13/zoom/913339/PROCESJA.html:http](linkinfo)
1310510098 INFO Hypertable.RangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:2547) Entering get_statistics()
1310510104 INFO Hypertable.RangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/RSStats.h:84) Maintenance stats scans=(0 0 0 0.000000) updates=(0 0 0 0.000000 0)
1310510114 INFO Hypertable.RangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:2775) Exiting get_statistics()
1310510114 INFO Hypertable.RangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/MaintenanceScheduler.cc:240) Memory Statistics (MB): VM=4465.83, RSS=3002.91, tracked=3075.94, computed=3042.05 limit=4761.60
1310510114 INFO Hypertable.RangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/MaintenanceScheduler.cc:245) Memory Allocation: BlockCache=0.01% BlockIndex=0.48% BloomFilter=2.89% CellCache=95.05% ShadowCache=0.00% QueryCache=1.57%
1310510114 INFO Hypertable.RangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:3263) Memory Usage: 3225609989 bytes
1310510114 ERROR Hypertable.RangeServer : run_compaction (/root/src/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:784): 1/18[pl.digart.wolis/zoom/2779310/daydream.html:http..pl.digart.zabaniak13/zoom/913339/PROCESJA.html:http](linkinfo) Hypertable::Exception: Problem writing to DFS file '/hypertable/tables/1/18/linkinfo/C3VtmizkFN9dzTvZ/cs201' : java.io.IOException: Bad connect ack with firstBadLink as 10.6.0.108:50010 - DFS BROKER i/o error
    at virtual void Hypertable::CellStoreV5::add(const Hypertable::Key&, Hypertable::ByteString) (/root/src/hypertable/src/cc/Hypertable/RangeServer/CellStoreV5.cc:413)
1310510114 ERROR Hypertable.RangeServer : ~CellStoreV5 (/root/src/hypertable/src/cc/Hypertable/RangeServer/CellStoreV5.cc:83): Hypertable::Exception: Error closing DFS fd: 363239 - DFS BROKER i/o error
    at virtual void Hypertable::DfsBroker::Client::close(int32_t) (/root/src/hypertable/src/cc/DfsBroker/Lib/Client.cc:232)
    at virtual void Hypertable::DfsBroker::Client::close(int32_t) (/root/src/hypertable/src/cc/DfsBroker/Lib/Client.cc:229): java.io.IOException: Bad connect ack with firstBadLink as 10.6.0.108:50010
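The Thread 1 crash lands in AccessGroup::unstage_compaction on the error path after this failed DFS write, i.e. during cleanup of a compaction that never completed. One generic way to keep a stage/unstage pair exception-safe is an RAII guard; the sketch below is purely illustrative (CompactionGuard, run_compaction, and the flag parameter are hypothetical, not Hypertable's actual mechanism):

```cpp
#include <cassert>
#include <stdexcept>

// RAII guard: staging happens on construction, unstaging happens in the
// destructor unless the compaction was explicitly committed.  This makes
// the unstage run exactly once even when an exception unwinds the stack.
class CompactionGuard {
    bool staged_ = false;
    bool* unstaged_flag_;  // test hook: records that unstage ran
public:
    explicit CompactionGuard(bool* flag) : unstaged_flag_(flag) { stage(); }
    ~CompactionGuard() { if (staged_) unstage(); }
    void stage()   { staged_ = true; }
    void commit()  { staged_ = false; }  // success: destructor becomes a no-op
    void unstage() { staged_ = false; *unstaged_flag_ = true; }
};

// Simulated compaction: on failure it throws, mirroring the
// "DFS BROKER i/o error" path in the log above.
bool run_compaction(bool fail, bool* unstaged) {
    CompactionGuard guard(unstaged);
    if (fail)
        throw std::runtime_error("DFS BROKER i/o error");
    guard.commit();
    return true;
}
```

If the real unstage path instead assumes state that the failed write already tore down (for example a CellStore destroyed by the ~CellStoreV5 error above), dereferencing it there would produce exactly this kind of crash.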

Tail of DfsBroker log:

INFO: Opening file '/hypertable/tables/1/18/linkinfo/C3VtmizkFN9dzTvZ/cs197' flags=1 bs=0 handle = 363161
11/07/12 23:35:09 INFO hdfs.DFSClient: Failed to connect to /10.6.0.107:50010, add to deadNodes and continue
java.io.IOException: Got error in response to OP_READ_BLOCK self=/10.6.0.107:39961, remote=/10.6.0.107:50010 for file /hypertable/tables/1/18/linkinfo/C3VtmizkFN9dzTvZ/cs197 for block 17671863155901731_8899996
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1487)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1811)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1948)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at org.hypertable.DfsBroker.hadoop.HdfsBroker.Read(HdfsBroker.java:354)
    at org.hypertable.DfsBroker.hadoop.RequestHandlerRead.run(RequestHandlerRead.java:55)
    at org.hypertable.AsyncComm.ApplicationQueue$Worker.run(ApplicationQueue.java:98)
    at java.lang.Thread.run(Thread.java:662)
12-Jul-2011 23:35:09 org.hypertable.DfsBroker.hadoop.HdfsBroker Open
INFO: Opening file '/hypertable/tables/1/18/default/bJAFDae9lcWi16FI/cs230' flags=1 bs=0 handle = 363162
11/07/12 23:35:09 INFO hdfs.DFSClient: Failed to connect to /10.6.0.107:50010, add to deadNodes and continue
java.io.IOException: Got error in response to OP_READ_BLOCK self=/10.6.0.107:39964, remote=/10.6.0.107:50010 for file /hypertable/tables/1/18/default/bJAFDae9lcWi16FI/cs230 for block 8480882358091361197_8714168
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1487)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1811)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1948)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at org.hypertable.DfsBroker.hadoop.HdfsBroker.Read(HdfsBroker.java:354)
    at org.hypertable.DfsBroker.hadoop.RequestHandlerRead.run(RequestHandlerRead.java:55)
    at org.hypertable.AsyncComm.ApplicationQueue$Worker.run(ApplicationQueue.java:98)
    at java.lang.Thread.run(Thread.java:662)
12-Jul-2011 23:35:10 org.hypertable.DfsBroker.hadoop.HdfsBroker Open
INFO: Opening file '/hypertable/tables/1/18/linkinfo/C3VtmizkFN9dzTvZ/cs191' flags=1 bs=0 handle = 363163
11/07/12 23:35:10 INFO hdfs.DFSClient: Failed to connect to /10.6.0.107:50010, add to deadNodes and continue
java.io.IOException: Got error in response to OP_READ_BLOCK self=/10.6.0.107:39967, remote=/10.6.0.107:50010 for file /hypertable/tables/1/18/linkinfo/C3VtmizkFN9dzTvZ/cs191 for block 5520416025198979193_8893526
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1487)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1811)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1948)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at org.hypertable.DfsBroker.hadoop.HdfsBroker.Read(HdfsBroker.java:354)
    at org.hypertable.DfsBroker.hadoop.RequestHandlerRead.run(RequestHandlerRead.java:55)
    at org.hypertable.AsyncComm.ApplicationQueue$Worker.run(ApplicationQueue.java:98)
    at java.lang.Thread.run(Thread.java:662)
12-Jul-2011 23:35:10 org.hypertable.DfsBroker.hadoop.HdfsBroker Open
INFO: Opening file '/hypertable/tables/1/18/default/bJAFDae9lcWi16FI/cs349' flags=1 bs=0 handle = 363164
12-Jul-2011 23:35:10 org.hypertable.DfsBroker.hadoop.HdfsBroker Open
INFO: Opening file '/hypertable/tables/1/18/linkinfo/C3VtmizkFN9dzTvZ/cs69' flags=1 bs=0 handle = 363165
11/07/12 23:35:10 WARN hdfs.DFSClient: Failed to connect to /10.6.0.107:50010 for file /hypertable/tables/1/18/default/bJAFDae9lcWi16FI/cs282 for block -373761920656548797:java.io.IOException: Premeture EOF from inputStream
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:104)
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.readChunk(DFSClient.java:1389)
    at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:237)
    at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
    at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1249)
    at org.apache.hadoop.fs.FSInputChecker.readFully(FSInputChecker.java:384)
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.readAll(DFSClient.java:1522)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockByteRange(DFSClient.java:2047)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2116)
    at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:46)
    at org.hypertable.DfsBroker.hadoop.HdfsBroker.PositionRead(HdfsBroker.java:450)
    at org.hypertable.DfsBroker.hadoop.RequestHandlerPositionRead.run(RequestHandlerPositionRead.java:60)
    at org.hypertable.AsyncComm.ApplicationQueue$Worker.run(ApplicationQueue.java:98)
    at java.lang.Thread.run(Thread.java:662)

12-Jul-2011 23:35:10 org.hypertable.DfsBroker.hadoop.HdfsBroker Open
INFO: Opening file '/hypertable/tables/1/18/default/bJAFDae9lcWi16FI/cs282' flags=1 bs=0 handle = 363166
11/07/12 23:35:10 WARN hdfs.DFSClient: Exception while reading from blk_-373761920656548797_8767713 of /hypertable/tables/1/18/default/bJAFDae9lcWi16FI/cs282 from 10.6.0.107:50010: java.io.IOException: Premeture EOF from inputStream
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:104)
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.readChunk(DFSClient.java:1389)
    at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:237)
    at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
    at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1249)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1899)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1951)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at org.hypertable.DfsBroker.hadoop.HdfsBroker.Read(HdfsBroker.java:354)
    at org.hypertable.DfsBroker.hadoop.RequestHandlerRead.run(RequestHandlerRead.java:55)
    at org.hypertable.AsyncComm.ApplicationQueue$Worker.run(ApplicationQueue.java:98)
    at java.lang.Thread.run(Thread.java:662)

12-Jul-2011 23:35:10 org.hypertable.DfsBroker.hadoop.HdfsBroker Open
INFO: Opening file '/hypertable/tables/1/18/linkinfo/C3VtmizkFN9dzTvZ/cs23' flags=1 bs=0 handle = 363167
11/07/12 23:35:10 WARN hdfs.DFSClient: Failed to connect to /10.6.0.107:50010 for file /hypertable/tables/1/18/default/bJAFDae9lcWi16FI/cs301 for block -317810264165968674:java.io.IOException: Premeture EOF from inputStream
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:104)
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.readChunk(DFSClient.java:1389)
    at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:237)
    at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
    at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1249)
    at org.apache.hadoop.fs.FSInputChecker.readFully(FSInputChecker.java:384)
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.readAll(DFSClient.java:1522)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockByteRange(DFSClient.java:2047)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2116)
    at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:46)
    at org.hypertable.DfsBroker.hadoop.HdfsBroker.PositionRead(HdfsBroker.java:450)
    at org.hypertable.DfsBroker.hadoop.RequestHandlerPositionRead.run(RequestHandlerPositionRead.java:60)
    at org.hypertable.AsyncComm.ApplicationQueue$Worker.run(ApplicationQueue.java:98)
    at java.lang.Thread.run(Thread.java:662)

[...]

Original comment by nuggetwh...@gmail.com on 12 Jul 2011 at 11:04

GoogleCodeExporter commented 9 years ago

Original comment by nuggetwh...@gmail.com on 14 Jan 2012 at 8:33