scrapinghub / frontera

A scalable frontier for web crawlers
BSD 3-Clause "New" or "Revised" License

db worker and strategic worker crashed unexpectedly #373

Closed: ghost closed this issue 5 years ago

ghost commented 5 years ago

I start the Thrift server with this command: hbase-daemon.sh start thrift -c -nonblocking

worker config

HBASE_USE_FRAMED_COMPACT = True
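
For readers unfamiliar with this setting, here is a minimal sketch of how a flag like HBASE_USE_FRAMED_COMPACT could translate into happybase connection arguments. The helper names thrift_connection_kwargs and connect are hypothetical and are not Frontera's actual code; the sketch only relies on happybase.Connection accepting transport and protocol keyword arguments. The client and the Thrift server have to agree on both values.

```python
def thrift_connection_kwargs(use_framed_compact):
    """Map a boolean setting to happybase.Connection keyword arguments.

    Hypothetical helper: framed transport plus compact protocol when the
    flag is set, otherwise happybase's defaults (buffered and binary).
    """
    if use_framed_compact:
        return {"transport": "framed", "protocol": "compact"}
    return {"transport": "buffered", "protocol": "binary"}


def connect(host, port, use_framed_compact):
    """Open a happybase connection that matches the worker setting."""
    import happybase  # imported lazily so the helper above works without it

    kwargs = thrift_connection_kwargs(use_framed_compact)
    return happybase.Connection(host=host, port=port, **kwargs)
```

With the worker config above, connect("localhost", 9090, True) would request a framed, compact connection; the host and port here are assumptions for illustration.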

dbw.py

ERROR:db-worker.batchgen:Exception in the main loop
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/__init__.py", line 78, in loop
    is_backoff_needed = self.run()
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/batch_generator.py", line 61, in run
    for partition_id in partitions)
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/batch_generator.py", line 61, in <genexpr>
    for partition_id in partitions)
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/batch_generator.py", line 73, in _handle_partition
    partitions=[partition_id]):
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/contrib/backends/hbase/__init__.py", line 596, in get_next_requests
    max_requests_per_host=self._max_requests_per_host)
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/contrib/backends/hbase/__init__.py", line 238, in get_next_requests
    for rk, data in scan_gen:
  File "/opt/anaconda3/lib/python3.7/site-packages/happybase/table.py", line 402, in scan
    self.name, scan, {})
  File "/opt/anaconda3/lib/python3.7/site-packages/thriftpy2/thrift.py", line 200, in _req
    self._send(_api, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/thriftpy2/thrift.py", line 212, in _send
    self._oprot.trans.flush()
  File "thriftpy2/transport/framed/cyframed.pyx", line 107, in thriftpy2.transport.framed.cyframed.TCyFramedTransport.flush
  File "thriftpy2/transport/framed/cyframed.pyx", line 95, in thriftpy2.transport.framed.cyframed.TCyFramedTransport.c_flush
  File "/opt/anaconda3/lib/python3.7/site-packages/thriftpy2/transport/socket.py", line 136, in write
    self.sock.sendall(buff)
BrokenPipeError: [Errno 32] Broken pipe

sw.py

[strategy-worker] [Errno 32] Broken pipe
NoneType: None
ERROR:strategy-worker:[Errno 32] Broken pipe
NoneType: None
[strategy-worker]   File "/home/hduser/.local/lib/python3.7/site-packages/twisted/internet/defer.py", line 151, in maybeDeferred
    result = f(*args, **kw)
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/strategy.py", line 192, in work
    self.workflow.process()
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/strategy.py", line 47, in process
    self.states_context.fetch()
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/core/manager.py", line 815, in fetch
    self.states.fetch(self._fingerprints)
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/contrib/backends/hbase/__init__.py", line 356, in fetch
    records = table.rows(keys, columns=[b's:state'])
  File "/opt/anaconda3/lib/python3.7/site-packages/happybase/table.py", line 162, in rows
    self.name, rows, columns, {})
  File "/opt/anaconda3/lib/python3.7/site-packages/thriftpy2/thrift.py", line 200, in _req
    self._send(_api, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/thriftpy2/thrift.py", line 212, in _send
    self._oprot.trans.flush()
  File "thriftpy2/transport/framed/cyframed.pyx", line 107, in thriftpy2.transport.framed.cyframed.TCyFramedTransport.flush
    self.c_flush()
  File "thriftpy2/transport/framed/cyframed.pyx", line 95, in thriftpy2.transport.framed.cyframed.TCyFramedTransport.c_flush
    self.trans.write(size_str[:4] + data)
  File "/opt/anaconda3/lib/python3.7/site-packages/thriftpy2/transport/socket.py", line 136, in write
    self.sock.sendall(buff)

A screen recording : https://www.dropbox.com/s/xd61st39e60buo1/final_5d03df6c62558100144de067_986170.mp4?dl=0

sw.py does not crash in the screen recording, but it does crash sometimes.
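
Both tracebacks above end in a BrokenPipeError raised while writing to the Thrift socket. As a possible client-side workaround (not something Frontera is known to do), a call can be retried after reopening the connection. The sketch below is generic: call_with_reconnect, operation and reconnect are hypothetical names, and it does not address why the server closed the connection in the first place.

```python
def call_with_reconnect(operation, reconnect, retries=1):
    """Run operation(); on BrokenPipeError, call reconnect() and try again.

    operation and reconnect are caller-supplied callables (assumptions of
    this sketch). After `retries` failed retries the error is re-raised.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except BrokenPipeError:
            if attempt >= retries:
                raise
            attempt += 1
            reconnect()
```

With happybase, reconnect could close and reopen the Connection, and operation could wrap a call such as table.rows(...). A scan is a generator, so the whole iteration would need to sit inside operation for a retry to be meaningful.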

https://github.com/scrapinghub/frontera/issues/266

My HBase config

<property>
<name>hbase.regionserver.thrift.framed</name>
<value>true</value>
<source>hbase-site.xml</source>
</property>

<property>
<name>hbase.regionserver.thrift.compact</name>
<value>true</value>
<source>hbase-site.xml</source>
</property>
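
The two properties above can also be checked programmatically before starting the workers. This is a small sketch using only the standard library; the function names are hypothetical, and wrapping the fragment in a configuration root element is an assumption made so that it parses as a single XML document.

```python
import xml.etree.ElementTree as ET

# The two properties as pasted in this issue.
SNIPPET = """
<property>
<name>hbase.regionserver.thrift.framed</name>
<value>true</value>
<source>hbase-site.xml</source>
</property>
<property>
<name>hbase.regionserver.thrift.compact</name>
<value>true</value>
<source>hbase-site.xml</source>
</property>
"""


def read_properties(xml_fragment):
    """Return a {name: value} dict from a run of <property> elements."""
    root = ET.fromstring("<configuration>" + xml_fragment + "</configuration>")
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def server_uses_framed_compact(props):
    """True when both the framed and the compact property are 'true'."""
    return (
        props.get("hbase.regionserver.thrift.framed") == "true"
        and props.get("hbase.regionserver.thrift.compact") == "true"
    )
```

If server_uses_framed_compact(read_properties(SNIPPET)) is true, the worker should run with HBASE_USE_FRAMED_COMPACT = True so that both sides use the same transport and protocol.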

Second problem: the error thriftpy2.thrift.TApplicationException: Missing result also occurs sometimes.

[db-worker.batchgen] Exception in the main loop
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/__init__.py", line 78, in loop
    is_backoff_needed = self.run()
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/batch_generator.py", line 61, in run
    for partition_id in partitions)
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/batch_generator.py", line 61, in <genexpr>
    for partition_id in partitions)
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/batch_generator.py", line 73, in _handle_partition
    partitions=[partition_id]):
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/contrib/backends/hbase/__init__.py", line 596, in get_next_requests
    max_requests_per_host=self._max_requests_per_host)
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/contrib/backends/hbase/__init__.py", line 238, in get_next_requests
    for rk, data in scan_gen:
  File "/opt/anaconda3/lib/python3.7/site-packages/happybase/table.py", line 402, in scan
    self.name, scan, {})
  File "/opt/anaconda3/lib/python3.7/site-packages/thriftpy2/thrift.py", line 203, in _req
    return self._recv(_api)
  File "/opt/anaconda3/lib/python3.7/site-packages/thriftpy2/thrift.py", line 239, in _recv
    raise TApplicationException(TApplicationException.MISSING_RESULT)
thriftpy2.thrift.TApplicationException: Missing result
ERROR:db-worker.batchgen:Exception in the main loop
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/__init__.py", line 78, in loop
    is_backoff_needed = self.run()
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/batch_generator.py", line 61, in run
    for partition_id in partitions)
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/batch_generator.py", line 61, in <genexpr>
    for partition_id in partitions)
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/worker/components/batch_generator.py", line 73, in _handle_partition
    partitions=[partition_id]):
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/contrib/backends/hbase/__init__.py", line 596, in get_next_requests
    max_requests_per_host=self._max_requests_per_host)
  File "/opt/anaconda3/lib/python3.7/site-packages/frontera/contrib/backends/hbase/__init__.py", line 238, in get_next_requests
    for rk, data in scan_gen:
  File "/opt/anaconda3/lib/python3.7/site-packages/happybase/table.py", line 402, in scan
    self.name, scan, {})
  File "/opt/anaconda3/lib/python3.7/site-packages/thriftpy2/thrift.py", line 203, in _req
    return self._recv(_api)
  File "/opt/anaconda3/lib/python3.7/site-packages/thriftpy2/thrift.py", line 239, in _recv
    raise TApplicationException(TApplicationException.MISSING_RESULT)
thriftpy2.thrift.TApplicationException: Missing result

[Image: Screenshot from 2019-06-29 20-53-49]

sibiryakov commented 5 years ago

There are some problems in thriftpy when used with Python 3.7. I would suggest rolling back to Python 3.6 until there is a solution for this.

ghost commented 5 years ago

@sibiryakov I found a solution for this problem: start the Thrift server with the command hbase thrift start -c -nonblocking instead of hbase-daemon.sh start thrift -c -nonblocking.
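
After starting the Thrift server either way, a quick probe can confirm that something is listening before the workers are launched. This is a sketch only: thrift_port_open is a hypothetical helper, and localhost and port 9090 are assumed defaults that may differ in your setup. It checks TCP reachability, not that the transport and protocol settings match.

```python
import socket


def thrift_port_open(host="localhost", port=9090, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```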