pystorm / streamparse

Run Python in Apache Storm topologies. Pythonic API, CLI tooling, and a topology DSL.
http://streamparse.readthedocs.io/
Apache License 2.0

pystorm.serializers.serializer - ERROR without fail #404

Closed by fedelemantuano 6 years ago

fedelemantuano commented 7 years ago

I have a strange issue. The serializer fails to decode a tuple, but the code never calls the fail method, so I can't identify the input that caused the error. I'm using Apache Storm 1.1.0 and streamparse 3.11. This is the error:

2017-11-05 14:02:33,214 - pystorm.serializers.serializer - ERROR - Failed to send message: {u'need_task_ids': False, u'command': u'emit', u'anchors': [u'7272527535659655945'], u'stream': u'mail', u'tuple': [u'7557d04d45bc4bd0ec061c86d049895d9237b516138449c182c265ae2c721843_7432613559', ...
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/pystorm/serializers/serializer.py", line 33, in send_message
    self.output_stream.write(self.serialize_dict(msg_dict))
  File "/usr/local/lib/python2.7/dist-packages/pystorm/serializers/json_serializer.py", line 93, in serialize_dict
    serialized = json.dumps(msg_dict, namedtuple_as_object=False)
  File "/usr/local/lib/python2.7/dist-packages/simplejson/__init__.py", line 397, in dumps
    **kw).encode(obj)
  File "/usr/local/lib/python2.7/dist-packages/simplejson/encoder.py", line 291, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/lib/python2.7/dist-packages/simplejson/encoder.py", line 373, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbe in position 1: invalid start byte
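
The traceback suggests that one of the tuple values is a byte string that is not valid UTF-8, so the JSON encoder fails before the message reaches Storm. A minimal sketch of one possible workaround, written for Python 3, is to convert byte strings to text before emitting. The helper name to_text and the sample values are hypothetical and are not part of pystorm or streamparse:

```python
import json


def to_text(value, encoding="utf-8"):
    """Recursively convert byte strings to text, replacing undecodable bytes.

    Hypothetical helper for illustration; not part of pystorm or streamparse.
    """
    if isinstance(value, bytes):
        # errors="replace" substitutes U+FFFD for any byte that cannot be decoded
        return value.decode(encoding, errors="replace")
    if isinstance(value, (list, tuple)):
        return [to_text(item, encoding) for item in value]
    if isinstance(value, dict):
        return {to_text(k, encoding): to_text(v, encoding) for k, v in value.items()}
    return value


# Made-up tuple values: the second one contains 0xbe, which is not a valid
# UTF-8 start byte (the same byte reported in the traceback above).
raw_values = ["mail-id", b"a\xbeb"]
safe_values = to_text(raw_values)

# The sanitized values can now be serialized to JSON without raising.
payload = json.dumps({"command": "emit", "tuple": safe_values})
```

Replacing bad bytes loses information, so this only fits cases where an approximate value is acceptable. Otherwise the bolt could catch the error and fail the tuple explicitly.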

Bye

fedelemantuano commented 6 years ago

I'm closing this issue because I found that a library used by my tool calls os.kill. That call kills the bolt's Python process, so the bolt never gets the chance to fail the tuple. In production, Apache Storm restarts the cluster and the same tuple enters the topology again. This loop blocks everything.
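
To illustrate why no fail is ever sent, here is a small sketch for POSIX systems and Python 3.7+. It assumes the library sends SIGKILL to its own process (the comment above does not say which signal is used). The child script is a made-up stand-in for a bolt, and its print calls mark where a fail call or cleanup would normally run:

```python
import signal
import subprocess
import sys
import textwrap

# Hypothetical stand-in for a bolt whose processing code ends up calling os.kill.
CHILD = textwrap.dedent("""
    import os, signal, sys
    try:
        os.kill(os.getpid(), signal.SIGKILL)  # assumption: the library uses SIGKILL
    except BaseException:
        print("fail(tup) would be called here")
    finally:
        print("cleanup ran")
        sys.stdout.flush()
""")

result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,
    text=True,
)

# On POSIX, a negative return code means the child was terminated by that signal.
killed_by_signal = result.returncode == -signal.SIGKILL
# Neither the except nor the finally branch gets to run, so nothing is printed.
handlers_ran = bool(result.stdout)
```

Because the process dies without running any Python-level handlers, the tuple is neither acked nor failed by the bolt, which matches the replay loop described above.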