Closed: EchoShoot closed this issue 12 months ago
Hi @EchoShoot, could you provide a stack trace of the exception?
I'm seeing the same issue:
INFO:strategy-worker:Seeds addition started from url file:///Users/andy/Documents/workspace/huntsman/huntsman/seeds.txt
Traceback (most recent call last):
File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/andy/Documents/workspace/huntsman/vev/lib/python3.6/site-packages/frontera/worker/strategy.py", line 391, in <module>
worker.run(seeds_url)
File "/Users/andy/Documents/workspace/huntsman/vev/lib/python3.6/site-packages/frontera/worker/stats.py", line 45, in run
super(StatsExportMixin, self).run(*args, **kwargs)
File "/Users/andy/Documents/workspace/huntsman/vev/lib/python3.6/site-packages/frontera/worker/strategy.py", line 258, in run
self.add_seeds(seeds_url)
File "/Users/andy/Documents/workspace/huntsman/vev/lib/python3.6/site-packages/frontera/worker/strategy.py", line 224, in add_seeds
strategy.read_seeds(fh)
File "/Users/andy/Documents/workspace/huntsman/vev/lib/python3.6/site-packages/frontera/strategy/basic.py", line 10, in read_seeds
self.schedule(r)
File "/Users/andy/Documents/workspace/huntsman/vev/lib/python3.6/site-packages/frontera/strategy/__init__.py", line 122, in schedule
self._scheduled_stream.send(request, score, dont_queue)
File "/Users/andy/Documents/workspace/huntsman/vev/lib/python3.6/site-packages/frontera/core/manager.py", line 790, in send
self._producer.send(None, encoded)
File "/Users/andy/Documents/workspace/huntsman/vev/lib/python3.6/site-packages/frontera/contrib/messagebus/kafkabus.py", line 103, in send
self._producer.send(self._topic, value=msg)
File "/Users/andy/Documents/workspace/huntsman/vev/lib/python3.6/site-packages/kafka/producer/kafka.py", line 552, in send
assert type(value_bytes) in (bytes, bytearray, memoryview, type(None))
AssertionError
The Kafka library appears to be expecting bytes, but the JSON codec emits a str.
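The mismatch can be reproduced without a broker. The sketch below copies the type check from the assertion at the bottom of the traceback and applies it to the output of json.dumps; the helper name is_valid_kafka_value and the payload are hypothetical, made up for illustration, and this is not Frontera's actual codec code.

```python
import json


def is_valid_kafka_value(value_bytes):
    # Same condition as the assertion quoted in the traceback
    # (kafka/producer/kafka.py); the function name is hypothetical.
    return type(value_bytes) in (bytes, bytearray, memoryview, type(None))


# Hypothetical message payload, for illustration only.
payload = {"url": "http://example.com/", "score": 1.0}

encoded_str = json.dumps(payload)            # json.dumps returns str on Python 3
encoded_bytes = encoded_str.encode("utf-8")  # explicit encoding yields bytes

print(type(encoded_str).__name__, is_valid_kafka_value(encoded_str))
print(type(encoded_bytes).__name__, is_valid_kafka_value(encoded_bytes))
```

On Python 3 the str value fails the check and the encoded value passes it, which matches the AssertionError in the traceback.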
It is a version issue. On Python 3 you can encode the string to UTF-8 before sending,
for example: result = producer.send('topic-sentiments', objString.encode('utf-8'))
So I got my sample code to work using the encode('utf-8') option as suggested by @amitsing89, but is this problem ever going to be fixed within the package itself? Or has Python 3 introduced some change (I believe in how strings are handled by default) that means this package (and all others that use it, for example kafka-python) will force users to UTF-8 encode every single message sent to Kafka from an application?
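On Python 3, str is Unicode text and is a distinct type from bytes, so some encoding step is needed somewhere. One way to avoid repeating it at every call site, assuming you construct the kafka-python producer yourself, is to pass a serializer through KafkaProducer's value_serializer parameter. The helper to_bytes below is hypothetical, and the broker address in the commented usage is a placeholder; this is a sketch, not a fix inside Frontera.

```python
def to_bytes(value, encoding="utf-8"):
    """Hypothetical helper: make a message value acceptable to the producer.

    Bytes-like values and None pass through unchanged; str is encoded.
    """
    if value is None or isinstance(value, (bytes, bytearray, memoryview)):
        return value
    if isinstance(value, str):
        return value.encode(encoding)
    raise TypeError("unsupported message type: %s" % type(value).__name__)


# Usage with kafka-python (not executed here; it needs a reachable broker,
# and "localhost:9092" is a placeholder address):
#
#   from kafka import KafkaProducer
#   producer = KafkaProducer(bootstrap_servers="localhost:9092",
#                            value_serializer=to_bytes)
#   producer.send("topic-sentiments", objString)  # encoded by to_bytes

print(to_bytes('{"sentiment": "ok"}'))
```

With the serializer in place, the producer encodes each value itself, so callers can pass str or bytes.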
2017-10-07 20:33:46 [kafka.coordinator] INFO: Discovered coordinator 0 for group fetchers-spider-feed
2017-10-07 20:33:51 [messagebus-backend] INFO: Consuming from partition id 0
2017-10-07 20:33:51 [manager] INFO: Frontier Manager Started!
2017-10-07 20:33:51 [manager] INFO: --------------------------------------------------------------------------------
2017-10-07 20:33:51 [frontera.contrib.scrapy.schedulers.FronteraScheduler] INFO: Starting frontier
2017-10-07 20:33:51 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
Jumping into debugger for post-mortem of exception 'value must be bytes':
The problem happened when I set MESSAGE_BUS_CODEC = 'frontera.contrib.backends.remote.codecs.json'.