Closed. aglagla closed this issue 8 years ago.
Thanks for reporting. I will have a look as soon as I find the time and will keep you updated on my progress. For a start, could you reduce the buffer_size in the config to 1 and see if you get any output?
No change with buffer_size set to 1. I also straced the process, filtering on connect calls:
$ sudo strace -e connect kafka_influxdb -c config.yaml -vvv
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24219, si_status=0, si_utime=0, si_stime=0} ---
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24221, si_status=0, si_utime=0, si_stime=0} ---
Traceback (most recent call last):
File "/usr/lib/python2.7/logging/__init__.py", line 851, in emit
msg = self.format(record)
File "/usr/lib/python2.7/logging/__init__.py", line 724, in format
return fmt.format(record)
File "/usr/lib/python2.7/logging/__init__.py", line 464, in format
record.message = record.getMessage()
File "/usr/lib/python2.7/logging/__init__.py", line 328, in getMessage
msg = msg % self.args
TypeError: not all arguments converted during string formatting
Logged from file kafka_influxdb.py, line 100
Traceback (most recent call last):
File "/usr/lib/python2.7/logging/__init__.py", line 851, in emit
msg = self.format(record)
File "/usr/lib/python2.7/logging/__init__.py", line 724, in format
return fmt.format(record)
File "/usr/lib/python2.7/logging/__init__.py", line 464, in format
record.message = record.getMessage()
File "/usr/lib/python2.7/logging/__init__.py", line 328, in getMessage
msg = msg % self.args
TypeError: not all arguments converted during string formatting
Logged from file kafka_influxdb.py, line 118
Traceback (most recent call last):
File "/usr/lib/python2.7/logging/__init__.py", line 851, in emit
msg = self.format(record)
File "/usr/lib/python2.7/logging/__init__.py", line 724, in format
return fmt.format(record)
File "/usr/lib/python2.7/logging/__init__.py", line 464, in format
record.message = record.getMessage()
File "/usr/lib/python2.7/logging/__init__.py", line 328, in getMessage
msg = msg % self.args
TypeError: not all arguments converted during string formatting
Logged from file kafka_influxdb.py, line 130
INFO:root:Connecting to InfluxDB at localhost:8086...
INFO:root:Creating database mydb if not exists
INFO:root:Creating InfluxDB database if not exists: mydb
INFO:urllib3.connectionpool:Starting new HTTP connection (1): localhost
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_INET, sin_port=htons(8086), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
DEBUG:urllib3.connectionpool:Setting read timeout to None
DEBUG:urllib3.connectionpool:"GET /query?q=CREATE+DATABASE+mydb&db=mydb HTTP/1.1" 200 72
INFO:root:database already exists
INFO:root:Listening for messages on Kafka topic aleph...
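(Side note on the repeated TypeError tracebacks above: that error typically means a logging call passed more arguments than its format string has placeholders. The logging module catches the exception internally and prints these tracebacks instead of the intended message, which is why the process keeps running. A minimal reproduction, unrelated to the project's actual code:)

```python
import logging

logging.basicConfig(level=logging.INFO)

# Buggy: two arguments, but only one %s placeholder. logging catches
# the resulting TypeError and prints "TypeError: not all arguments
# converted during string formatting" plus a traceback, as seen above.
logging.info("Connecting to %s...", "localhost", 8086)

# Fixed: one placeholder per argument.
logging.info("Connecting to %s:%s...", "localhost", 8086)
```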
I have the same problem after running setup.py. @mre do you have any idea what's going on?
@panda87 no idea yet. Did it work before for you or was it never working? Also did you try to run it with Docker?
@aglagla what happens when you try to read on a topic that does not exist? Will kafka-influxdb complain?
@mre It's the first time I've tried to use this project, so I don't have much experience with it. No, I didn't try to run it with Docker, since the docker-compose file contains other services that are already installed on my system. However, I'll try running it with Docker.
@mre the Docker container is stuck on "Starting to consume messages". No messages are consumed.
@panda87 @aglagla could you guys try again now? I think I found the bug:
In the beginning I had a simple handle_read() method. It would just read messages from Kafka and yield the values to the caller. So far so good.
After some time I discovered a bug where kafka-influxdb was not reconnecting after Kafka crashed. So I added retry handling for that case. The handler was a simple wrapper function around handle_read():
def read():
    # Reconnect on error
    while True:
        yield from handle_read()
The yield from returns all messages from handle_read(). Unfortunately this only worked in Python 3.4, so I removed it again.
def read():
    # Reconnect on error
    while True:
        handle_read()
What I should have done instead was:
def read():
    # Reconnect on error
    while True:
        for msg in handle_read():
            yield msg
So I forgot to yield the messages back to the caller. This is now fixed. Sorry for the inconvenience.
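The difference is easy to reproduce in isolation. A self-contained sketch (handle_read here is a stand-in for the real Kafka reader, and the infinite reconnect loop is dropped so the example terminates):

```python
def handle_read():
    """Stand-in for the real Kafka reader: yields a few messages."""
    for msg in ("m1", "m2", "m3"):
        yield msg

def read_buggy():
    """Mirrors the bug: handle_read() builds a generator that is
    immediately discarded, so the caller never sees any messages."""
    handle_read()  # generator created, never iterated
    if False:
        yield  # unreachable; only makes read_buggy a generator too

def read_fixed():
    """The Python 2 compatible fix: explicitly re-yield each message."""
    for msg in handle_read():
        yield msg

print(list(read_buggy()))  # -> []
print(list(read_fixed()))  # -> ['m1', 'm2', 'm3']
```

Calling a generator function only creates a generator object; unless the wrapper iterates it and yields the values onward, they silently vanish, which is exactly why the consumer looked idle.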
Many thanks for your feedback and sorry for not being more responsive. Traveling at the moment but will definitely be testing the new release in the coming weeks.
Cheers,
Alexis
Thanks @aglagla. I appreciate any feedback :smiley:
Thanks @mre, I'll check this now and let you know.
@mre I cloned the updated project and built the Dockerfile. During the build it complained about many issues; I'm copying part of them:
warning: no files found matching 'README.md'
no previously-included directories found matching 'documentation/_build'
zip_safe flag not set; analyzing archive contents...
six: module references __path__
In file included from ext/_yaml.c:343:0:
ext/_yaml.h:6:0: warning: "PyUnicode_FromString" redefined
#define PyUnicode_FromString(s) PyUnicode_DecodeUTF8((s), strlen(s), "strict")
^
In file included from /usr/local/include/python2.7/Python.h:85:0,
from ext/_yaml.c:16:
/usr/local/include/python2.7/unicodeobject.h:281:0: note: this is the location of the previous definition
# define PyUnicode_FromString PyUnicodeUCS4_FromString
^
ext/_yaml.c: In function ‘__pyx_pf_5_yaml_get_version_string’:
ext/_yaml.c:1346:17: warning: assignment discards ‘const’ qualifier from pointer target type
__pyx_v_value = yaml_get_version_string();
......
no previously-included directories found matching 'doc/.build'
zip_safe flag not set; analyzing archive contents...
zip_safe flag not set; analyzing archive contents...
tests.influxdb.client_test_with_server: module references __file__
/usr/local/lib/python2.7/site-packages/setuptools/dist.py:285: UserWarning: Normalizing '2015.09.06.2' to '2015.9.6.2'
normalized_version,
zip_safe flag not set; analyzing archive contents...
certifi.core: module references __file__
error: six 1.10.0 is installed but six==1.9.0 is required by set(['influxdb'])
Maybe other packages should be updated?
You are right @panda87, thanks for the hint. I hadn't updated the setup.py script with the new required packages.
This is fixed now: I'm loading them dynamically from the requirements.txt file, which avoids duplicated code and a lot of confusion. Would you mind doing another test run? Thanks for your patience, btw.
Thanks @mre, I see progress. I'm still getting the same errors, but the image eventually built.
Now I'm getting this:
Waiting for Kafka connection at kafka:9092. This may take a while...
Waiting for InfluxDB connection at influxdb:8086. This may take a while...
Starting to consume messages
Reading config file config_example.yaml
Traceback (most recent call last):
File "/usr/local/bin/kafka_influxdb", line 9, in
My CollectD logs are JSON; is that a problem?
btw, 2 questions.
D.
Yeah, so the errors are actually C compiler warnings from a dependency, namely PyYAML 3.11. There's not much we can do here, except maybe fixing it upstream or switching to another library. For now it's fine, I guess.
The CollectD JSON format is currently not implemented, but if you send me a couple of sample messages, I can have a look and maybe write an encoder for it. Maybe we can handle this in a separate issue.
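(To illustrate roughly what such an encoder would do — this sketch is based on collectd's write_http JSON payload format, not on the project's actual encoder interface:)

```python
import json

def encode_collectd_json(raw):
    """Turn a collectd write_http JSON payload into InfluxDB
    line-protocol strings (simplified: minimal tags, no escaping)."""
    lines = []
    for entry in json.loads(raw):
        tags = "host=%s,type=%s" % (entry["host"], entry["type"])
        # Each dsname/value pair becomes one field on the measurement.
        for name, value in zip(entry["dsnames"], entry["values"]):
            lines.append("%s,%s %s=%s %d" % (
                entry["plugin"], tags, name, value,
                int(entry["time"] * 1e9)))  # nanosecond timestamp
    return lines

sample = json.dumps([{
    "host": "myhost", "plugin": "load", "type": "load",
    "time": 1444200000.0, "dsnames": ["shortterm"], "values": [0.25],
}])
print(encode_collectd_json(sample))
# -> ['load,host=myhost,type=load shortterm=0.25 1444200000000000000']
```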
As for your questions:
I created a new issue for the JSON encoder.
Thanks for your time, @mre. D.
Hi @mre
Finally got some time to test. After pulling the new release and installing it (thanks for the setup.py update), I ran my test with the same config.yaml as before, in debug mode, with a small set of collectd metrics, and everything works perfectly. I checked influxdb and, as expected, all the metrics are there.
Thanks again for your responsiveness, much appreciated.
Alexis
Nice! Thanks for your feedback. Closing.
Kafka jar: kafka_2.11-0.8.2.1.jar
Python mods: kafka-influxdb==0.5.0, kafka-python==0.9.4
config.yaml:
On a simple local configuration, the kafka broker is on localhost:9092, with influxdb and collectd running as well. Collectd is sending messages to kafka (checked with kafka-console-consumer.sh) and influxdb is accepting requests. However, this is what happens when starting kafka_influxdb in the foreground as follows: kafka_influxdb -c ./config.yaml -vvv
lsof -nn on the process gives the following output:
It doesn't seem to be talking to kafka or consuming any messages.
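(When debugging this kind of silence, one quick sanity check is whether the process can reach the broker at all — a plain TCP probe, independent of kafka-python; the host/port values below are just the defaults from this setup:)

```python
import socket

def can_connect(host, port, timeout=3):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        sock = socket.create_connection((host, port), timeout)
        sock.close()
        return True
    except (socket.error, OSError):
        return False

# e.g. can_connect("localhost", 9092)   # Kafka broker
#      can_connect("localhost", 8086)   # InfluxDB
```

If this returns True but no messages flow, the problem is in the consumer logic rather than in networking.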
Any help would be appreciated.
Regards,
Alexis