fangli / fluent-plugin-influxdb

A buffered output plugin for fluentd and InfluxDB
MIT License
111 stars 65 forks source link

Plugin should refuse to send empty tag values to InfluxDB Ruby client #37

Closed radcool closed 8 years ago

radcool commented 8 years ago

Considering that sending an InfluxDB line with empty tag values such as (notice the empty ident tag):

syslog,host=fwtpcore1b,ident= message=\"BLAHBLAHBLAH\" 1449233580

will result in InfluxDB returning a missing tag value error, I would say that the InfluxDB Ruby client should omit tags with empty values in the line altogether. https://github.com/influxdb/influxdb-ruby/issues/124 has been opened for this purpose.

That being said, ideally the Fluentd InfluxDB plugin should also prevent sending any empty tag to the Ruby client in the first place in order to catch the error as early as possible.

I have come upon this issue in cases where Syslog message were received with no ident field while having Fluentd output to InfluxDB.

nathanwebb commented 8 years ago

I'm using an output filter prior to sending it to this output filter, and added the following code to strip out any blanks before sending to fluent-plugin-influxdb:

record.delete_if { |key, value| value.to_s.strip == '' } 
radcool commented 8 years ago

Good idea @nathanwebb. Do you think though that this should still be a coded behaviour from within the Fluentd InfluxDB output plugin?

nathanwebb commented 8 years ago

Using forest has done the trick for me at the moment, but they caution against using if for "infinite" tags, so that might mean that there is a potential scaling issue with doing it that way. Using Forest, the configuration will grow over time, although I don't know if that limit is 10, 100, 1K or 1M tags :). On the other hand, using variable expansion within the influxdb plugin might not have this problem.

A potential issue might occur where some messages can't be sent. E.g. if you are using authentication with influxdb, and also use the fluentd tag to set the influxdb database name. If the user doesn't have write permissions on all of the databases, then some records can't be written. In that case, will the queue get blocked?

On 18 December 2015 at 00:11, Martin Gignac notifications@github.com wrote:

Good idea @nathanwebb https://github.com/nathanwebb. Do you think though that this should still be a coded behaviour from within the Fluentd InfluxDB output plugin?

— Reply to this email directly or view it on GitHub https://github.com/fangli/fluent-plugin-influxdb/issues/37#issuecomment-165448981 .

benschwarz commented 8 years ago

@radcool — I too want to see some activity around this :+1:

Here's a clear summary of what I'm seeing:

Here's what I think should happen:

Erroneous events should be discarded/dropped (with the preexisting error log entry), so that the buffer can still be processed.

Otherwise, encountering a single error blocks your entire event stream 😐

@fangli @repeatedly — Thoughts?

repeatedly commented 8 years ago

@benschwarz Do you use influxdb-ruby v0.2.4?

benschwarz commented 8 years ago

@repeatedly It was only released yesterday — No. What does 0.2.4 do to the above situation?

benschwarz commented 8 years ago

Here's what I think should happen:

Erroneous events should be discarded/dropped (with the preexisting error log entry), so that the buffer can still be processed.

An alternative to this would be dropping tags with nil values. (And storing the event). I'd be happy with that as an alternative.

repeatedly commented 8 years ago

@benschwarz New version merged this patch > https://github.com/influxdata/influxdb-ruby/pull/121 It escapes invalid values in the client side.

benschwarz commented 8 years ago

@repeatedly What does that mean? Valid values are escaped and it will cause what behaviour?

benschwarz commented 8 years ago

@repeatedly Still a bit unsure about what updating should do to fix this issue, but in my tests it does not fix anything. We need to make a change to this library.

I'm going make a PR that will:

[...] Dropping tags with nil values. (And storing the event).

If an event with the following appears:

syslog,host=fwtpcore1b,ident= message=\"BLAHBLAHBLAH\" 1449233580

Then the following will be stored:

yslog,host=fwtpcore1b message=\"BLAHBLAHBLAH\" 1449233580

Sound good @fangli @radcool ?

radcool commented 8 years ago

@benschwarz a fix for this behaviour was released just a few days ago on the InfluxDB Ruby client side (issue #124). I haven't tested that new release of the Ruby client, but if it fixes this behaviour then fixing it on the Fluentd InfluxDB plugin becomes "unnecessary". That being said, I still think your PR is valid since I don't see the harm in catching an "illegal" behaviour early in the chain.

benschwarz commented 8 years ago

@radcool

Since @repeatedly suggested that an update to 0.2.4 would help, I have upgraded fluent-plugin-influxdb to 0.2.4 and I have also upgraded influxdb to 0.2.4.

Events with errors are still going to be bundled up into a single request, and delivery to influxdb will still be attempted. There needs to be a change in this library too. The queue still can not be flushed once there is a tag without a value.

… Or have I missed something entirely? This issue is about "refusing empty tags", but ultimately, shouldn't it be about "fluent-plugin-influxdb blocks/will not be able to dispatch to influx after an event with an empty tag has entered the queue"?

repeatedly commented 8 years ago

Hmm... influxdb ruby should ignore nil values but it still doesn't it. If you send a patch, I will review it.

radcool commented 8 years ago

@benschwarz It's been a while since I played with Fluentd so I don't remember what behaviour I'd seen on the queue when tags missing values arrived, but I agree that this ocurrence shouldn't block the whole queue as basically one error ends up penalizing everything.

repeatedly commented 8 years ago

Fixed by #44