fluent / fluent-plugin-mongo

MongoDB input and output plugin for Fluentd
https://docs.fluentd.org/output/mongo

Insert in mongos : operate_invalid_records #23

Closed Kaimlo closed 11 years ago

Kaimlo commented 11 years ago

Everything worked after this patch https://github.com/fluent/fluent-plugin-mongo/issues/22#issuecomment-12267557 until I hit another error:

2013-01-15 17:49:50 +0100: temporarily failed to flush the buffer, next retry will be at 2013-01-15 17:49:43 +0100. error="String not valid UTF-8" instance=70329051781420
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/bson-1.6.4/lib/bson/bson_c.rb:25:in `serialize'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/bson-1.6.4/lib/bson/bson_c.rb:25:in `serialize'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/mongo-1.6.4/lib/mongo/collection.rb:972:in `block in insert_documents'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/mongo-1.6.4/lib/mongo/collection.rb:971:in `each'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/mongo-1.6.4/lib/mongo/collection.rb:971:in `insert_documents'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/mongo-1.6.4/lib/mongo/collection.rb:353:in `insert'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-mongo-0.6.11/lib/fluent/plugin/out_mongo.rb:152:in `operate_invalid_records'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-mongo-0.6.11/lib/fluent/plugin/out_mongo.rb:124:in `operate'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-mongo-0.6.11/lib/fluent/plugin/out_mongo.rb:112:in `write'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.30/lib/fluent/buffer.rb:279:in `write_chunk'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.30/lib/fluent/buffer.rb:263:in `pop'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.30/lib/fluent/output.rb:303:in `try_flush'
  2013-01-15 17:49:50 +0100: /usr/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.30/lib/fluent/output.rb:120:in `run'

It appears occasionally during inserts and makes fluentd crash when the exclude_broken_fields option is set. I couldn't find a way to dump the insert requests that contain invalid UTF-8.
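One way to locate the offending fields is to walk each record before it is inserted and report every string whose bytes are not valid UTF-8. The following is a minimal sketch using only core Ruby; `invalid_utf8_paths` is a hypothetical helper, not part of fluent-plugin-mongo, and the sample record is made up for illustration.

```ruby
# Hypothetical helper (not part of fluent-plugin-mongo): returns the key path
# of every String inside a record whose bytes are not valid UTF-8.
def invalid_utf8_paths(value, path = [])
  case value
  when String
    # Re-tag a copy as UTF-8 so the caller's string is never mutated.
    valid = value.dup.force_encoding(Encoding::UTF_8).valid_encoding?
    valid ? [] : [path]
  when Hash
    value.flat_map { |key, child| invalid_utf8_paths(child, path + [key]) }
  when Array
    value.each_with_index.flat_map { |child, i| invalid_utf8_paths(child, path + [i]) }
  else
    [] # numbers, nil, booleans, etc. cannot carry broken text
  end
end

# Made-up sample record with two broken strings.
record = {
  'host'    => 'web01',
  'message' => "caf\xE9".b,           # Latin-1 byte, not valid UTF-8
  'tags'    => ['ok', "\xFF\xFE".b]
}

bad = invalid_utf8_paths(record)
puts bad.inspect
```

Logging the returned paths together with the event's tag would show which fields need fixing upstream.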

repeatedly commented 11 years ago

Hmm... the shard key (exclude_broken_fields) seems to be broken. The simple solution is to ignore the document when the fields listed in exclude_broken_fields are themselves broken.

What do you think?
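The "ignore the document" idea could look roughly like the sketch below: records are split into those whose key fields are valid UTF-8 and those that are not, and only the first group is sent on. `utf8_ok?` and `partition_by_key_validity` are hypothetical names, and treating `user_id` as the shard key is an assumption made for the example.

```ruby
# Hypothetical helper: non-strings pass; strings must be valid UTF-8.
def utf8_ok?(value)
  !value.is_a?(String) || value.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Splits records into [insertable, skipped], checking only the fields named in
# key_fields (assumed here to be the shard key fields).
def partition_by_key_validity(records, key_fields)
  records.partition do |record|
    key_fields.all? { |field| utf8_ok?(record[field]) }
  end
end

records = [
  { 'user_id' => 'u1',         'message' => 'hello' },
  { 'user_id' => "u\xFF".b,    'message' => 'broken key' },
  { 'user_id' => 'u3',         'message' => "caf\xE9".b } # key is fine, body is not
]

insertable, skipped = partition_by_key_validity(records, ['user_id'])
puts "insertable: #{insertable.size}, skipped: #{skipped.size}"
```

The third record still has a broken `message`, so it would need the broken-data handling on top of this check. The skipped records are lost unless they are written somewhere else.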

repeatedly commented 11 years ago

Another solution is using but maybe this approach needs the handling improvement of fluentd itself.

Kaimlo commented 11 years ago

Hmm, which do you think is best? The goal is to not lose any insert requests to MongoDB.

repeatedly commented 11 years ago

fluent-plugin-mongo is designed on the assumption that an incoming event is valid as BSON. That is a MongoDB limitation, so the __broken_data approach is the last resort...
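To illustrate the __broken_data idea: the fields that must stay queryable (such as the shard key) are kept as normal fields, and the rest of the record is stored as an opaque byte string under a `__broken_data` key. The sketch below assumes `Marshal` is used for serialization and stores the bytes as a plain Ruby string; the plugin's actual implementation may differ, and `wrap_broken_record` and `unwrap_broken_record` are hypothetical names.

```ruby
BROKEN_DATA_KEY = '__broken_data'

# Keeps keep_fields as top-level fields and packs everything else into one
# opaque byte string. Assumption: Marshal is the serialization format.
def wrap_broken_record(record, keep_fields)
  rest = record.dup # leave the caller's record untouched
  wrapped = {}
  keep_fields.each do |field|
    wrapped[field] = rest.delete(field) if rest.key?(field)
  end
  wrapped[BROKEN_DATA_KEY] = Marshal.dump(rest)
  wrapped
end

# Reverses wrap_broken_record. Marshal.load must only see trusted data.
def unwrap_broken_record(wrapped)
  rest = Marshal.load(wrapped[BROKEN_DATA_KEY])
  wrapped.reject { |key, _| key == BROKEN_DATA_KEY }.merge(rest)
end

record   = { 'user_id' => 'u42', 'message' => "caf\xE9".b }
wrapped  = wrap_broken_record(record, ['user_id'])
restored = unwrap_broken_record(wrapped)
puts wrapped.keys.inspect
```

This also shows why the error in this issue occurs: the kept fields are still inserted as ordinary strings, so if one of them contains invalid UTF-8, serialization fails even though the rest of the record is wrapped.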

The best solution would be a secondary mongod or another output plugin, because if the shard key is broken, out_mongo can't send the request to mongos. The problem is that the current Fluentd doesn't support such a direct call.

Ignoring the document is a very simple solution. In fact, some users set ignore_invalid_document true to do exactly that. But it depends on the type of data...

Kaimlo commented 11 years ago

OK, so I will just use ignore_invalid_document true for now.