logstash-plugins / logstash-input-couchdb_changes

This plugin captures the _changes stream from a CouchDB instance
Apache License 2.0

_id doesn't make it through to elasticSearch #12

Closed Grokling closed 9 years ago

Grokling commented 9 years ago

I have Logstash consuming CouchDB's _changes api, and outputting to ElasticSearch.

The _id field from the original CouchDB document is lost on its way into ElasticSearch. Underscore fields are ES meta fields. ES assigns a new _id, and the original is lost.

My use case is that I should have a set of results coming back from ES, then my client can subscribe (by _id) to the one they want to replicate (from CouchDB). Without the _id, this scenario falls flat.

There is a keep_revision flag https://www.elastic.co/guide/en/logstash/current/plugins-inputs-couchdb_changes.html#plugins-inputs-couchdb_changes-keep_revision

Could we have either a keep_id flag (similar to keep_revision), or a means to map an incoming '_id' to an outgoing 'id' property in Logstash in order to avoid the underscore prefix overlap and allow ES to retain the key detail of the document?

untergeek commented 9 years ago

The document id is not passed because you need to tell the elasticsearch plugin which field to pass. See: https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-document_id

The document id from couchdb is automatically put in the [@metadata][_id] field. Your elasticsearch output needs to include: document_id => %{[@metadata][_id]} (though perhaps the %{} wrapper is unnecessary).

Grokling commented 9 years ago

Thanks for the speedy reply - sadly I'm not winning yet. I've tried these all, but each is invalid.

    document_id => %{[@metadata][_id]}
    document_id => {[@metadata][_id]}
    document_id => [@metadata][_id]
    document_id => @metadata._id

I can get document_id => "@metadata._id" to work, but obviously that's not very useful!

The full config is:

    elasticsearch {
      host => "elasticsearch"
      port => "9200"
      protocol => "http"
      index => "hx"
      document_id => %{[@metadata][_id]}
    }

klingsor83 commented 9 years ago

I use document_id => "%{[@metadata][_id]}" and it works correctly. Maybe you only need to put the quotes...
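For reference, a sketch of the full output block with the quoted value. The host, port, protocol, and index settings are copied from the config posted earlier in this thread and belong to the Logstash version in use there; newer releases of the elasticsearch output use a hosts setting instead, so treat those lines as assumptions to adapt. The surrounding output { } wrapper is also assumed, since only the inner block was posted.

```
output {
  elasticsearch {
    # Connection settings as posted in this thread (assumed unchanged).
    host        => "elasticsearch"
    port        => "9200"
    protocol    => "http"
    index       => "hx"
    # Quoted, so the config parser reads it as a string and expands
    # the field reference for each event.
    document_id => "%{[@metadata][_id]}"
  }
}
```

Without the quotes, %{...} is not a value the config parser accepts, which would explain why the unquoted attempts above were rejected.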

Grokling commented 9 years ago

klingsor83 - that did it. Thanks very much.
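If the CouchDB id is also wanted as an ordinary field in the stored document (the second option raised in the original request), one possible approach is to copy it out of @metadata with a mutate filter. This is only a sketch: the field name couchdb_id is a hypothetical choice, not something the plugin defines.

```
filter {
  mutate {
    # "couchdb_id" is a made-up field name for illustration.
    # @metadata fields are not sent to outputs, so the value is
    # copied into a regular event field here.
    add_field => { "couchdb_id" => "%{[@metadata][_id]}" }
  }
}
```

A name without a leading underscore avoids the clash with the ES meta fields described at the top of the thread.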

untergeek commented 9 years ago

Sorry for omitting the "". I was trying to comment via my phone, and it's hard to get all of that punctuation in there and get a good overview of what you're writing.