logstash-plugins / logstash-input-couchdb_changes

This plugin captures the _changes stream from a CouchDB instance
Apache License 2.0

Include document contents on delete action #34

Open naqerf opened 8 years ago

naqerf commented 8 years ago

Hello,

First of all, thanks for your great work on Logstash and this plugin. It works great!

I have a feature I'd like to request/discuss. The scenario is couchdb_changes as input and elasticsearch as output. There is more than one type of document in the CouchDB database, classified by a JSON property in the doc (something like {"type": "my_doc_type"} and {"type": "my_other_doc_type"}). If you need to store the documents in different ES index types, you can set document_type in the elasticsearch output. This works great for document updates.

The problem is when a document is deleted in CouchDB. Checking the plugin's source code, I can see that when data['doc']['_deleted'] is true, data['doc'] is not included in the event. As the document type is part of data['doc'], the type property is not present in the event fields, so the elasticsearch output can't target the right index type for the deletion.

In CouchDB there are two ways to delete a document: sending a DELETE request, or updating the document with a _deleted property set to true. In the first case the document contents are not included in the _changes stream, but in the second case they are. This is used for filtered replication, but I think it could also be used for targeting the ES index type.
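To make the difference between the two delete styles concrete, here is a minimal sketch of what the corresponding _changes rows might look like with include_docs=true. The IDs, revisions and sequence numbers are made up for illustration, and doc_type_of is a hypothetical helper, not something from the plugin.

```ruby
require 'json'

# Hypothetical _changes rows (include_docs=true); ids, revs and seqs are made up.

# 1. Deleted with an HTTP DELETE: the tombstone carries only _id, _rev and _deleted.
DELETED_VIA_HTTP = JSON.parse(<<~ROW)
  {"seq": 12, "id": "doc-a", "deleted": true,
   "doc": {"_id": "doc-a", "_rev": "3-abc", "_deleted": true}}
ROW

# 2. Deleted by updating the document with "_deleted": true: other fields survive.
DELETED_VIA_UPDATE = JSON.parse(<<~ROW)
  {"seq": 13, "id": "doc-b", "deleted": true,
   "doc": {"_id": "doc-b", "_rev": "4-def", "_deleted": true, "type": "my_doc_type"}}
ROW

# Hypothetical helper: returns the "type" property of a change row's doc,
# or nil when the row has no doc or the doc has no type.
def doc_type_of(row)
  doc = row['doc'] || {}
  doc['type']
end

puts doc_type_of(DELETED_VIA_HTTP).inspect    # nil
puts doc_type_of(DELETED_VIA_UPDATE).inspect  # "my_doc_type"
```

Only the second style leaves a type in the row, so that is the only case where keeping data['doc'] on delete would help.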

Would it be possible to add a configuration setting to the plugin for keeping the document on a delete action, so that when it is set to true, data['doc'] is included in the event? I can update the code and open a pull request if you think this feature is worth including in the plugin.
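A rough sketch of the behavior I have in mind is below. This is a simplified model, not the plugin's actual code: build_event_fields, the field layout and the option name keep_deleted_doc are all assumptions made up for illustration.

```ruby
# Simplified, hypothetical model of turning one _changes row into event fields.
# `keep_deleted_doc` is an assumed option name; it defaults to false so the
# described current behavior (no doc on delete) stays the default.
def build_event_fields(row, keep_deleted_doc: false)
  doc = row['doc'] || {}
  deleted = doc['_deleted'] == true

  fields = { 'id' => row['id'], 'action' => deleted ? 'delete' : 'update' }
  # Include the document for updates, and for deletes only when asked to.
  fields['doc'] = doc if !deleted || keep_deleted_doc
  fields
end

# Made-up row for a document deleted by updating it with "_deleted": true.
SAMPLE_ROW = {
  'id'  => 'doc-b',
  'doc' => { '_id' => 'doc-b', '_deleted' => true, 'type' => 'my_doc_type' }
}.freeze

puts build_event_fields(SAMPLE_ROW).key?('doc')                          # false
puts build_event_fields(SAMPLE_ROW, keep_deleted_doc: true)['doc']['type'] # my_doc_type
```

With the doc kept, the elasticsearch output could presumably pick the type from the event with a field reference such as %{[doc][type]}, the same way it does for updates.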

Thanks!