logstash-plugins / logstash-input-couchdb_changes

This plugin captures the _changes stream from a CouchDB instance
Apache License 2.0

provide support for cloudant #26

Open snowch opened 8 years ago

snowch commented 8 years ago

Cloudant is based on the Apache-backed CouchDB project. Cloudant adds clustering to CouchDB and this functionality has been contributed back to Apache and will become part of CouchDB 2.0.

A few changes may be required to support Cloudant:

  1. Cloudant sequence identifiers can be any valid JSON type, not only integers: https://docs.cloudant.com/database.html#get-changes.
  2. It is essential that consumers of changes feeds treat the updates they receive idempotently.

To support item 1, it appears that initial_sequence will need to accept a Hash or Array (couchdb_changes.rb#L65). The sequence will probably also need to be parsed with JSON.parse() (couchdb_changes.rb#L163) so that it arrives as a Ruby Hash or Array.
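As a rough sketch of what item 1 implies, the snippet below reads the seq field from one line of a changes feed without assuming it is an integer, and round-trips it through a string so it could be stored between runs. The helper names (extract_seq, dump_seq, load_seq) and the sample feed line are hypothetical and are not taken from the plugin; wrapping the value in a one-element array is only a convenience so that scalars, arrays and hashes all serialize the same way.

```ruby
require "json"

# Hypothetical helper: pull the sequence out of one changes-feed line.
# JSON.parse returns whatever JSON type the server used (Integer,
# String, Array or Hash), so no integer conversion is attempted.
def extract_seq(line)
  JSON.parse(line)["seq"]
end

# Hypothetical helper: serialize a sequence of any JSON type to text.
# The value is wrapped in an array so that scalars are handled the
# same way as arrays and hashes.
def dump_seq(seq)
  JSON.generate([seq])
end

# Hypothetical helper: inverse of dump_seq.
def load_seq(text)
  JSON.parse(text).first
end

# Made-up sample line with an array-valued sequence.
line = '{"seq":[5,"g1AAAA"],"id":"doc1","changes":[{"rev":"1-abc"}]}'
seq = extract_seq(line)
restored = load_seq(dump_seq(seq))
```

With this shape, an integer sequence from an older CouchDB and an array or string sequence from a clustered server would go through the same code path.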

Item 2 appears to be supported already for actions other than 'delete', because the metadata causes the document in Elasticsearch to be updated if it is reprocessed (couchdb_changes.rb#208).

There may be an issue with reprocessing deletes. If the delete has already occurred in Elasticsearch, will the repeated delete throw an exception (couchdb_changes.rb#L204)?
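To make the idempotency requirement concrete, here is a minimal sketch using an in-memory Hash as a stand-in for the downstream index. SketchIndex is hypothetical and says nothing about how Elasticsearch itself responds to a repeated delete; it only shows the property a consumer needs, namely that applying the same change twice leaves the same state as applying it once. The "deleted" and "doc" keys follow the shape of CouchDB changes rows when include_docs is enabled.

```ruby
# Hypothetical in-memory stand-in for the downstream index.
class SketchIndex
  attr_reader :docs

  def initialize
    @docs = {}
  end

  # Apply one change row. Upserts overwrite by id, and Hash#delete
  # returns nil for a missing key instead of raising, so replaying
  # a row is harmless.
  def apply(change)
    id = change["id"]
    if change["deleted"]
      @docs.delete(id)
    else
      @docs[id] = change["doc"]
    end
    self
  end
end

# Made-up change rows: one upsert followed by one delete.
upsert = { "id" => "doc1", "doc" => { "title" => "hello" } }
delete = { "id" => "doc1", "deleted" => true }

once  = SketchIndex.new.apply(upsert).apply(delete)
twice = SketchIndex.new.apply(upsert).apply(upsert).apply(delete).apply(delete)
```

A real output would need the equivalent guarantee from the target store, for example by treating a "not found" response to a delete as success.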

rophy commented 7 years ago

I really need that sequence identifier support for Cloudant. Any updates on this?

kocolosk commented 7 years ago

For what it's worth, this is no longer just a Cloudant issue; CouchDB 2.x (over a year old now) also has this non-integer sequence format.