couchbase / sync_gateway

Manages access and synchronization between Couchbase Lite and Couchbase Server
https://www.couchbase.com/products/sync-gateway

sync_gateway failing to reconnect when server goes down in multinode cluster #2153

Closed by sethrosetter 7 years ago

sethrosetter commented 8 years ago

Sync Gateway version

Couchbase Sync Gateway/1.4.0(16;3cdcd31)

Operating system

CentOS7

Config file

standard

Log output

https://gist.github.com/sethrosetter/9ee1f967b9a7a7499f2a564aba836dcb

Expected behavior

Sync Gateway should continue to operate normally when a node in a multinode Couchbase Server cluster fails.

Actual behavior

There are 2 nodes in the Couchbase Server cluster. One node is stopped via service couchbase-server stop, which causes the node to appear as failed in the Couchbase Server admin interface. After that, operations against sync_gateway fail. It appears that go-couchbase fails to adapt to the new topology.

Steps to reproduce

  1. Set up a 2-node Couchbase Server cluster
  2. Create a bucket
  3. Perform a few PUTs to Sync Gateway
  4. Run service couchbase-server stop on the non-primary node
  5. Try more PUTs to Sync Gateway
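
The PUTs in steps 3 and 5 can be scripted. The following is a minimal sketch, not the script used for this report: it assumes Sync Gateway's public REST API is reachable at http://localhost:4984 and that the database is named db (both hypothetical here), and the helper names build_put_requests and send_all are made up for illustration.

```python
import json
import urllib.request


def build_put_requests(base_url, db, count, prefix="doc"):
    """Build (but do not send) PUT requests for `count` small JSON documents."""
    requests = []
    for i in range(count):
        # Sync Gateway creates or updates a document with PUT /{db}/{doc_id}.
        url = f"{base_url}/{db}/{prefix}-{i}"
        body = json.dumps({"seq": i}).encode("utf-8")
        req = urllib.request.Request(
            url,
            data=body,
            method="PUT",
            headers={"Content-Type": "application/json"},
        )
        requests.append(req)
    return requests


def send_all(requests, timeout=5):
    """Send each request; return the HTTP status code or the error text."""
    results = []
    for req in requests:
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                results.append(resp.status)
        except OSError as exc:  # URLError and HTTPError derive from OSError
            results.append(str(exc))
    return results


if __name__ == "__main__":
    # Assumed address and database name; adjust to the local setup.
    batch = build_put_requests("http://localhost:4984", "db", 5)
    print(send_all(batch))
```

Running it once before stopping the node and once after (with a different prefix) makes it easy to compare the results of step 3 and step 5.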

sethrosetter commented 7 years ago

I believe this is due to auto-failover not being enabled in Couchbase Server. After I enabled this option, sync_gateway was able to recover.
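
For anyone who wants to script this setting rather than use the admin interface, here is a minimal sketch. It assumes the Couchbase Server REST endpoint /settings/autoFailover on port 8091 with the enabled and timeout parameters; the host, credentials, 30-second timeout, and the helper name build_autofailover_request are placeholders, not values from this setup.

```python
import base64
import urllib.parse
import urllib.request


def build_autofailover_request(host, user, password, timeout_s=30):
    """Build (but do not send) a POST that enables auto-failover."""
    url = f"http://{host}:8091/settings/autoFailover"
    # Form-encoded body: enabled=true&timeout=<seconds>
    data = urllib.parse.urlencode(
        {"enabled": "true", "timeout": timeout_s}
    ).encode("ascii")
    # HTTP Basic auth header built from the administrator credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(url, data=data, method="POST")
    req.add_header("Authorization", f"Basic {token}")
    return req


if __name__ == "__main__":
    # Placeholder host and credentials; replace before running.
    req = build_autofailover_request("localhost", "Administrator", "password")
    with urllib.request.urlopen(req, timeout=5) as resp:
        print(resp.status)
```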

sethrosetter commented 7 years ago

Clarification: document adds succeed after failover; however, the changes feed is not returning all results.

sethrosetter commented 7 years ago

The original issue in this ticket was fixed by enabling auto-failover. The changes feed issue is now being tracked here: https://github.com/couchbase/sync_gateway/issues/2197