couchbase / sync_gateway

Manages access and synchronization between Couchbase Lite and Couchbase Server
https://www.couchbase.com/products/sync-gateway
Other
447 stars 138 forks source link

Implement all required methods in GoCBBucket and remove GoCB Hybrid bucket #2423

Closed tleyden closed 7 years ago

tleyden commented 7 years ago

The GoCB hybrid bucket was basically a crutch to scope limit the transition.

However, certain things that require type assertions get complicated, and it's probably better to drop the gocb hybrid bucket.

For example, when dealing with error handling, you might not know which bucket was actually used for the call to Couchbase Server, and so trying to figure out which kind of error it is gets complicated.

From last jenkins runs, looks like there is only one failing functional test at this point.

Failing functional tests

Base CC

Topology specific CC

Base DI

NOTE: test_continuous_changes_sanity[sync_gateway_default_functional_tests-10-10 failed in cen7-sync-gateway-functional-tests-base-di/496 but passed in a later rerun. More reruns in progress: rerun2 rerun3

Topology specific DI

tleyden commented 7 years ago

The functional tests are passing, looking into this test failure locally:

pytest -s -p no:sugar --mode=di --server-version=4.6.1 --sync-gateway-version=5e5fd9c835494586cf9d273138c1791bba843fb5 -k test_server_goes_down_rebuild_channels testsuites/syncgateway/functional/topology_specific_tests/

Mobile-testkit logs: https://gist.github.com/tleyden/31a233acf78ad60240058d6ffd09d920

tleyden commented 7 years ago

From last jenkins runs, looks like there is only one failing functional test at this point.

Failing functional tests

Base CC

Topology specific CC

Base DI

Latest run - 496

Topology specific DI

tleyden commented 7 years ago

PR feedback changes

On cc79e1c500c67b79779316b6828f44d61845d177, running this functional test locally:

pytest -s -p no:sugar --mode=di --server-version=5.0.0-2827 
--sync-gateway-version=cc79e1c500c67b79779316b6828f44d61845d177 
-k test_db_offline_tap_loss_sanity[bucket_online_offline/bucket_online_offline_default_dcp-100] 
testsuites/syncgateway/functional/tests/

Fails w/ these logs:

https://gist.github.com/tleyden/3072f68c1ec35ded82ad47719dd4c5b7

tleyden commented 7 years ago

The problem seems to be that after the https://github.com/couchbase/sync_gateway/commit/e796b55f7924531ce0de7355a69c336e53e3a349 change, the offline handling is not working in the above test scenario. It worked before https://github.com/couchbase/sync_gateway/commit/e796b55f7924531ce0de7355a69c336e53e3a349 because that was (erroneously) using the go-couchbase driver rather than the gocb driver when SG was acting as a channel index reader (feedtype=DCPSHARD).

After fixing that in https://github.com/couchbase/sync_gateway/commit/e796b55f7924531ce0de7355a69c336e53e3a349 there isn't any mechanism in place for Sync Gateway to look for the bucket going offline when using gocb in channel index reader mode.

There seems to be two main places where the bucket is detected to be offline:

  1. Starting the mutation feed in channel cache mode
  2. Kicking off BucketUpdater loop when creating CouchbaseBuckets via go-couchbase driver

The first one works for both gocb and go-couchbase, but only in channel cache mode

The second one only works for go-couchbase, and so it looks like we'll need to add similar functionality for the cases when using the gocb driver to connect to the couchbase bucket.

adamcfraser commented 7 years ago

I don't think this is necessarily a problem. The motivation for #1 was to allow Sync Gateways running in channel cache mode to detect when they'd lost the mutation feed, and go offline (since otherwise the feed was failing silently, and didn't result in any errors). This was particularly important because the mutation feed didn't reliably restart when the bucket became available again (unlike KV operations).

When running in channel index mode, I don't think there's the same requirement. If the bucket connection is lost, individual requests will fail with error messages, and should recover on their own when the bucket connection is reestablished.

adamcfraser commented 7 years ago

And just to restate - channel index readers shouldn't imply feedtype=DCPSHARD. It's more like feedtype=NONE, if anything.

tleyden commented 7 years ago

And just to restate - channel index readers shouldn't imply feedtype=DCPSHARD. It's more like feedtype=NONE, if anything.

Yes, this threw me off. Here is the config file that mobile test-kit uses for the sync gateway node:

https://github.com/couchbaselabs/mobile-testkit/blob/master/resources/sync_gateway_configs/bucket_online_offline/bucket_online_offline_default_dcp_di.json

    "databases":{
        "db":{
            "feed_type":"DCPSHARD",
            "channel_index":{
                "writer":false
            }
        }
    }
adamcfraser commented 7 years ago

That's a testkit issue. feed_type=DCPSHARD never needs to be specified in the config. Accel nodes are the only ones that use a sharded feed type, and they use that feed type by default (it doesn't need to be specified in the config).