couchbase / sync_gateway

Manages access and synchronization between Couchbase Lite and Couchbase Server
https://www.couchbase.com/products/sync-gateway

SG Replicate stops when an attachment with no content type is replicated #2381

Closed ArihantRk closed 7 years ago

ArihantRk commented 7 years ago

Sync Gateway version

1.3.1

Operating system

RedHat 7.2

Config file

"replications":[
  {
    "replication_id":"ptxdata_C1_C2",
    "source": "ptxdata",
    "target": "http://prod-SYNCGWREP-V1.service.2.com:4985/ptxdata/",
    "continuous":true,
    "changes_feed_limit":1000,
    "filter":"sync_gateway/bychannel",
    "query_params":["*"]
  }

Error Log output

2017-03-13T18:20:33.469+05:30 Enabling logging: [Bucket CRUD CRUD+ HTTP HTTP+ Access Cache Shadow Shadow+ Changes Changes+ Events Events+]
_time=2017-03-13T18:20:33.495+05:30 _level=INFO _msg= Trying with selected node 1
_time=2017-03-13T18:20:33.495+05:30 _level=INFO _msg= Trying with http://192.168.193.136:8091/pools/default/bucketsStreaming/ptxdata
_time=2017-03-13T18:20:33.505+05:30 _level=INFO _msg=Got new configuration for bucket ptxdata
_time=2017-03-13T18:20:33.523+05:30 _level=INFO _msg= Trying with selected node 1
panic: runtime error: index out of range

goroutine 107 [running]:
github.com/couchbaselabs/sg-replicate.Replication.fetchBulkGet(0xc82019ae10, 0xd, 0xc82015de80, 0xc82019ae20, 0x7, 0xc82019b600, 0x1, 0x1, 0xc82015de00, 0xc82018acb6, ...)
	/home/couchbase/jenkins/workspace/sgw-unix-build/1.3.1/community/godeps/src/github.com/couchbaselabs/sg-replicate/synctube.go:472 +0x36a2
created by github.com/couchbaselabs/sg-replicate.stateFnActiveFetchRevDiffs
	/home/couchbase/jenkins/workspace/sgw-unix-build/1.3.1/community/godeps/src/github.com/couchbaselabs/sg-replicate/replication_state.go:159 +0x6f1

goroutine 1 [IO wait]:
net.runtime_pollWait(0x7fe5259a4680, 0x72, 0xc820098080)
	/usr/local/go/src/runtime/netpoll.go:157 +0x60
net.(*pollDesc).Wait(0xc8206e3b80, 0x72, 0x0, 0x0)
	/usr/local/go/src/net/fd_poll_runtime.go:73 +0x3a
net.(*pollDesc).WaitRead(0xc8206e3b80, 0x0, 0x0)
	/usr/local/go/src/net/fd_poll_runtime.go:78 +0x36
net.(*netFD).accept(0xc8206e3b20, 0x0, 0x7fe5259a5528, 0xc82028ede0)
	/usr/local/go/src/net/fd_unix.go:408 +0x27c
net.(*TCPListener).AcceptTCP(0xc820278598, 0x452c30, 0x0, 0x0)
	/usr/local/go/src/net/tcpsock_posix.go:254 +0x4d

Expected behavior

sg-replicate should restart

Actual behavior

It does not start; it fails with the log error above.

Steps to reproduce

The issue is intermittent. The system is currently still in this state. I would be happy to provide any further information needed.

adamcfraser commented 7 years ago

Based on the point of failure, this looks like an error when trying to handle an attachment that doesn't have a content type specified:

https://github.com/couchbaselabs/sg-replicate/blob/9313a67a9234d96f3073a979b22bea5b322b56d6/synctube.go#L472

adamcfraser commented 7 years ago

@tleyden I don't know why an attachment wouldn't have a content-type defined - we may need to investigate that as a separate issue - but we should improve the handling in sg-replicate to log a warning and skip the attachment instead of crashing.

tleyden commented 7 years ago

@adamcfraser yup, agreed.

Thanks @ArihantRk for the exemplary bug report!

ddash1 commented 7 years ago

Any updates on the blocker issue?

adamcfraser commented 7 years ago

@ddash1 As a workaround until this is fixed, I'd recommend reviewing your data to identify any attachment references that are missing a content-type, and doing a manual cleanup.
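
A rough sketch of how such a review might be automated for a single document body. It assumes the attachment metadata lives under `_attachments` with a `content_type` field per entry; `missingContentTypes` and the sample document are made up for illustration, and fetching the documents from the bucket is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// attachmentMeta models only the field this check needs.
type attachmentMeta struct {
	ContentType string `json:"content_type"`
}

// docWithAttachments assumes attachment metadata is stored under "_attachments".
type docWithAttachments struct {
	Attachments map[string]attachmentMeta `json:"_attachments"`
}

// missingContentTypes returns the sorted names of attachments whose
// content_type is absent or empty. (Hypothetical helper.)
func missingContentTypes(rawDoc []byte) ([]string, error) {
	var doc docWithAttachments
	if err := json.Unmarshal(rawDoc, &doc); err != nil {
		return nil, err
	}
	var missing []string
	for name, meta := range doc.Attachments {
		if meta.ContentType == "" {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	// Made-up sample document: blob.bin has no content_type.
	raw := []byte(`{"_id":"doc1","_attachments":{"photo.png":{"content_type":"image/png","length":12},"blob.bin":{"length":4}}}`)
	missing, err := missingContentTypes(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("attachments missing content_type:", missing)
}
```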

tleyden commented 7 years ago

@ajres

Add a unit test that simulates an attachment without a content-type defined (easy)

I think you should be able to duplicate this unit test and remove the content type, and possibly repro the issue.

(I'm assuming this attachment is coming over a multipart related attachment)

ajres commented 7 years ago

PR created in sg-replicate: https://github.com/couchbaselabs/sg-replicate/pull/50

tleyden commented 7 years ago

@ajres merged! Can you submit a 2nd PR against Sync Gateway to point to the updated sg-replicate commit?

Also, can you kick off these functional tests against that SG PR commit?

tleyden commented 7 years ago

@ajres n/m, I opened the PR here: https://github.com/couchbase/sync_gateway/pull/2450