tleyden closed this issue 8 years ago
Here is a test failure capture:
https://gist.github.com/tleyden/07b586056db42746c528d5c6c78d8d75
I'm baffled as to why the longpoll changes request doesn't return a doc.
I noticed that the longpoll changes feed was started at the same time as the doc was added.
Could this be hitting a subtle race condition?
I should mention that the test only fails sporadically, which might be indicative of a race condition.
As an attempted workaround, I added a 5-second sleep before pushing the document:
# Kick off continuous replication
sg1.start_push_replication(
    sg2.admin.admin_url,
    DB1,
    DB2,
    continuous=True,
    use_remote_source=True,
    use_admin_url=True
)
# Sleep for a while -- attempt to workaround https://github.com/couchbase/sync_gateway/issues/1763#issuecomment-219901067
time.sleep(5)
# Add docs
doc_id = sg1_user.add_doc()
logging.debug("Added doc {} to sg1, waiting til it syncs on sg2".format(doc_id))
# Wait til all docs sync to target
wait_until_docs_sync(sg2_user, [doc_id])
logging.debug("doc {} sync'd to sg2".format(doc_id))
Here is the capture from a successful run:
https://gist.github.com/tleyden/78b3c24b39f9ea6067248a429b28b930
The workaround above appears to avoid the problem reliably: I have run the test in a loop 20+ times without a single failure.
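A fixed sleep adds 5 seconds to every run and only covers up the timing window. One possible alternative, sketched here under assumptions, is to poll a readiness condition with a timeout. `wait_until` and `replication_ready` are hypothetical names and are not part of the test framework or Sync Gateway. This thread does not establish which condition would show that the replication's changes feed is ready, so the stand-in below just flips after a few checks.

```python
import time


def wait_until(condition, timeout=30.0, interval=0.5):
    """Poll condition() until it returns True or `timeout` seconds elapse.

    Hypothetical helper: returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for "the replication is ready" (an assumption for illustration):
# the condition becomes true on the third check.
checks = {"count": 0}


def replication_ready():
    checks["count"] += 1
    return checks["count"] >= 3


ready = wait_until(replication_ready, timeout=2.0, interval=0.01)
```

In the test, a call like this would replace `time.sleep(5)`, with a real readiness check passed in place of `replication_ready`.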
Also, I was able to reproduce this issue against the master branch of Sync Gateway:
http://uberjenkins.sc.couchbase.com:8080/view/QE%20Dev/job/syncgateway-functional-tests-dev/49/
meaning that it was not caused by the changes for this issue, and should not block the PR.
Fixed by #1781
In this build there was a data race detected:
Also, I enabled -race on the sg-replicate repo and it found a different data race: