couchbase / couchbase-lite-ios

Lightweight, embedded, syncable NoSQL database engine for iOS and MacOS apps.

Sync problem with Cloudant databases - puller or change tracker? #390

Closed lenkawell closed 9 years ago

lenkawell commented 10 years ago

We're having a problem with some user databases on Cloudant. The number of updates to process increases, seemingly without end. For small databases without much lifetime activity (e.g., a few dozen documents), the progress count can reach 10,000s or 100,000s.

We've had one report of this problem with the TouchDB version of our app too.

I'm wondering if the change tracker is getting an end of input stream and continually restarting, which doesn't appear to be how it is designed to work.

Any ideas on what might be happening?

Here's a partial log from one of our user's iPhones: https://gist.github.com/lenkawell/4ce84c74eb28697dfc9f

lenkawell commented 10 years ago

Any thoughts on how to debug or fix this problem? We have some of our beta test users running down their batteries due to this.

snej commented 10 years ago

In that log, the ?since= parameter in the changes feed request is never changing. That's bad; it'll create an infinite loop. The client is definitely receiving changes from the feed, though, since the denominator of the progress keeps increasing.

snej commented 10 years ago

I can't see how this could happen unless something's wrong with the feed sent by the server. The log shows there's a single CBLSocketChangeTracker instance running the whole time. The tracker updates its lastSequenceID every time it sends a change entry to the replicator (CBLChangeTracker.m:251), so it can't be stuck on the same sequence ID over and over.

The only explanation I can see is that the incoming feed has "272-g1AAAA..." as the "seq" property of the last item in the feed. But that's illegal since the request is asking for everything since that sequence.

You can add a line at CBLChangeTracker.m:249 to log the sequence ID:

LogTo(ChangeTracker, @"%@: Read sequence '%@' for doc %@", self, sequence, docID);

Then we can see the sequence IDs that are in the feed and confirm whether they're invalid or not.

lenkawell commented 10 years ago

I'll add the logging and post the results. Thanks!

lenkawell commented 10 years ago

Here's a log with the additional logging:

https://gist.github.com/lenkawell/b0eff15c1645d6dcc27d

snej commented 10 years ago

Yikes — in the future can you create a separate gist (or pastie or whatever) for huge logs, and just post a link here?

But yes, this pretty clearly shows the problem's on Cloudant's end. The client asks for changes since sequence 272-g1AAAA... and the server sends back a few dozen changes ending with 272-g1AAAA. That's a recipe for an infinite loop, since the client is going to take that last sequence and send it as the since parameter to the next request.

I think you'll need to show this log to Cloudant's tech support. (It's fine if they want to contact me directly about it.)

lenkawell commented 10 years ago

Thanks for looking into it. I've forwarded the log and info to Cloudant support. And thanks for offering to let them contact you. I'll let you know what I hear back.

(And sorry for the big log file - I'll use a gist in the future.)

lenkawell commented 10 years ago

From: "Cloudant Support" support@cloudant.com
Date: July 29, 2014 at 7:50:33 AM PDT
To: "Len Kawell" len.kawell@timeboxapp.com
Subject: Re: (Case 32781) _changes feed problem

Hi Len --

The _changes feed guarantees that any changes since seq_number will be included in the response. It does not guarantee that changes before seq_number will NOT be included. The last_seq value will stabilize when no changes have been made to the database.

The important question here is whether CBL is noticing that the _changes feed has ended. From our look at the code base, it should be detecting this, but clearly is not. I would recommend raising this to CBL.

Finally, I did want to note the importance of comparing seq_number strings in their entirety. I say this because you mentioned identical values beginning with '272-g1AAAA' in your initial email. It is certainly possible to have seq_numbers that differ only in their final few characters; the number at the beginning is largely irrelevant.

Let me know if you have any questions.

Thank you,

snej commented 10 years ago

> The _changes feed guarantees that any changes since seq_number will be included in the response. It does not guarantee that changes before seq_number will NOT be included.

Yes, I know that. The problem here is that the last sequence sent is one that comes before the since parameter.

> The important question here is whether CBL is noticing that the _changes feed has ended. From our look at the code base, it should be detecting this, but clearly is not.

It does detect the end or it wouldn't be sending another request. (And detecting the end shouldn't be important anyway; the replicator needs to work with continuous changes feeds too, which have no end.)

mikerhodes commented 10 years ago

> The _changes feed guarantees that any changes since seq_number will be included in the response. It does not guarantee that changes before seq_number will NOT be included.

> Yes, I know that. The problem here is that the last sequence sent is one that comes before the since parameter.

From what I could see the changes feeds were not sending last_seq values that were before since values. It was that CBL was seeing duplicate entries in the changes feed that the CBL database had already incorporated. Or did I misread the logs?

> The important question here is whether CBL is noticing that the _changes feed has ended. From our look at the code base, it should be detecting this, but clearly is not.

> It does detect the end or it wouldn't be sending another request. (And detecting the end shouldn't be important anyway; the replicator needs to work with continuous changes feeds too, which have no end.)

Here I meant that the results from the sequence of _changes feed requests were the final ones required, so further requests were not needed; not that a single request had finished. I should've been clearer.

I couldn't spot the exact code which does this job in CBL -- could you point me to it?

snej commented 10 years ago

Hi, Mike. Thanks for jumping in.

I think you are misreading the logs. I know that "before" and "after" are blurry concepts in a clustered system like Cloudant, but the feed does look wrong to me because (a) the numeric prefixes (which I'm guessing correspond roughly to sequence numbers) are all lower than in the since= parameter, and (b) the final sequence in the response is identical to the since= parameter.

Below I've taken part of the log, from after it enters the steady state of the loop, and shown the request and all the returned sequences, with the long base64 parts truncated. Hopefully this makes it more clear.

Started... https://app.cloudant.com/personal-database-name/_changes?feed=longpoll&heartbeat=300000&style=all_docs&since=272-g1AAAAFFeJyVzks...
Read sequence '246-g1AAAAFFeJyVj9EKgkAQ...
Read sequence '249-g1AAAAFFeJyVj0EKwjAQ...
Read sequence '209-g1AAAAFoeJyVjtEKgjAU...
Read sequence '210-g1AAAAFoeJyVjkEKgzAQ...
Read sequence '230-g1AAAAFoeJyVjtEKgjAU...
Read sequence '231-g1AAAAFoeJyVjtEKgjAU...
Read sequence '232-g1AAAAFoeJyVjtEKgjAU...
Read sequence '233-g1AAAAFoeJyVjtEKgjAU...
Read sequence '234-g1AAAAFoeJyVjk0KwjAQ...
Read sequence '235-g1AAAAFoeJyVjtEKgjAU...
Read sequence '236-g1AAAAFoeJyVjtEKgjAU...
Read sequence '237-g1AAAAFoeJyVjtEKgjAU...
Read sequence '238-g1AAAAFoeJyVjtEKgjAU...
Read sequence '239-g1AAAAFoeJyVjlsKgzAQ...
Read sequence '240-g1AAAAFoeJyVjkEKgzAQ...
Read sequence '241-g1AAAAFoeJyVjtEKgjAU...
Read sequence '242-g1AAAAFoeJyVjkEKgzAQ...
Read sequence '243-g1AAAAFoeJyVjkEKgzAQ...
Read sequence '244-g1AAAAFoeJyVjkEKgzAQ...
Read sequence '245-g1AAAAFoeJyVjtEKgjAU...
Read sequence '246-g1AAAAFoeJyVjtEKgjAU...
Read sequence '247-g1AAAAFoeJyVjtEKgjAU...
Read sequence '248-g1AAAAFoeJyVjtEKgjAU...
Read sequence '249-g1AAAAFoeJyVjk0KwjAQ...
Read sequence '250-g1AAAAFoeJzLYWBgYMlg...
Read sequence '251-g1AAAAFoeJyVjkEKwjAQ...
Read sequence '252-g1AAAAFoeJyVjkEKwjAQ...
Read sequence '253-g1AAAAFoeJyVjkEKwjAQ...
Read sequence '254-g1AAAAFoeJyVjtEKgjAU...
Read sequence '255-g1AAAAFoeJyVjtEKgjAU...
Read sequence '256-g1AAAAFoeJyVjkEKwjAQ...
Read sequence '257-g1AAAAFoeJyVjtEKgjAU...
Read sequence '258-g1AAAAFoeJyVjtEKgjAU...
Read sequence '259-g1AAAAFoeJyVjtEKgjAU...
Read sequence '262-g1AAAAFoeJyVjtEKgjAU...
Read sequence '263-g1AAAAFoeJyVjtEKgjAU...
Read sequence '264-g1AAAAFoeJyVjtEKgjAU...
Read sequence '265-g1AAAAFoeJyVjtEKgjAU...
Read sequence '267-g1AAAAFoeJyVjlsKwjAQ...
Read sequence '268-g1AAAAFoeJyVjtEKgjAU...
Read sequence '269-g1AAAAFFeJzLYWBgYMlg...
Read sequence '270-g1AAAAFFeJzLYWBgYMlg...
Read sequence '271-g1AAAAFFeJzLYWBgYMlg...
Read sequence '272-g1AAAAFFeJzLYWBgYMlg...
EndEncountered <__NSCFInputStream: 0x14d51d20>

> Here I meant that the results from the sequence of _changes feed requests were the final ones required, so further requests were not needed

No, this is a continuous replication so it will always send another request, with feed=longpoll, to get notified of future changes.

willholley commented 10 years ago

Hi Jens,

In clustered CouchDB / Cloudant, you are only likely to get an empty set in response to _changes when the source database is empty. When the database is in a steady state (i.e. there are no new writes), you are still going to get some trailing entries in changes.

I think what Mike was trying to express is that the log looks correct to us - given a database with sequence numbers 1,2,3,4,5,6,7,8,9, a series of _changes calls might look like:

_changes => [1,2,3,4,5,6,7,8,9]
_changes?since=9 => [6,7,8,9]
_changes?since=9 => [6,7,8,9]
....

after some writes are performed:

_changes?since=9 => [6,7,8,9,10,11]
_changes?since=11 => [9,10,11]
_changes?since=11 => [9,10,11]
....

We'd expect CBL to treat the duplicate results as no-ops given that it has already seen them.

Also, for clarity, the numerical prefix to the sequence number is essentially meaningless in Cloudant - it's not unexpected to see this unordered. The encoded string afterwards is the part that matters.

Cloudant (and CouchDB post-BigCouch merge, presumably) adds a "pending" field to the _changes response which tells you how many changes follow the response so you can know that you've reached the end of the feed.
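
For illustration, the tail of a non-continuous _changes response carrying that field might look like the following (shape assumed here, values invented):

{
  "results": [
    { "seq": "272-g1AAAA...", "id": "some-doc-id", "changes": [ { "rev": "2-abc123" } ] }
  ],
  "last_seq": "272-g1AAAA...",
  "pending": 0
}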

snej commented 10 years ago

> In clustered CouchDB / Cloudant, you are only likely to get an empty set in response to _changes when the source database is empty.

That would completely break the longpoll mode (which is what we're using). The whole point of longpoll is that the HTTP connection stays open but idle until new changes arrive.

Are you saying that Cloudant has deliberately broken compatibility with longpoll, and clients can only use continuous mode? We've had to abandon continuous because it has problems with various HTTP client libraries and telco gateways.

willholley commented 10 years ago

Sorry - ignore the above. If you pass a correct "since" parameter you should not generally see rewinds except in rare circumstances. There is a bug in Cloudant at the moment where last_seq might not match the last sequence number returned and this is likely causing the looping behaviour.

@lenkawell we'll keep you posted via your Cloudant support ticket. @snej apologies for the wild goose chase - it sounds like this issue can be closed.

snej commented 10 years ago

Whew! :relieved: Sounds like the Cloudant bug will be fixed soon, so I won't have to start thinking about a workaround?

snej commented 10 years ago

I'll leave this issue open until Len can confirm that the problem's gone away.

lenkawell commented 10 years ago

Jens, I'll put another comment here when I hear from Cloudant that it's fixed. Thanks very much for your work looking into the issue.

lenkawell commented 10 years ago

Here is the latest from Will Holley at Cloudant:

Our database team have identified the problem but we're still waiting on an ETA for the fix.

The problem appears to be that the value returned in last_seq does not represent a valid value that can be used as a "since" parameter in a subsequent request to _changes. A fix on the client would mean ignoring any last_seq value returned from the server and using the last "seq" value in the set of changes returned instead.

At a glance, I can't actually see where CBL uses last_seq but it might be worth going back to Jens to see if there is a quick workaround / fix.

lenkawell commented 10 years ago

After reading over the code in CBLChangeTracker.m it appears to me that CBL doesn't even use the last_seq value. CBLChangeTracker appears to only iterate through the returned results array and save the last sequence number that it processed. Is my understanding correct? If so, then I think that Will and Cloudant's assessment of the problem may be incorrect.

rnewson commented 10 years ago

Hi @jens and @lenkawell,

The clustered changes feed of Cloudant (and couchdb >= 2.0) has slightly different properties than couchdb < 2.0. Specifically, it can include changes from before the since parameter. This is the case if any of the original shards used in the feed are no longer available (since the order of updates within shard replicas is not identical). A substitution to a replica is then made, which may cause us to reset the since value of that specific shard to zero (though, generally, to the last checkpoint from the internal synchronising process). This behavior is essential and not a bug.

/dbname/_changes?since=FOO is guaranteed to include all changes since FOO but may include already seen changes from before. This can cause clients (and the replicator is the primary one) to potentially perform redundant work (checking to see if the target contains the change, only to find that it does).

Not using the "last_seq" value is fine, I believe. A continuous replication process generally never sees a last_seq anyway (it can but only in the event of a timeout) and those do not cause infinite loops at Cloudant (and we have quite a few of those going and no internal reports).

I'm curious as to the problem here and am certainly willing to help track it down. I do find the enormous inline logs in this ticket a bit of a pain, is there a chance they could be moved out so we can scan the non-log content more easily?

lenkawell commented 10 years ago

Robert, Thank you for offering to help track down the problem. I've moved the big logs to gists (as Jens suggested) - sorry about that. Please let me know what else I can do to help figure this out.

willholley commented 10 years ago

I'm assuming the last_seq problem that Cloudant currently exhibits is a red herring. As I understand it, the issue is that CBL does not recover correctly if its locally stored sequence number for a replication (passed as the "since" parameter to _changes) becomes invalid e.g. due to shard reconfiguration of the remote database.

From my reading of the code, CBL will only update its internal state if a call to _changes results in new documents / revisions to process. If no changes have occurred in the remote database since the last sync AND the "since" parameter is not recognised by the remote (i.e. all changes are returned but CBL doesn't need to process any of them), no progress is made and we get into the infinite loop @lenkawell describes.

The CBL fix would be to always update the last known seq following a call to _changes, even if no new changes are processed by the client.

lenkawell commented 10 years ago

If, as Robert says, a replicator can potentially see changes from prior to the since= value, then how should the replicator record its own most recent change to include in the next since= value?

In other words, if the last sequence number received in the changes feed is not necessarily the most recent one in the database, then does the replicator need to compare each sequence number received to its own saved most recent sequence number to determine if it's the latest? If so, is a collated string compare correct (especially now that CouchDB is using complex JSON sequence "numbers")?

willholley commented 10 years ago

The sequence numbers should be treated as opaque values - the replicator should just store the sequence number associated with the last entry returned in the _changes result.

rnewson commented 10 years ago

Let's first be clear on terms. The since value is not a number (and, as Will has noted, the number on the front is meaningless; it exists solely to trick legacy replicator code into checkpointing correctly).

Hopefully an example will clear that up:

A seq value of "1-g1AAAABLeJzLYWBgYMxgTmHgz8tPSTV2MDQy1zMAQsMcoARTIkMeC8N_IMjKYE5kzAUKsRuZJJqZJlpANRgiNGQBAKI1EnY" is, after discarding the useless "1-" prefix, an encoding of this input value:

[{'node3@127.0.0.1',[0,4294967295],{2,<<"24a65a8">>}}]

This particular since value is from a database with only one shard (q=1) and one replica (n=1). Others, including yours, have more (q is typically 4 and n is invariably 3).

This is why a text comparison of two since values is not guaranteed (or even likely) to determine true equivalency (or order) and hopefully explains why your second paragraph's ("In other words..") talk of comparing and "most recent" is not right.

With that said, the "seq" value is intended to be opaque. You don't need to know what it means, and you definitely can't compare them with each other, nor do you need to. For couchdb 2.0 we are considering a readable (but equally treat-me-opaquely) format, to reduce some of the mystery around this.

What the replicator does is what CBL says it does, and I've no reason to doubt @jens on that. Periodically, it stores the "seq" value in a checkpoint (a _local document, which does not appear in the changes feed). If the replication process crashes, this value is used as the since= parameter when it resumes from the checkpoint. Note that the replicator process does not read the "seq" value from the checkpoint document at any point except when it starts; it's part of the replicator process's state instead. This might go some way to explaining why it doesn't see 'loops'. For non-continuous replication, the replication process eventually ends (it only catches up to the latest sequence as of the start of the _changes call, nothing after) and so also does not loop.
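
(For concreteness, a checkpoint like the one described above is just a small _local document holding the last "seq" value. The request shape, document name, and field name below are assumptions for illustration, not taken from this replication:)

PUT /dbname/_local/<checkpoint-id>
{ "lastSequence": "272-g1AAAA..." }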

rnewson commented 10 years ago

for continuous replication, the _changes request is open to the server throughout, and so you only see new updates (internally it's a pub/sub deal), so it's also not going to see loops.

lenkawell commented 10 years ago

Robert, thanks for the info on how it works on the database side - very helpful (especially to me).

It appears that in this case on the client side CBL is using longpoll mode. CBL issues a GET _changes with since=272..., gets results back and EOF. The last result it received is seq=272... CBL then issues a new GET with the since=272.... The GET again completes with changes within a few msec, EOF. The last result it received is (again) seq=272... CBL then issues a new GET with the since=272.... From the CBL log this seems to be the loop.

https://gist.githubusercontent.com/lenkawell/b0eff15c1645d6dcc27d/raw/bfda731b02fbb6026c48719b59347ca8de32250d/CBLiOS390-2

rnewson commented 10 years ago

ah, interesting. longpoll would explain it. That would get you up to date then wait for the next change or the timeout and then exit. Assuming CBL would then re-issue the _changes?since= call on the end of the response, I could see endlessly replaying a bit of the 'catch up' phase that feed=continuous avoids.

willholley commented 10 years ago

I think what's happening here is that CBL is making a request e.g.

/_changes?since=272-g1AAAAFF....slmcA

the result is a full set of changes (none of which are new to the client), ending in

{
  "seq": "272-g1AAAAFFeJzLYWBgYMlgTmFQTElKzi9KdUhJMjTRy00tyixJTE_VS87JL01JzCvRy0styQEqZUpkSLL___9_VgZTolUuUIDd1NIiMdXUCM0IQzxGJDkAyaR6qCluYFPMLcyTk00NQKYowE2xwGNIHguQZGgAUkBz9oMMSgQbZJmUmJRibE6CcyAmHYCYBHQSc6I-2CQTAyNLc4NUZAfhCJgsADrjZnA",
  "id": "2013-07-24T00:48:07Z-BE5D3CE3-377D-49AC-AA7B-DBFEF060AA73",
  "changes": [
    {
      "rev": "2-11afbf86d2118b371de84391f4c81483"
    }
  ]
}

Instead of making another _changes request with since=272-g1A....ADrjZnA (which would return an empty set), CBL continues to use the original "since" value 272-g1AAAAFF....slmcA as the last_seq, hence the looping situation. This is because no new changes were returned in the feed so the local last_seq is not updated.

rnewson commented 10 years ago

oh, if true then that's totally a CBL-only bug.

lenkawell commented 10 years ago

Will, I don't think your description is quite right. The CBL replicator is simply saving the last seq number it receives in the _changes response results array. Then from what I can tell, the CBL replicator uses that saved seq value for the since= value in the next GET _changes.

Upon further examination of complete sequence numbers in the log, it looks like the "loop" is across groups of two GETs to _changes. The last seq returned in the first GET is "272-...lmcA". CBL saves it and uses that for the since= in the second GET.

The last seq returned in the second GET is "272-...hmbw" (both begin with 272, but the encoded value part is different). CBL saves this seq and uses it for the since= in the third GET.

However, last seq returned in the third GET is the SAME as the last one returned in the first GET: "272-...lmcA". CBL saves it and uses this value for the since= in the fourth GET. And it's Groundhog Day again... unless I'm misunderstanding what's happening.

I cleaned up the log so it's a bit easier to see just the returned seq values and doc IDs: https://gist.github.com/lenkawell/11791b680cdc92f47d3c

Is there something CBL should be doing here to prevent this cycle? Or is Cloudant in some kind of loop?

rnewson commented 10 years ago

To assist:

"272-...hmbw" decodes to:

[{'dbcore@db14.meritage.cloudant.net',[0,1073741823],
                                  {58,<<"598ae52">>,'dbcore@db3.meritage.cloudant.net'}},
 {'dbcore@db11.meritage.cloudant.net',[1073741824,2147483647],
                                  {70,<<"787cc50">>}},
 {'dbcore@db8.meritage.cloudant.net',[2147483648,3221225471],
                                 {97,<<"9babd37">>}},
 {'dbcore@db11.meritage.cloudant.net',[3221225472,4294967295],
                                  {47,<<"402970e">>}}]

and "272-...lmcA" decodes to:

[{'dbcore@db14.meritage.cloudant.net',[0,1073741823],
                                  {58,<<"598ae52">>}},
 {'dbcore@db11.meritage.cloudant.net',[1073741824,2147483647],
                                  {70,<<"787cc50">>,'dbcore@db4.meritage.cloudant.net'}},
 {'dbcore@db8.meritage.cloudant.net',[2147483648,3221225471],
                                 {97,<<"9babd37">>}},
 {'dbcore@db11.meritage.cloudant.net',[3221225472,4294967295],
                                  {47,<<"402970e">>}}]

rnewson commented 10 years ago

https://github.com/cloudant/fabric/commit/3983e281952a987f961aa153d2f6ffa79772ebb3 is the patch that introduces the 4-tuple variants.

Given that the update seqs of each shard are the same and from the same node, the question is why the first 4-tuple is a 3-tuple in the second sequence, and the second 3-tuple a 4-tuple.

I'm not sure it's the cause of the problem, but it's a head scratcher for me right now.

cc @kocolosk

rnewson commented 10 years ago

I should note that the {node, update_seq} values are the same in all cases, which means the two sequences are equivalent; so if a request using one of them as since= returns the other, that's a loop.

cc @willholley

snej commented 10 years ago

> The CBL replicator is simply saving the last seq number it receives in the _changes response results array. Then from what I can tell, the CBL replicator uses that saved seq value for the since= value in the next GET _changes.

That's correct, Len. And the replicator's checkpoint handling doesn't care whether the revisions it received were already known; it updates the checkpoint anyway. So after processing this batch of changes, the saved checkpoint (used as the next 'since' value) is the sequence of the last revision in the array.

The problem here is that this sequence ID is the same as the previous one. That's what causes the loop.

snej commented 10 years ago

To clarify: The CBL replicator does not pay attention to the top-level seq property of the response. It looks at the seq property of each individual change as it processes it. That way it can update its checkpoint to correspond to the last revision that it successfully pulled.
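
In pseudo-Objective-C, that behaviour is roughly the following (a simplified sketch with illustrative names, not the actual CBLChangeTracker source):

#import <Foundation/Foundation.h>

// Sketch only: the checkpoint advances with every entry in "results",
// whether or not the revision was already known locally.
static NSString *AdvanceCheckpoint(NSArray *results, NSString *lastSequenceID) {
    for (NSDictionary *change in results) {
        // ...hand change[@"id"] / change[@"changes"] to the replicator here...
        lastSequenceID = change[@"seq"];   // opaque token; stored, never parsed or compared
    }
    return lastSequenceID;                 // becomes ?since= on the next longpoll request
}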

lenkawell commented 10 years ago

Robert, any progress finding a fix? Would using continuous mode instead of longpoll in CBL avoid the problem? Or would that end up in a loop too?

rnewson commented 10 years ago

I think the only reason continuous replication avoids the issue here is that it's much less likely to use any given "seq" value at all (the _changes?feed=continuous connection just keeps sending updates or blank-line heartbeats).

I'm no closer to proving the bug or finding a fix and I have other demands on my time. I've pinged some other folks to take a look. This isn't forgotten but it's not my top priority unfortunately.

lenkawell commented 10 years ago

We're REALLY trying to submit a new version of our app to the App Store. Jens or Robert, do either of you have any suggestions for a work-around until this gets fixed?

snej commented 10 years ago

The only workaround I can think of is to edit CBLSocketChangeTracker. When the entire feed's been read successfully, update the 'lastSequenceID' property to the top-level last_seq value from the feed, instead of the last item's seq.
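
Roughly something like this (illustrative names only, not the real CBLSocketChangeTracker members):

// Sketch of the workaround: once the whole longpoll body has been parsed,
// prefer the top-level "last_seq" over the seq of the final "results" entry.
static NSString *CheckpointAfterFeedEnds(NSDictionary *responseBody,
                                         NSString *currentLastSequenceID) {
    NSString *lastSeq = responseBody[@"last_seq"];
    return (lastSeq.length > 0) ? lastSeq : currentLastSequenceID;
}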

rnewson commented 10 years ago

@lenkawell is this still a problem? If so, can you send me a full url that currently exhibits the loop? I tried the one from the original cloudant ticket and it seems not to happen there anymore.

I stress that we've not changed anything, so I'm assuming the database has been updated since and that's had the side-effect of fixing that particular loop.

lenkawell commented 10 years ago

Yes, at least as of August 27th, it was still a problem for one of our users and, as you've seen, the other user is no longer seeing the problem. I've updated the Cloudant support ticket with a log of the failure of the user's database who is still having the problem.

This has been the pattern that we've seen with this problem: it's a solid failure for weeks and then something happens, possibly some number of updates to the database, and the problem mysteriously "goes away." For example, it was happening for my own personal database for a couple weeks, during which I was making updates, but then something changed and it stopped happening and I haven't seen it again for weeks.

rnewson commented 10 years ago

@lenkawell thanks. I've used that database to capture all the seq values from the since= until the final row and confirmed that the final row is the same offset in all shards as the original, which yields the "loop". I captured every step in our bug tracker so we can puzzle over this even if that database is "healed" by an update.

pasin commented 9 years ago

@lenkawell has the issue described in this ticket been solved? Could you please let us know?

zgramana commented 9 years ago

Closing due to inactivity. Please re-file if it's not resolved.