(my notes for now)
How long does it take for one curl statement that we know will roll over 10,000 documents? Single user, local machine, attached to mains.
- `return false`: ~4.7 seconds for the last 10k (seq 28488 -> 38488)
- `return true`: ~5.4-5.8 seconds. We can presume this extra second is the time required to generate and move the JSON.
- ~28 seconds.
- ~38 seconds.
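For reference, the two baselines were produced by replacing the entire filter body (a sketch; the design doc and filter names are mine, not the real ones):

```js
// Baseline design-doc filters used for the timings above.
const ddoc = {
  _id: '_design/bench',
  filters: {
    nothing: 'function (doc, req) { return false; }',  // no doc passes: raw feed iteration cost
    everything: 'function (doc, req) { return true; }' // every doc passes: adds JSON generation + transfer
  }
};
// timed via e.g. curl 'http://localhost:5984/medic/_changes?filter=bench/everything&since=28488'
```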
Things we could do:

- Use `_all_docs` with this view and the appropriate keys to replicate first. Pros: docs don't change; the view call should be lightning fast. Cons: we'd have to hack / lie to Couch about its replication state, which means screwing with a 'private' (in the sense that there is no contract against it) API to set some values that will trick Couch into thinking it's replicated up to a point; and having to deal with all of the edge cases that replication Just Does.

From monitoring a non-filtered replication, CPU load on the lg app server goes from 0-10% to 15-30%, and up to 70% at times.
From eyeing it, it could be that the CPU load peaks when the docs are being requested: replication doesn't use bulk queries, it issues a zillion individual GETs. We could benchmark the difference between bulk GETs vs. individual GETs.
If that's the cause, another fix is to throttle the frequency of GETs a client can issue, to avoid Couch serving itself to death.
I expect we'll see a massive performance improvement when (and if) CouchDB 2 gets released with bulk doc fetching.
> No bulk get? There's `_all_docs` with `keys=[aaa, bbb, ccc]`, same thing.

Replication doesn't use it.
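For reference, the bulk equivalent looks something like this (a sketch; database name and ids are placeholders):

```js
// Bulk-fetching specific docs in one round trip: POST to _all_docs with a keys body.
const res = await fetch('http://localhost:5984/medic/_all_docs?include_docs=true', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ keys: ['aaa', 'bbb', 'ccc'] })
});
const { rows } = await res.json(); // one row per requested id, latest rev only
```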
To expand on that comment a little... have you ever noticed that when hitting the app as a restricted user in a new browser you get a 400 for the URL `/medic/_bulk_get?revs=true&attachments=true`, accompanied by the comment "The above 400 is totally normal. PouchDB is just detecting if the remote supports the _bulk_get API."? This is PouchDB doing feature detection for an API available in Couchbase and CouchDB v2. When this fails it does a whole bunch of single doc requests instead of using the `_all_docs` feature. I think this is because it wants the rev history. So when we get CouchDB v2 it'll eliminate all the single doc requests by using the new `_bulk_get` API instead.
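For comparison, the call PouchDB is probing for looks roughly like this (a sketch; ids and revs are placeholders):

```js
// The CouchDB v2 / Couchbase endpoint: fetch specific revisions, with rev
// history, in a single request. On CouchDB 1.x it 400s, as noted above.
const res = await fetch('http://localhost:5984/medic/_bulk_get?revs=true', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    docs: [{ id: 'aaa', rev: '3-abc123' }, { id: 'bbb', rev: '1-def456' }]
  })
});
```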
OK, so I've looked into replication using a view, and the gist is that it doesn't usefully exist.
It works by using a filter called `_view`, which reads the `view` parameter from your request, runs the current doc through that view's map function, and returns the document if there are any emits. So it's just as slow as running a filter.
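In PouchDB terms that's requested like this (design doc / view names are hypothetical), and despite the name it still runs the map function against every change:

```js
// filter: '_view' tells CouchDB to run each changed doc through the named view's
// map function; a doc passes if the map emits anything for it.
PouchDB.replicate('http://localhost:5984/medic', 'medic-local', {
  filter: '_view',
  view: 'medic/docs_by_user'
});
```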
When you think about how replication works I suppose this makes sense, since it runs through the changes feed change by change, and so that's not really possible to map cleanly to a view. In theory we can write whatever Erlang we like, which means it might be possible to do crazy hax and access internals and make something that uses a pre-computed view. At that point we're essentially monkey-patching CouchDB…
This leaves us with filtered replication as the only way to restrict documents inside the same database, which isn't great because it really doesn't scale. As we add more data, both initial replication times and continual sync times are going to grow super-linearly (each user adds not only a constant number of extra documents but also N documents per month or whatever).
Barring anyone else's good ideas, I think we need to plan to move off filtered replication completely. We can maybe optimise the Erlang and deploy more Couch nodes, but in the end the relationship between users and replication performance just does not scale.
The two obvious ways to move off filtered replication are to have a view for every user, or a DB for every user. The former is less effort to move to; the latter is more complicated but probably scales better (though perhaps one DB on multiple nodes would have similar scaling properties to N DBs).
> I think this is because it wants the rev history

Yeah, Pouch doesn't use `_all_docs` because that will only fetch the latest rev, while for replication it's fetching specific revs.
> The two obvious ways to move off filtered replication are to have a view for every user, or a DB for every user.

I'm an idiot, a view for every user doesn't work, I just said that. So DBs are the only alternative here.
Or replace replication with view queries. Basically reimplementing replication. Which is probably not a good idea because of infinite finicky little edge cases.
So here is the plan: I'm going to look at the feasibility of baking the list of who can access a document into the document itself on save, on the client, and specifically what kind of performance benefit that brings:

- [ ] benchmark a filter that just checks this pre-computed list via `_changes` (a sketch of such a filter is below)

^-- this, combined with my data above, should give us an idea of how much faster this could be. Based on this we might decide to:

- [ ] bake this into document save everywhere (adding `all` to the list or whatever)

And perhaps (but probably not):

- [ ] a `medic_api` migration script that fills this in for every existing document. This would only be worthwhile if the speedup was intense, since it means effectively everyone has to start again and replicate from the beginning (since all documents get edited by the migration)

Or consider:

- [ ] moving off filtered replication entirely
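A minimal sketch of the kind of shortcut filter this implies, as it would appear in the design doc. The `read_access` field and the `all` sentinel are illustrative assumptions, not the actual implementation:

```js
// A CouchDB filter reduced to a membership check against a pre-computed list.
// `doc.read_access` is a hypothetical array written by the client on save.
function (doc, req) {
  if (doc.read_access) {
    return doc.read_access.indexOf(req.userCtx.name) !== -1 ||
           doc.read_access.indexOf('all') !== -1;
  }
  return false; // or fall back to the original business-logic filter
}
```

The point is that the per-document work drops to an array scan, with no business logic left to evaluate.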
So, not moving off filtered replication?
Plan sgtm! Last checkbox is covered by https://github.com/medic/medic-webapp/issues/2284, both are interesting IMO.
Yeah, the last checkbox is just to show a sort of direction of thought, not something I'd do in this ticket. I'll link it.
In terms of not moving off filtered replication, I've had some conversations with @garethbowen about this and basically it's not really possible. There are no good solutions to remove filtered replication that don't add huge amounts of mental madness. One thing we've talked about is a sort of midway point where we still use filtered replication, but we massively limit what users have to sift through using either multiple instances / deploys, or multiple databases, per thing, where thing might be a district. Which would mean it's still not perfect, but it limits the amount of scaling the filtering has to deal with (ie now it's however large a district gets, not however large the entire deploy gets).
OK so performance results are weird. They lead me to the conclusion that the prime factors that affect filter performance are the implementation language and, surprisingly, the sheer amount of code in the filter:
(the section of documents tested is a different section, so these results compare to each other but not to others in the rest of this ticket. 10,000 documents where all but 27 [cbf working it out] have the compiled permissions. Tested three scenarios: the original code, a shortcut that can fall back on the original code, and just a shortcut (where those 27 docs return false))
Code | Time | Notes |
---|---|---|
All JS | ~54s | Code changes didn't make a difference |
Erlang Original | ~29.5s | |
Erlang with Shortcut and Fallback | ~29s | Barely faster at all |
Erlang with Shortcut and No Fallback | ~10s | Much faster |
Extra long Erlang | ~78s | With shortcut and fallback, and then a bunch more Erlang pasted afterwards that gets parsed but never runs |
Extra short Erlang | ~9.5s | With shortcut and no fallback, all comments removed |
This lines up with my `true` and `false` tests being so much faster: I implemented those by deleting all the code and replacing it with `return true` or `return false`, thus making the code much shorter.
This is both fascinating and frustrating at the same time. It means two things: we can get a significant win right now just by making the filter code as short as possible, and the per-document cost appears to be dominated by parsing the filter rather than running it, which is really a CouchDB problem to raise upstream.
^-- I uploaded my experimental filters and their results if anyone is interested / thinks I'm crazy.
I have also pinged the mailing list (I'll link once it appears on their web cache). This is really all we can do for now; once there is some movement on either / both of those I'll update.
In side news, there appears to be a fork of CouchDB that supports view replication: https://github.com/barrel-db/barrel . Not sure we are desperate enough to move to a fork for one feature, but there it is regardless.
OK but in other news I just saw this in the code: https://github.com/apache/couchdb-couch/blob/master/src/couch_changes.erl#L297-L319
I have no idea how to read Erlang, but it looks like there is an implication that (at least in master) there is a way to use views for replication and have it actually be fast.
OK so more digging:
> Now when the option `seq_indexed=true` is set in the design doc, the view filter in `_changes` will use it to retrieve the results. Compared to the current way, using a view index will be faster to retrieve changes. It also gives the possibility to filter changes by key or get changes in a key range. All the view options can be used.
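For reference, this is the shape of what I'm trying (a sketch; the design doc and view names are placeholders, and whether stock CouchDB honours the option is exactly what's in question):

```js
// Design doc with the (rcouch-derived) seq_indexed option set.
const ddoc = {
  _id: '_design/test_seq',
  options: { seq_indexed: true }, // the option under test
  views: {
    by_user: {
      map: "function (doc) { if (doc.read_access) { doc.read_access.forEach(function (u) { emit(u, null); }); } }"
    }
  }
};
// then: GET /medic/_changes?filter=_view&view=test_seq/by_user&key="some-user"
```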
Testing underway…
Yeah I'm not sure about this. I can add the option and nothing breaks, but I can't get the key option to also work. Looking through the code is… challenging.
I edited my view code so it logs whenever it's called, and adding those options doesn't stop the function from being called each time, which means we are no closer to running "fast" view code.
Mailing list thread: https://mail-archives.apache.org/mod_mbox/couchdb-user/201605.mbox/browser
I'm investigating a filtering workaround... report back soon. If you don't hear from me, send help!
About CouchDB's replication, a Couchbase Lite for iOS developer says:
> From my limited testing, the performance bottleneck in the current algorithm seems to be in fetching the new revisions from the source. I think this is due to the overhead of handling many separate HTTP requests. This was the rationale for adding _bulk_get in Couchbase Mobile.
>
> A limited case of the above-mentioned bulk-get optimization is possible with the standard API: revisions of generation 1 (revision ID starts with "1-") can be fetched in bulk via _all_docs, because by definition they have no revision histories. Unfortunately _all_docs can't include attachment bodies, so if it returns a document whose JSON indicates it has attachments, those will have to be fetched separately. Nonetheless, this optimization can help significantly, and is currently implemented in Couchbase Lite.
From https://github.com/couchbase/couchbase-lite-ios/wiki/Replication-Algorithm. Edit: that's for normal replication, not specifically filtered.
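That generation-1 optimisation in miniature (a sketch; `changes` here is assumed to be a decoded `_changes` response):

```js
// "1-" revs have no rev history, so they can be bulk-fetched via _all_docs;
// everything else still needs an individual GET with ?revs=true.
var gen1 = changes.results.filter(function (c) { return c.changes[0].rev.indexOf('1-') === 0; });
var rest = changes.results.filter(function (c) { return c.changes[0].rev.indexOf('1-') !== 0; });
// gen1: POST /medic/_all_docs?include_docs=true  body: { keys: [each gen1 id] }
// rest: GET  /medic/<id>?rev=<rev>&revs=true     one request per revision
```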
@estellecomment is that actually about the CouchDB implementation, or about their own? The way I read it, it sounded like performance notes for their own implementation, not an analysis of CouchDB's.
Sounds like it's about CouchDB's:

> These notes were derived from reading the API documentation on the CouchDB wiki and from conversation with engineers who've worked on CouchDB's replicator (Damien Katz and Filipe Manana). But don't take them as gospel.
Cloudant have documented CouchDB's replication algorithm here: http://www.replication.io/spec
> From my limited testing, the performance bottleneck in the current algorithm seems to be in fetching the new revisions from the source.
That's not where our bottleneck is right now. This is definitely a concern - fetching 1000s of docs one by one is not good. But the far bigger bottleneck for us currently is the filter.
The current proof of concept is in these two branches:
Needs more testing and profiling with a full db but it seems to work. I have to put back the initial replication feature as well as migration to the remote db. I'll work on this tomorrow.
Huh @garethbowen interesting twist! So instead of restricting the changes feed by id you pull the entire thing and then filter in node. That's a nice way of making sure you get to see all deletions without having to do two queries!
(Looks good so far :-) )
Yeah. It's not the way I intended to go, but we have to iterate through the changes feed for the deleted items anyway, so I think it makes sense to do both at the same time.
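The gist of that approach, as I understand it (a sketch only; the function and variable names are mine, not medic-api's):

```js
// Consume the *unfiltered* changes feed and filter per user in node. Deletions
// carry no doc body, so they're passed through rather than looked up again.
const request = require('request');

function changesForUser(allowedIds, since, callback) {
  request({
    url: 'http://localhost:5984/medic/_changes',
    qs: { since: since },
    json: true
  }, function (err, res, body) {
    if (err) {
      return callback(err);
    }
    const visible = body.results.filter(function (change) {
      return change.deleted || allowedIds.indexOf(change.id) !== -1;
    });
    callback(null, { last_seq: body.last_seq, results: visible });
  });
}
```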
This could be useful: https://github.com/pouchdb/pouchdb/issues/5263
@SCdF Two PRs created for this: https://github.com/medic/medic-webapp/pull/2347 and https://github.com/medic/medic-api/pull/75
Performance seems good locally. I want to merge to develop so I can update a production clone and test with real data, real hardware, real OS, and a remote server. Further changes or a complete revert may be necessary...
@garethbowen looks good dude, give it a crack!
Merged. Next step is to test it in a production like environment. Then clean up filters and other code if it works well.
Depending on how this test affects system load (ie if everything goes to plan it should massively reduce it) we may also want to look at increasing our costs slightly to two servers, with master->master replication between them. Currently (in PROD) the app is effectively non-functional because too many requests time out, presumably due to replication load.
We wouldn't even need any code changes to do this: we should be able to simply stand up another server (medic-api, url, the works) and have that be the one that admins / non-replicating people log in with. We would then set up live two-way (unfiltered) replication between those two servers.
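Setting that up is just two `_replicate` calls (a sketch; hostnames are placeholders):

```js
// Continuous, unfiltered replication in both directions between the two servers.
var jobs = [
  { source: 'http://server-a:5984/medic', target: 'http://server-b:5984/medic' },
  { source: 'http://server-b:5984/medic', target: 'http://server-a:5984/medic' }
];
jobs.forEach(function (job) {
  fetch('http://server-a:5984/_replicate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(Object.assign({ continuous: true }, job))
  });
});
```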
OK, it seems to work. Assigning to @SCdF for a secondary code review. Sorry about the many little changes but it was the easiest way to get it running in a production-like environment. You can try it out on lg-testing if you like.
OK I've read the code and I'm reasonably sure I get the gist of it.
My only major concern is whether or not this affects the request size from CHWs in a meaningful way, and what that does to their bill. If most IDs are UUIDs (32-36 characters, plus commas and what-not) that would make 1000 ids ~30-40kb, which isn't an unreasonable number to expect. I presume compression sucks over a string of UUIDs (randomness and all that).
30kb is a small number, and I've only had one coffee so far today, but:
Yes that was my biggest concern too. It does batch them together (up to 100) where possible, for example, if you've been offline for a while. But yes in the worst case the list of IDs will be sent for every document synced. This could be tweaked to only sync once an hour or similar which would optimise the batching at the expense of immediacy of data, but I'm comfortable with the way it is for now.
Yeah, so perhaps instead of continuous replication, we could replace that with something like (sketched below):

- 10: do a one-shot replication
- 20: wait `fallback(n)` seconds and try again (aka how continuous tries to connect now)
- 30: GOTO 10
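A minimal sketch of that loop, assuming PouchDB and a made-up backoff schedule:

```js
// One-shot replication in a retry loop instead of live/continuous replication.
// fallback(n) is an invented capped exponential backoff, not a real API.
function fallback(n) {
  return Math.min(30 * Math.pow(2, n), 6 * 60 * 60) * 1000; // 30s, 60s, ... capped at 6h
}

function poll(failures) {
  PouchDB.replicate('http://localhost:5984/medic', 'medic-local') // note: not { live: true }
    .on('complete', function () {
      setTimeout(function () { poll(0); }, fallback(0)); // success: reset the backoff
    })
    .on('error', function () {
      setTimeout(function () { poll(failures + 1); }, fallback(failures + 1));
    });
}

poll(0);
```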
I don't really want phones to be potentially 6 hours out of date. And if N is too small then you may actually increase data usage by making lots of requests with an empty result.
I don't think the current patch will increase the data costs noticeably.
Oh yeah, @garethbowen I missed this one: https://github.com/medic/medic-api/blob/develop/handlers/changes.js#L62
Does that actually work?
Define "work"? I haven't actually tested it... but that block should only be hit if someone manually crafts an invalid request, eg: they've added IDs they're not allowed to see, for which documents exist that aren't deleted.
I imagine it'll blow up horribly because it's not JSON but blowing up horribly is pretty much the plan.
As in, I presume you're trying to generate a 403? Does just writing "forbidden" actually work? Or should you be doing something more like this: https://github.com/medic/medic-api/blob/develop/handlers/changes.js#L78
Ahh yes, well, no, this code doesn't send a 403 response. This is because of the streaming nature of continuous feeds: by the time this code executes we have probably already sent headers to the client, and therefore cannot set any more, including a status code.
So you'll just get a 200 with "Forbidden" and it's not very semantic but then nobody legitimate should be seeing it anyway...
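To illustrate the constraint (a sketch of the general Node pattern, not the actual medic-api code):

```js
// Once the first chunk of a streaming response has gone out, the status line
// and headers are already on the wire, so only the body can be written to.
function denyChangesRequest(res) {
  if (res.headersSent) {
    res.write('Forbidden'); // client sees a 200 whose body ends with this
    res.end();
  } else {
    res.statusCode = 403;   // early enough to send a real status code
    res.end('Forbidden');
  }
}
```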
It now takes about 1 minute to fully load the app for a brand new user, which is the typical use case, and about 3.5 minutes to fully load the app with a fair amount of data, which is a good benchmark for users with migrated data. This is acceptable performance, though I think we should look further into how the amount of data relates to loading speed, as users with lots of data could encounter very slow speeds, particularly when reloading or opening the app after a hard close. More info here: https://github.com/medic/medic-webapp/pull/2398
Replication is slower and consumes more CPU than we'd expect. Investigate why this is.