Another issue that concerns me is that with the current design the presence of a
large group will impact the performance of ALL replica services, for ALL requests,
including ones that do not need this large group at all.
I had another idea that is relatively simple to implement and would remove the
limitation on group size entirely.
Basically, instead of storing the list of members inline in the AuthGroup entity,
store it in another entity MemberListShard() with key ==
hash('\n'.join(members)) (i.e. content-addressed, read-only, outside of the main
Auth entity group). Then in AuthGroup itself store only references:
membersShards = [MemberListShard key1, MemberListShard key2, ...].
membersShards can in fact be a "hash map" (e.g. to look up the presence of user A,
we need to fetch membersShards[hash("A") % 16]).
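For concreteness, a minimal sketch of what this layout could look like with ndb. The names MemberListShard and membersShards follow the comment above; NUM_SHARDS, the helper functions, and the use of ndb.BlobProperty/ndb.KeyProperty are assumptions for illustration, not existing code:

```python
import hashlib

from google.appengine.ext import ndb

NUM_SHARDS = 16  # assumed shard count, matching the "% 16" example above


class MemberListShard(ndb.Model):
  """Immutable, content-addressed chunk of a group's member list.

  Lives outside the main Auth entity group; its key is derived from its
  content, so an entity with a given key never changes once written.
  """
  members = ndb.BlobProperty(compressed=True)  # newline-delimited identities


class AuthGroup(ndb.Model):
  # ... existing AuthGroup fields stay as-is ...
  membersShards = ndb.KeyProperty(kind=MemberListShard, repeated=True)


def shard_key(members):
  """Content-addressed key: hash of the sorted, newline-joined member list."""
  blob = '\n'.join(sorted(members))
  return ndb.Key(MemberListShard, hashlib.sha256(blob).hexdigest())


def store_member_list(members):
  """Splits members into NUM_SHARDS buckets and writes one shard per bucket.

  Returns the list of shard keys to store in AuthGroup.membersShards.
  """
  buckets = [[] for _ in range(NUM_SHARDS)]
  for m in members:
    buckets[hash(m) % NUM_SHARDS].append(m)
  keys = [shard_key(b) for b in buckets]
  ndb.put_multi([
      MemberListShard(key=k, members='\n'.join(sorted(b)))
      for k, b in zip(keys, buckets)
  ])
  return keys


def is_member(identity, members_shards):
  """Checks membership by fetching only the one shard that may contain it."""
  shard = members_shards[hash(identity) % NUM_SHARDS].get()
  return shard is not None and identity in shard.members.split('\n')
```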
Advantages of this approach:
1. Unlimited group size.
2. Group listing can be loaded "on demand" when needed, without sacrificing the
current property of full consistency (since MemberListShard entities are
immutable and content-addressed).
3. Group listing can be unloaded too :) E.g. Auth DB cache can implement
limited LRU cache of MemberListShard entities in memory.
4. The replication protocol would in fact run faster in most cases (though it would be
a bit more complicated). It would work like this (see the sketch below):
a) Master -> Replica: here's a list of AuthGroups with MemberListShard keys inside.
b) Replica -> Master: replica figures out which MemberListShard keys it doesn't have and sends that list to Master.
c) Master -> Replica: here's MemberListShard entities you've requested.
d) Repeat the protocol from the beginning.
This protocol has the nice property of converging even when the AuthDB is
giant.
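A rough sketch of the replica side of steps a)-c); the function names and payload shape here are illustrative, not the actual replication RPCs:

```python
def missing_shard_ids(groups, local_shard_ids):
  """Step (b): given the AuthGroups pushed by Master, list shard ids we lack.

  'groups' is the step (a) payload: a mapping {group name: [shard ids]}.
  'local_shard_ids' is the set of shard ids already stored on the replica.
  Shards are content-addressed and immutable, so anything already present
  is guaranteed to be current and never needs to be re-fetched.
  """
  referenced = set()
  for shard_ids in groups.values():
    referenced.update(shard_ids)
  return sorted(referenced - set(local_shard_ids))


def apply_revision(groups, local_shard_ids, fetch_shards, store_shards):
  """One protocol round: request and store only the missing shards.

  'fetch_shards' stands in for the step (c) RPC to Master, and
  'store_shards' persists the returned MemberListShard entities locally.
  """
  missing = missing_shard_ids(groups, local_shard_ids)
  if missing:
    store_shards(fetch_shards(missing))  # only the delta goes over the wire
  # After this point every group in 'groups' resolves from local shards.
```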
Original comment by vadimsh@chromium.org
on 23 Jan 2015 at 7:49
For the initial implementation there can be only one MemberListShard per group
(i.e. no hash map or sharding). But it would still be lazily loaded (to avoid
fetching giant groups all the time when they are not needed).
The entity itself can have a single field members =
BytesProperty(compressed=True) with content:
"a@example.com\na@example.com\nservice:b\nbot:d" (e.g. it's OK to make 'user:' the
default prefix, but I don't think we should do anything smarter than that; see the sketch below).
Old MemberListShard entities can probably stay in the DB forever, since we plan
to implement "group history" anyway. Btw, with content-addressed MemberListShard
the history entities would be really slim, since there's no need to copy member
lists all the time.
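A minimal sketch of this single-shard variant; the serialization helpers and the key scheme are my reading of the comments above, not existing code (standard ndb spells the compressed bytes field as BlobProperty(compressed=True)):

```python
import hashlib

from google.appengine.ext import ndb


class MemberListShard(ndb.Model):
  """Single content-addressed shard holding a whole group's member list."""
  # The comment proposes BytesProperty(compressed=True); standard ndb spells
  # the compressed bytes field as BlobProperty(compressed=True).
  members = ndb.BlobProperty(compressed=True)


def serialize_members(identities):
  """Newline-delimited blob with the implied 'user:' prefix stripped."""
  def shorten(ident):
    return ident[len('user:'):] if ident.startswith('user:') else ident
  return '\n'.join(sorted(shorten(i) for i in identities))


def put_member_list(identities):
  """Stores (or reuses) the shard for this exact member list; returns its key.

  The key is a hash of the content, so identical lists collapse into one
  entity and old shards can stay around cheaply for group history.
  """
  blob = serialize_members(identities)
  key = ndb.Key(MemberListShard, hashlib.sha256(blob).hexdigest())
  if key.get() is None:
    MemberListShard(key=key, members=blob).put()
  return key


def get_member_list(shard_key):
  """Lazily expands a group's member list only when it is actually needed."""
  shard = shard_key.get()
  if shard is None or not shard.members:
    return []
  return ['user:' + m if ':' not in m else m for m in shard.members.split('\n')]
```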
Original comment by vadimsh@chromium.org
on 23 Jan 2015 at 8:01
Issue 213 has been merged into this issue.
Original comment by no...@chromium.org
on 20 Feb 2015 at 4:25
Original comment by no...@chromium.org
on 20 Feb 2015 at 4:26
I did some stress testing of the existing implementation.
The maximum group size seems to be ~15K members (larger groups hit the datastore
entity size limit). Consequences of having such large groups:
1. Random requests get stuck for ~6 sec if they happen to hit a stale AuthDB
cache. 6 sec is the time needed to fetch several hundred small groups and a bunch
of really huge ones. Most of the time is actually spent in python, not in RPC :(
For that reason chromium-swarm-dev took ~3 sec instead of 6, because it's
running the F4 instance class.
2. Group listing in the UI is very slow (because it actually fetches everything
underneath).
3. The replication procedure takes ~40 sec.
4. Instances start to see more memory pressure. I've seen significantly more
"soft memory deadline" exceptions on chrome-infra-botmap-dev (it's
memory-hungry).
My conclusion so far:
1) A bunch of groups with 1K members is tolerable, but 10K is too much for the
current implementation. It introduces a large random (and thus annoying) delay.
UI handlers are affected the most (since no one really cares about the latency of
bot APIs).
2) Moving the list of members from inside the AuthGroup entity to a separate read-only
entity loaded on demand is probably enough to fix most issues.
3) Care must be taken to avoid unnecessary deserialization because of stupid
python (slow, eats a ton of memory). It's probably wise to keep the member list in
memory as a set of byte strings, instead of a set of namedtuples as implemented
now. Or even as a single blob of newline-delimited strings.
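For illustration, a tiny sketch of conclusion 3): keeping the decompressed member blob as a frozenset of byte strings with no per-member objects. This is an assumption about how it could be done, not a description of the current code:

```python
def make_member_set(blob):
  """Turns a newline-delimited member blob into a frozenset of byte strings.

  A frozenset of short strings is much cheaper than a set of namedtuples:
  no per-member tuple objects are created, and a membership check is a
  single hash of the raw identity string.
  """
  return frozenset(blob.split('\n')) if blob else frozenset()


def is_member(identity_str, member_set):
  """Membership check on the raw string, e.g. 'user:a@example.com'."""
  return identity_str in member_set
```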
Original comment by vadimsh@chromium.org
on 24 Feb 2015 at 1:49
Original issue reported on code.google.com by
maruel@chromium.org
on 23 Jan 2015 at 1:32