Another issue that concerns me is that with the current design the presence of a
large group will impact the performance of ALL replica services, for ALL requests,
including ones that do not need this large group at all.
I had another idea that is relatively simple to implement and would remove the
limitation on group size entirely.
Basically, instead of storing the list of members inline in the AuthGroup entity,
store it in another entity MemberListShard() with key ==
hash('\n'.join(members)) (i.e. content-addressed, read-only, outside of the main
Auth entity group). Then in AuthGroup itself store only references:
membersShards = [MemberListShard key1, MemberListShard key2, ...].
membersShards can in fact be a "hash map" (e.g. to look up the presence of user A,
we need to fetch membersShards[hash("A") % 16]).
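For concreteness, a minimal sketch of what this layout could look like with ndb. The names MemberListShard and membersShards follow the comment above; NUM_SHARDS, the helper functions, and the use of ndb.BlobProperty/ndb.KeyProperty are assumptions for illustration, not existing code:

```python
import hashlib

from google.appengine.ext import ndb

NUM_SHARDS = 16  # assumed shard count, matching the "% 16" example above


class MemberListShard(ndb.Model):
  """Immutable, content-addressed chunk of a group's member list.

  Lives outside the main Auth entity group; its key is derived from its
  content, so an entity with a given key never changes once written.
  """
  members = ndb.BlobProperty(compressed=True)  # newline-delimited identities


class AuthGroup(ndb.Model):
  # ... existing AuthGroup fields stay as-is ...
  membersShards = ndb.KeyProperty(kind=MemberListShard, repeated=True)


def shard_key(members):
  """Content-addressed key: hash of the sorted, newline-joined member list."""
  blob = '\n'.join(sorted(members))
  return ndb.Key(MemberListShard, hashlib.sha256(blob).hexdigest())


def store_member_list(members):
  """Splits members into NUM_SHARDS buckets and writes one shard per bucket.

  Returns the list of shard keys to store in AuthGroup.membersShards.
  """
  buckets = [[] for _ in range(NUM_SHARDS)]
  for m in members:
    buckets[hash(m) % NUM_SHARDS].append(m)
  keys = [shard_key(b) for b in buckets]
  ndb.put_multi([
      MemberListShard(key=k, members='\n'.join(sorted(b)))
      for k, b in zip(keys, buckets)
  ])
  return keys


def is_member(identity, members_shards):
  """Checks membership by fetching only the one shard that may contain it."""
  shard = members_shards[hash(identity) % NUM_SHARDS].get()
  return shard is not None and identity in shard.members.split('\n')
```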
Advantages of this approach:
1. Unlimited group size.
2. Group listing can be loaded "on demand" when needed, without sacrificing the
current property of full consistency (since MemberListShard entities are
immutable and content-addressed).
3. Group listing can be unloaded too :) E.g. Auth DB cache can implement
limited LRU cache of MemberListShard entities in memory.
4. The replication protocol would in fact run faster in most cases (though it would be
a bit more complicated). It would work like this (see the sketch below):
a) Master -> Replica: here's a list of AuthGroups with MemberListShard keys inside.
b) Replica -> Master: replica figures out which MemberListShard keys it doesn't have and sends that list to Master.
c) Master -> Replica: here's MemberListShard entities you've requested.
d) Repeat the protocol from the beginning.
This protocol has the nice property of converging even when the AuthDB is
giant.
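A rough sketch of the replica side of steps a)-c); the function names and payload shape here are illustrative, not the actual replication RPCs:

```python
def missing_shard_ids(groups, local_shard_ids):
  """Step (b): given the AuthGroups pushed by Master, list shard ids we lack.

  'groups' is the step (a) payload: a mapping {group name: [shard ids]}.
  'local_shard_ids' is the set of shard ids already stored on the replica.
  Shards are content-addressed and immutable, so anything already present
  is guaranteed to be current and never needs to be re-fetched.
  """
  referenced = set()
  for shard_ids in groups.values():
    referenced.update(shard_ids)
  return sorted(referenced - set(local_shard_ids))


def apply_revision(groups, local_shard_ids, fetch_shards, store_shards):
  """One protocol round: request and store only the missing shards.

  'fetch_shards' stands in for the step (c) RPC to Master, and
  'store_shards' persists the returned MemberListShard entities locally.
  """
  missing = missing_shard_ids(groups, local_shard_ids)
  if missing:
    store_shards(fetch_shards(missing))  # only the delta goes over the wire
  # After this point every group in 'groups' resolves from local shards.
```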
Original comment by vadimsh@chromium.org
on 23 Jan 2015 at 7:49
For the initial implementation there can be only one MemberListShard per group
(i.e. no hash map or sharding). But it would still be lazily loaded (to avoid
fetching giant groups all the time when they are not needed).
The entity itself can have a single field members =
BytesProperty(compressed=True) with content:
"a@example.com\na@example.com\nservice:b\nbot:d" (e.g. it's OK to make 'user:' the
default prefix, but I don't think we should do anything smarter than that; see the sketch below).
Old MemberListShard entities can probably stay in the DB forever, since we plan
to implement "group history" anyway. Btw, with content-addressed MemberListShard
the history entities would be really slim, since there's no need to copy member
lists all the time.
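A minimal sketch of this single-shard variant; the serialization helpers and the key scheme are my reading of the comments above, not existing code (standard ndb spells the compressed bytes field as BlobProperty(compressed=True)):

```python
import hashlib

from google.appengine.ext import ndb


class MemberListShard(ndb.Model):
  """Single content-addressed shard holding a whole group's member list."""
  # The comment proposes BytesProperty(compressed=True); standard ndb spells
  # the compressed bytes field as BlobProperty(compressed=True).
  members = ndb.BlobProperty(compressed=True)


def serialize_members(identities):
  """Newline-delimited blob with the implied 'user:' prefix stripped."""
  def shorten(ident):
    return ident[len('user:'):] if ident.startswith('user:') else ident
  return '\n'.join(sorted(shorten(i) for i in identities))


def put_member_list(identities):
  """Stores (or reuses) the shard for this exact member list; returns its key.

  The key is a hash of the content, so identical lists collapse into one
  entity and old shards can stay around cheaply for group history.
  """
  blob = serialize_members(identities)
  key = ndb.Key(MemberListShard, hashlib.sha256(blob).hexdigest())
  if key.get() is None:
    MemberListShard(key=key, members=blob).put()
  return key


def get_member_list(shard_key):
  """Lazily expands a group's member list only when it is actually needed."""
  shard = shard_key.get()
  if shard is None or not shard.members:
    return []
  return ['user:' + m if ':' not in m else m for m in shard.members.split('\n')]
```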
Original comment by vadimsh@chromium.org
on 23 Jan 2015 at 8:01
Issue 213 has been merged into this issue.
Original comment by no...@chromium.org
on 20 Feb 2015 at 4:25
Original comment by no...@chromium.org
on 20 Feb 2015 at 4:26
I did some stress testing of the existing implementation.
The maximum group size seems to be ~15K members (larger groups hit the datastore
entity size limit). Consequences of having such large groups:
1. Random requests get stuck for ~6 sec if they happen to hit a stale AuthDB
cache. 6 sec is the time needed to fetch several hundred small groups and a bunch
of really huge ones. Most of the time is actually spent in python, not in RPC :(
For that reason chromium-swarm-dev took ~3 sec instead of 6, because it's
running the F4 instance class.
2. Group listing in the UI is very slow (because it actually fetches everything
underneath).
3. The replication procedure takes ~40 sec.
4. Instances start to see more memory pressure. I've seen significantly more
"soft memory deadline" exceptions on chrome-infra-botmap-dev (it's
memory-hungry).
My conclusion so far:
1) A bunch of groups with 1K members is tolerable, but 10K is too much for the
current implementation. It introduces a large random (and thus annoying) delay.
UI handlers are affected the most (since no one really cares about the latency of
bot APIs).
2) Moving the list of members from inside the AuthGroup entity to a separate read-only
entity loaded on demand is probably enough to fix most issues.
3) Care must be taken to avoid unnecessary deserialization because of stupid
python (slow, eats a ton of memory). It's probably wise to keep the member list in
memory as a set of byte strings, instead of a set of namedtuples as implemented
now. Or even as a single blob of newline-delimited strings.
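For illustration, a tiny sketch of conclusion 3): keeping the decompressed member blob as a frozenset of byte strings with no per-member objects. This is an assumption about how it could be done, not a description of the current code:

```python
def make_member_set(blob):
  """Turns a newline-delimited member blob into a frozenset of byte strings.

  A frozenset of short strings is much cheaper than a set of namedtuples:
  no per-member tuple objects are created, and a membership check is a
  single hash of the raw identity string.
  """
  return frozenset(blob.split('\n')) if blob else frozenset()


def is_member(identity_str, member_set):
  """Membership check on the raw string, e.g. 'user:a@example.com'."""
  return identity_str in member_set
```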
Original comment by vadimsh@chromium.org
on 24 Feb 2015 at 1:49
Original issue reported on code.google.com by
maruel@chromium.org
on 23 Jan 2015 at 1:32