Tracking bug and implementation plan.
Goals:
1. On GCE, use the GCE project service account to authenticate (via the GCE
metadata service).
2. On other bots, use Cloud Console service accounts (the kind that needs a PEM
file with an RSA secret).
3. Get rid of the IP whitelist as the main authentication mechanism.
Challenges:
1. Swarming bootstrap code assumes the IP whitelist. bot_config.py (part of the
swarming bot zip) shouldn't be anonymously readable, and thus the bootstrap URL
should require some authentication. I'm considering generating signed bootstrap
URLs on the front page, e.g.
"""
Here's an even shorter command when you like to live dangerously:
python -c "import urllib; exec
urllib.urlopen('https://chromium-swarm.appspot.com/bootstrap?sig=.............')
.read()"
"""
where 'sig' is basically a timestamp, a nonce and an HMAC, generated only if the
user is logged in to the web site and has permission to bootstrap bots.
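A minimal sketch of how such a 'sig' could be produced and checked on the
server side (the exact format, separators and lifetime below are assumptions,
not a spec):
"""
import base64
import hashlib
import hmac
import os
import time

BOOTSTRAP_SIG_TTL_SEC = 3600  # assumed lifetime of a signed bootstrap URL


def generate_bootstrap_sig(secret):
  # Returns a 'sig' of the form <timestamp>.<nonce>.<hex hmac>. 'secret' is a
  # server-side HMAC key; a sig is handed out only to a logged-in user with
  # permission to bootstrap bots.
  ts = str(int(time.time()))
  nonce = base64.urlsafe_b64encode(os.urandom(16)).rstrip('=')
  msg = '%s.%s' % (ts, nonce)
  return '%s.%s' % (msg, hmac.new(secret, msg, hashlib.sha256).hexdigest())


def validate_bootstrap_sig(secret, sig):
  # True if 'sig' is authentic and not expired.
  try:
    ts, nonce, mac = sig.split('.')
    age = int(time.time()) - int(ts)
  except ValueError:
    return False
  expected = hmac.new(secret, '%s.%s' % (ts, nonce), hashlib.sha256).hexdigest()
  return hmac.compare_digest(mac, expected) and 0 <= age < BOOTSTRAP_SIG_TTL_SEC
"""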
2. The oauth2client library is too heavy to include into swarming_bot.zip. Its
dependencies are ['httplib2>=0.8', 'pyasn1==0.1.7', 'pyasn1_modules==0.0.5',
'rsa==3.1.4', 'six>=1.6.1']. I hate vendoring all this code into the _appengine_
app (so that it can be included into swarming_bot.zip). Instead I'll use only
the 'rsa' package (or even a subset of it... the ASN.1 parsing stuff is not
vital) and reimplement the service account protocol myself (using the 'rsa'
package to generate RSA signatures). It's very simple, ~50 lines of code:
generate a JWT, sign it with RSA using the service account private key, and
send it to the Google Accounts endpoint to get back an access_token.
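Roughly what those ~50 lines could look like (Python 2, using only 'rsa'; the
endpoint and claim set follow the documented Google OAuth2 service account
flow, but note that rsa.PrivateKey.load_pkcs1 only understands PKCS#1 PEM keys,
so converting the Cloud Console key format is glossed over here):
"""
import base64
import json
import time
import urllib

import rsa  # the only extra third-party package

TOKEN_URL = 'https://accounts.google.com/o/oauth2/token'


def _b64url(data):
  return base64.urlsafe_b64encode(data).rstrip('=')


def get_access_token(service_account_email, private_key_pem, scopes):
  # Self-signs a JWT with the service account key and exchanges it for an
  # OAuth2 access_token at the Google Accounts token endpoint.
  now = int(time.time())
  header = {'alg': 'RS256', 'typ': 'JWT'}
  claims = {
      'iss': service_account_email,
      'scope': ' '.join(scopes),
      'aud': TOKEN_URL,
      'iat': now,
      'exp': now + 3600,
  }
  to_sign = _b64url(json.dumps(header)) + '.' + _b64url(json.dumps(claims))
  key = rsa.PrivateKey.load_pkcs1(private_key_pem)  # expects PKCS#1 PEM
  jwt = to_sign + '.' + _b64url(rsa.sign(to_sign, key, 'SHA-256'))
  body = urllib.urlencode({
      'grant_type': 'urn:ietf:params:oauth:grant-type:jwt-bearer',
      'assertion': jwt,
  })
  reply = json.loads(urllib.urlopen(TOKEN_URL, body).read())
  return reply['access_token'], now + reply['expires_in']
"""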
3a. Secret key distribution. I don't think Swarming should be solving this
problem. Instead bot_config.py will grow a new callback 'get_service_account()'
that returns a path to a file with the service account secret. On the
chrome-infra side, Puppet can deploy the key somewhere and the chromium-specific
bot_config.py will return a path to it. This item depends on (1), since to get
bot_config.py in the first place, some sort of authentication is required.
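For example, a hypothetical chromium-side hook (the key path is made up for
illustration; the real location is whatever Puppet deploys):
"""
# In the chromium-specific bot_config.py.
import os

# Assumed location of the key deployed by Puppet; purely illustrative.
_SERVICE_ACCOUNT_KEY = '/creds/service_accounts/swarming_bot.pem'


def get_service_account():
  # Returns a path to the service account secret, or None if not deployed.
  if os.path.isfile(_SERVICE_ACCOUNT_KEY):
    return _SERVICE_ACCOUNT_KEY
  return None
"""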
3b. Relying on a secret key readable by the swarming bot is actually a
regression compared to the IP whitelist :( The bot (and the tasks it runs)
should not be able to copy a long-term credential like this.
4. The Isolate client needs to know how to use OAuth2 too. I'm considering
keeping all authentication code in the Swarming bot and "outsourcing" it to the
Isolate client as a "service" of some sort, e.g. swarming_bot could keep the
current access_token in a file on disk somewhere and pass the path to this file
to run_isolated.py. In the future, the service that keeps the access_token in a
file on disk can be a separate thing, outside of Swarming. It can run as
'root', for example, and thus would solve problem (3b) to some extent.
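One possible shape of that handoff (the token file format and the
run_isolated.py flag below are assumptions, not existing interfaces):
"""
import json
import subprocess
import tempfile


def run_isolated_with_auth(access_token, expiry_ts, extra_args):
  # Writes the current token where the isolate client can read it, then
  # points run_isolated.py at it via a hypothetical --auth-token-file flag.
  f = tempfile.NamedTemporaryFile(suffix='.json', delete=False)
  json.dump({'access_token': access_token, 'expiry': expiry_ts}, f)
  f.close()
  return subprocess.call(
      ['python', 'run_isolated.py', '--auth-token-file', f.name] + extra_args)
"""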
-------------
Taking into account (2) and (3b), I will probably focus on supporting the GCE
metadata method first. Basically:
1. Teach bootstrap URLs to use signature-based authentication.
2. Teach the swarming bot to grab an access_token from the GCE metadata server
and use it (sketched below).
3. Teach the isolate client to use the access_token passed from the swarming
bot.
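Step 2 boils down to one HTTP request to the metadata server (the URL and the
required header are the standard GCE metadata v1 interface):
"""
import json
import urllib2

METADATA_TOKEN_URL = (
    'http://metadata.google.internal/computeMetadata/v1/'
    'instance/service-accounts/default/token')


def get_gce_access_token():
  # Fetches the access token of the GCE project service account attached to
  # this instance; no long-term secrets are stored on the bot itself.
  req = urllib2.Request(
      METADATA_TOKEN_URL, headers={'Metadata-Flavor': 'Google'})
  reply = json.loads(urllib2.urlopen(req).read())
  return reply['access_token'], reply['expires_in']
"""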
Original issue reported on code.google.com by vadimsh@chromium.org on 20 Jan 2015 at 11:58