Service account authentication for Swarming bots #200

Tracking bug and implementation plan.

Goals:
1. On GCE, use the GCE project service account to authenticate (via the GCE 
metadata service).
2. On other bots, use Cloud Console service accounts (the kind that needs a PEM 
file with an RSA secret).
3. Get rid of the IP whitelist as the main authentication mechanism.

Challenges:
1. The Swarming bootstrap code assumes an IP whitelist. bot_config.py (part of 
the swarming bot zip) shouldn't be anonymously readable, so the bootstrap URL 
should require some authentication. I'm considering generating signed bootstrap 
URLs on the front page, e.g. 
"""
Here's an even shorter command when you like to live dangerously:
python -c "import urllib; exec urllib.urlopen('https://chromium-swarm.appspot.com/bootstrap?sig=.............').read()"
"""

where 'sig' is basically a timestamp, a nonce and an HMAC over them. It is 
generated only if the user is logged in to the web site and has permission to 
bootstrap bots; a sketch of the scheme is below.
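A minimal sketch of how such signatures could be generated and validated, 
assuming an HMAC-SHA256 over 'timestamp.nonce' with a server-side secret; the 
function names and the '.'-separated format are made up for illustration:

"""
import base64
import hashlib
import hmac
import os
import time


def generate_bootstrap_sig(secret):
  # 'secret' is a server-side secret never shown to bots; the resulting 'sig'
  # embeds the timestamp and nonce so the server can verify and expire it.
  nonce = base64.urlsafe_b64encode(os.urandom(16)).rstrip('=')
  msg = '%d.%s' % (int(time.time()), nonce)
  mac = hmac.new(secret, msg, hashlib.sha256).hexdigest()
  return '%s.%s' % (msg, mac)


def validate_bootstrap_sig(secret, sig, max_age_sec=3600):
  try:
    ts, nonce, mac = sig.split('.')
  except ValueError:
    return False
  expected = hmac.new(secret, '%s.%s' % (ts, nonce), hashlib.sha256).hexdigest()
  # hmac.compare_digest needs Python 2.7.7+ (constant-time comparison).
  if not hmac.compare_digest(mac, expected):
    return False
  return int(time.time()) - int(ts) < max_age_sec
"""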

2. The oauth2client library is too heavy to include in swarming_bot.zip. Its 
dependencies are ['httplib2>=0.8', 'pyasn1==0.1.7', 'pyasn1_modules==0.0.5', 
'rsa==3.1.4', 'six>=1.6.1']. I hate vendoring all this code into the 
_appengine_ app (so that it can be included in swarming_bot.zip). Instead I'll 
use only the 'rsa' package (or even a subset of it... the ASN.1 parsing stuff 
is not vital) and reimplement the service account protocol myself (using the 
'rsa' package to generate RSA signatures). It's very simple, ~50 lines of code: 
generate a JWT, sign it with RSA using the service account private key, and 
send it to the Google Accounts endpoint to get back an access_token.
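A rough sketch of that ~50-line flow, assuming the 'rsa' package and a private 
key available in PKCS#1 PEM form (the 'rsa' package can't parse PKCS#8/PKCS#12 
without the ASN.1 helpers); the scope and helper names are illustrative, not 
final:

"""
import base64
import json
import time
import urllib
import urllib2

import rsa

TOKEN_URL = 'https://accounts.google.com/o/oauth2/token'
SCOPE = 'https://www.googleapis.com/auth/userinfo.email'


def _b64(data):
  # JWT uses URL-safe base64 without padding.
  return base64.urlsafe_b64encode(data).rstrip('=')


def get_access_token(service_account_email, private_key_pem):
  # Assemble and sign the JWT claim set.
  now = int(time.time())
  header = {'alg': 'RS256', 'typ': 'JWT'}
  claims = {
      'iss': service_account_email,
      'scope': SCOPE,
      'aud': TOKEN_URL,
      'iat': now,
      'exp': now + 3600,
  }
  unsigned = '%s.%s' % (_b64(json.dumps(header)), _b64(json.dumps(claims)))
  key = rsa.PrivateKey.load_pkcs1(private_key_pem)
  jwt = '%s.%s' % (unsigned, _b64(rsa.sign(unsigned, key, 'SHA-256')))
  # Exchange the signed JWT for an access_token.
  body = urllib.urlencode({
      'grant_type': 'urn:ietf:params:oauth:grant-type:jwt-bearer',
      'assertion': jwt,
  })
  return json.loads(urllib2.urlopen(TOKEN_URL, body).read())['access_token']
"""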

3a. Secret key distribution. I don't think Swarming should be solving this 
problem. Instead bot_config.py will grow a new callback 'get_service_account()' 
that returns a path to a file with the service account secret. On the 
chrome-infra side, Puppet can deploy the key somewhere and the chromium-specific 
bot_config.py will return a path to it. This item depends on (1), since to get 
bot_config.py in the first place, some sort of authentication is required.
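For illustration, the chromium-specific callback could be as small as this (the 
callback name is from the plan above; the path is a made-up example of whatever 
Puppet deploys):

"""
def get_service_account():
  # Path is deployed by Puppet, outside of Swarming; returning None could mean
  # "no service account configured, fall back to other auth".
  return '/creds/service_accounts/swarming_bot.pem'
"""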

3b. Relying on a secret key readable by the swarming bot is actually a 
regression compared to the IP whitelist :( The bot (and the tasks it runs) 
should not be able to copy a long-term credential like this. 

4. The Isolate client needs to know how to use OAuth2 too. I'm considering 
keeping all authentication code in the Swarming bot code and "outsourcing" it 
to the Isolate client as a "service" of some sort, e.g. swarming_bot could keep 
the current access_token in a file on disk somewhere and pass the path to this 
file to run_isolated.py. In the future, the service that keeps the access_token 
in a file on disk can be a separate thing, outside of Swarming. It can run as 
'root', for example, and thus it will solve problem (3b) to some extent.
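A sketch of that handoff, assuming a simple JSON token file; the file format 
and the --auth-token-file flag name are assumptions for illustration only:

"""
import json
import os


def write_token_file(path, access_token, expiry_ts):
  # Write-then-rename so run_isolated.py never reads a partially written file.
  tmp = path + '.tmp'
  with open(tmp, 'w') as f:
    json.dump({'access_token': access_token, 'expiry': expiry_ts}, f)
  os.rename(tmp, path)

# swarming_bot would then invoke the isolate client along the lines of:
#   python run_isolated.py --auth-token-file /b/swarming/oauth_token.json ...
"""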

-------------

Taking into account (2) and (3b), I will probably focus on supporting the GCE 
metadata method first. Basically:
1. Teach bootstrap URLs to use signature-based authentication.
2. Teach the swarming bot to grab an access_token from the GCE metadata server 
and use it (see the sketch after this list).
3. Teach the isolate client to use the access_token passed from the swarming 
bot.
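Fetching the token from the metadata server is just one HTTP request; a minimal 
sketch (the v1 metadata URL and required header are standard GCE, everything 
else is illustrative):

"""
import json
import urllib2

METADATA_TOKEN_URL = ('http://metadata.google.internal/computeMetadata/v1/'
                      'instance/service-accounts/default/token')


def get_gce_access_token():
  # The Metadata-Flavor header is required by the GCE metadata server.
  req = urllib2.Request(
      METADATA_TOKEN_URL, headers={'Metadata-Flavor': 'Google'})
  response = json.loads(urllib2.urlopen(req).read())
  # Response contains access_token, expires_in and token_type.
  return response['access_token']
"""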

Original issue reported on code.google.com by vadimsh@chromium.org on 20 Jan 2015 at 11:58