madecoste / swarming

Automatically exported from code.google.com/p/swarming
Apache License 2.0
0 stars 1 forks source link

Add TaskRequest.acl #140

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
In some cases (Chromium use case), we want select tasks to have a fully public 
ACL. That is, the task can be viewed without the user to be logged in.

Original issue reported on code.google.com by maruel@chromium.org on 13 Aug 2014 at 6:36

GoogleCodeExporter commented 9 years ago

Original comment by maruel@chromium.org on 13 Aug 2014 at 6:36

GoogleCodeExporter commented 9 years ago
Can we do a string field with a group name instead of a flag? (Or even the list 
of group names).

The task will be visible for anyone that is in the group specified.

Original comment by vadimsh@chromium.org on 13 Aug 2014 at 6:40

GoogleCodeExporter commented 9 years ago
Interesting, I thought it much as a one-off, were it's all or nothing. I agree 
compartmented view could be interesting, but I don't want it to go too complex; 
remember that /user/tasks would become very complex to implement if complex 
ACLs are being used. I prefer federating on a different server for strongly 
different ACLs, at the cost of using different pools but that's also a natural 
safety measure.

Original comment by maruel@chromium.org on 13 Aug 2014 at 6:43

GoogleCodeExporter commented 9 years ago
(and for anonymous access we have a group * that includes all everyone)

Actually, if it's not super urgent task, I'd like to take it (will be able to 
work on it after Aug 22).

Original comment by vadimsh@chromium.org on 13 Aug 2014 at 6:52

GoogleCodeExporter commented 9 years ago
Likely to be impacting isolate server too? Not sure how to deal efficiently 
with this.

Original comment by maruel@chromium.org on 27 Aug 2014 at 7:57

GoogleCodeExporter commented 9 years ago
This can get tricky quickly since tasks triggered/stuff isolated from buildbot 
don't have owners, per-se.  If the triggering/isolating is done with OAuth 
credentials then you can just restrict to that user and have the slight 
different problem of loosening the ACLs.

Original comment by kmg@chromium.org on 27 Aug 2014 at 8:26

GoogleCodeExporter commented 9 years ago
I'm altering the proposition:

ACLs groups:
- swarming-admins override task ACL. These are the AppEngine admins and can 
control ACLs.
- swarming-privileged-users override task ACL. It should be near empty. They 
have view super powers but can't admin. E.g. sysadmins.
- swarming-users can view /user/tasks, e.g. non task ACL'ed metadata.
- others can't view anything by default.

Task visibility will be split in two:
- Group ACL'ed data. It follows ACL groups.
- Task ACL'ed data. It follows TaskRequest.acl unless user is privileged or 
admin.

In particular, the task ACL could be more permissive than the group ACL. This 
mean a task with ACL=everyone could be viewed anonymously only if the user 
knows the URL directly (for example through buildbot) but the anonymous user 
still cannot view /user/tasks.

Proposal:
Anything beside the following is private ACL'ed data:
TaskRequest:
- .created_ts, .name, .authenticated, .user, .priority, .expiration_ts, .tags
Basically, everything beside .properties is considered viewable by 
swarming-users. In particular, this excludes .properties.commands, the command 
being run.

TaskResultSummary:
- .created_ts, .name, .user, .state, .try_number, .costs_usd, .cost_saved_usd, 
.deduped_from, .bot_id, .failure, .internal_failure, .durations, .started_ts, 
.completed_ts, .abandoned_ts.
In particular, this excludes .stdout_chunks, .exit code, .server_versions.

TaskRunResult:
Nothing.

The only side channel leak I'm concerned about is TaskRequest.tags, which may 
contain semi-important metadata about the task. This is still important to be 
visible to power search.

Original comment by maruel@chromium.org on 16 Jan 2015 at 4:20

GoogleCodeExporter commented 9 years ago
Few general questions:
1. What exact problem the proposal is solving, e.g. why introduce both "Group 
ACL'ed data" and "Task ACL'ed data"?
2. What is "Group ACL'ed" and "Task ACL'ed" exactly? Based on "the task ACL 
could be more permissive than the group ACL" it seems that each task can have 
its own ACL (i.e. list of group how can view it? modify it?). I presume its 
"Task ACL'ed". Who is setting it and when? What is "Group ACL'ed", how and 
where it is defined?
3. What exactly is "private ACL'ed data"? Is it viewable by who? Is non-private 
data viewable by everyone?

Original comment by vadimsh@chromium.org on 16 Jan 2015 at 7:29

GoogleCodeExporter commented 9 years ago
Some thoughts with my current understanding.

Requirements:
0. ACL system should be as simple as possible. We prefer simplicity over 
flexibility.
1. We want multiple "organizations" (groups of people) to be able run tasks on 
a same Swarming service, e.g. "chromium", "quick office", etc.
2. Each task has an ACL (somehow derived from task request properties):
  a) Who can view it.
  b) Who can modify it (cancel, etc).
  c) Who can trigger it.
3. Tasks triggered by different organizations may or may not share a slave 
pool. (If we can cook fresh VM for each task, this problem disappears, let's 
assume we can't). If organization "A" trusts organization "B", they may share 
slave pool.
4. Each slave has an ACL (somehow derived from what?):
  a) Who can view its state.
  b) Who can modify its state (quarantine, reboot, provision).

Does this sound right?

Based on this, to keep everything simple, I propose to introduce a notion of 
"project":
1. A project has a name, e.g. "chromium".
2. A project has an ACL:
  a) List of groups who can VIEW project tasks.
  b) List of groups who can TRIGGER project tasks.
  c) List of groups who can OWN project tasks (perform all possible actions).
3. A project can be assigned one or more slave pools (via some sort of static 
config on the backend side, not when triggering). A slave pool can be shared 
between multiple projects.
4. When task is triggered, it includes a project it was triggered for in 
request properties. If caller was TRIGGER permission for the project, task is 
accepted.
5. Task listing would not list tasks per project (perhaps super-admins still 
can view all tasks across all projects).
6. Bot listing is only viewable by super-admins.
7. Bot page (list of tasks ran on the bot) is filtered to show only tasks 
viewable by the current user.

So individual task ACLs are derived from:
1) Project name specified when task was triggered.
2) ACL on the backend that contains mapping {project -> VIEW|TRIGGER|OWN -> 
[list of groups]}.

Task dimensions are implicitly scoped to a slave pool assigned to the project 
(e.g. if "chromium" slave pool doesn't have OS X machines, then triggering OS X 
task would be useless). Later (if needed) more fine-grained dimension ACLs can 
be implemented.

Original comment by vadimsh@chromium.org on 16 Jan 2015 at 7:59

GoogleCodeExporter commented 9 years ago
> 5. Task listing would not list tasks per project (perhaps super-admins still 
can view all tasks across all projects).

typo: "Task listing would _now_ list tasks per project"

Also to clarify:
VIEW permission on a task is all-or-nothing. I think splicing a task into 
"public" and "private" parts can be tricky and dangerous. Is there a good 
reason to show only some info about an otherwise "private" task? 

Original comment by vadimsh@chromium.org on 16 Jan 2015 at 8:03

GoogleCodeExporter commented 9 years ago
Adding notes via offline discussions, I'd like to settle to a forced "project" 
dimension, that would have values like "project": ["Administration", "Chrome", 
...]. This dimension would replace the "pool" used by chromium. Eventually this 
dimension could be shoehorned to apply ACLs via auth_service groups. I think 
it's relatively future proof, doesn't require a "Project" entity in the DB yet, 
still leave the door open to project-FOO-read-access groups and the likes, etc.

One thing I'll do is to enforce the names to be [a-zA-Z] for future proofing it.

Original comment by maruel@chromium.org on 28 Jan 2015 at 9:45

GoogleCodeExporter commented 9 years ago
Re "project-FOO-read-access": I considered this for other stuff. It leads to 
fast grow of total number of groups. I now prefer the model of having auth 
service keep groups that reflect overall organizational structure, and each 
service implementing its own simple ACL that references the groups (e.g. stores 
group names in the datastore). For swarming case, the ACL would be, for example:

class ProjectACL(ndb.Entity):
  readers = ndb.StringProperty(repeated=True)
  writers = ndb.StringProperty(repeated=True)
  ...

where entries are group names. It allows to create "chromium-members" group 
once and reuse it across different services, without creating tons of magical 
groups like "project-chromium-read-access" that really just pollute the global 
Groups namespace. It is sort of OK to have few hardcoded magical groups (e.g. 
"isolate-access"), but hardcoded magical "derived" group names are too much 
magic, imho.

Regardless, it's a minor detail that doesn't block introduction of "project" 
dimension.

Original comment by vadimsh@chromium.org on 28 Jan 2015 at 9:54