peterknife / boto

Automatically exported from code.google.com/p/boto
0 stars 0 forks source link

boto.s3.Key.add_user_grant() adds redundant grants #377

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hey, I'm a programmer at Yelp working a to-be-open-sourced framework to
make it easy to run map-reduce jobs on EMR (including copying over your
source tree and installing libraries.)

So, S3 apparently allows us to add several copies of the same grant to the
same user in an ACL. I didn't know this when using add_user_grant(), and
managed to slowly fill up the ACL for a key by granting read access on it
several different times.

Here's some code which will reproduce the bug:

from boto import s3
conn = s3.Connection(...) # fill with your credentials
my_id = conn.get_canonical_user_id()
bucket = conn.get_bucket(...) # your bucket name
key = bucket.new_key('i_love_acls')
key.set_contents_from_string('')
while True:
    print key.get_acl()
    key.add_user_grant('READ', my_id)

So, some of this is S3's fault, including the cryptic error message you get
once the ACL fills up (it just says the request violates its XML schema).

However, since add_user_grant() is part of the "simple" interface to ACLs,
I'd expect it to protect against adding the same permission several times.
Since add_user_grant() apparently has to download the ACL and rewrite it
anyway, this shouldn't impact performance.

As a bonus, we could add a more useful error error message when the ACL is
full.

(Observed this with boto version 1.9b)

Original issue reported on code.google.com by dave%yel...@gtempaccount.com on 20 May 2010 at 11:32

GoogleCodeExporter commented 9 years ago
Yes, I was aware of that behavior and have considered changing it.  I think the 
"right" solution would be for S3 
to do a better job of handling it.  The only real downside to handling it in 
boto is the extra roundtrip required 
to download the ACL first to do the check.  I think that's why I didn't include 
it in the first place but you have a 
good point.  It's not really fair to just push the whole problem onto the end 
user.

I'll make the change in the next day or two.

Regarding the error message, my basic philosophy has been not to try to 
reinterpret AWS errors but to just 
pass them on as is.  I've actually never seen that error.  I'll try to 
reproduce it and see if there is something we 
can do to improve it.

Original comment by Mitch.Ga...@gmail.com on 21 May 2010 at 12:23

GoogleCodeExporter commented 9 years ago
Ah, it's all coming back to me now 8^)

The reason that I didn't implement this before is that I couldn't figure out a 
way to do it.  If you look at the 
XML version of the ACL that is returned by S3, all of the users are described 
in the Canonical User format, like 
this:

<Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:type="CanonicalUser"><ID>45b449698a4d5769811e0ce97b248dd75e043fb861c72d2bc9e
ed9bd6e1d4c
76</ID><DisplayName>info</DisplayName></Grantee><Permission>READ</Permission></G
rant>

So, now let's say that someone wants to do a grant via an email address or a 
user id.  How do you map that 
email address or user id to a CanonicalUser id?  There is a method in the 
S3Connection class that can provide 
the CanonicalUser id for the currently authenticated user but I don't know how 
to do that for an arbitrary user.

Original comment by Mitch.Ga...@gmail.com on 25 May 2010 at 12:27

GoogleCodeExporter commented 9 years ago
Ah, interesting. Would it be possible to suppress duplicates for canonical user 
IDs
at least?

Also, would it work to remove any current duplicate entries from the XML 
returned by
S3 before we send it back? That way, we could have at most one duplicate entry 
at a
time in the ACL.

Original comment by dave%yel...@gtempaccount.com on 25 May 2010 at 5:58

GoogleCodeExporter commented 9 years ago
Issue 99 has been merged into this issue.

Original comment by Mitch.Ga...@gmail.com on 19 Jun 2010 at 1:37

GoogleCodeExporter commented 9 years ago
Just ran in to this one myself doing add_email_grant. The error message is 
definitely cryptic:

> <Error><Code>MalformedACLError</Code><Message>The XML you provided was not 
well-formed or did not validate against our published 
schema</Message><RequestId>D1385B7F82EC90EF</RequestId><HostId>7TVgUzb1b7DAile/P
xzrtKcECiqY2BAzkh94Hrg0nXcveCkFRf9YXtWu7DbYf2Ky</HostId></Error>

The email to username conversion seems to make it so you can't solve this 
perfectly, but as the original ticketer suggested, at least cleaning out the 
dupes on each new grant request would mitigate the problem by quite a lot. A 
max of 2 dupes is a lot better than the 98 I have in my ACL :) Would definitely 
be preferable if S3 solved this problem itself, though.

Original comment by winhamwr on 30 Jul 2010 at 9:00