google-code-export / s3ql

Automatically exported from code.google.com/p/s3ql

Consider supporting Google Storage API v2 with OAuth2 authentication #408

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Google Storage supports OAuth2 authentication for its APIs, eliminating the 
need to enable legacy access support and developer keys 
(https://developers.google.com/accounts/docs/OAuth2, 
https://developers.google.com/storage/docs/authentication). It may be good to 
support this in S3ql in the interest of security and forward compatibility.

The boto OAuth2 plugin bundled with Google's gsutil tool code may offer some 
insight into what is needed to implement this in S3ql 
(https://developers.google.com/storage/docs/gsutil_install).

Original issue reported on code.google.com by nicola.c...@gmail.com on 23 Jul 2013 at 12:27

GoogleCodeExporter commented 9 years ago
I looked over the OAuth documentation, but I still don't fully understand why 
using it for S3QL would be an advantage.

As I understand it, OAuth is great if you want to give a third-party webservice 
partial access to your Google account. However, I don't quite see the point of 
using it for a local application. On the contrary, to me it seems like a step 
back if mount.s3ql would need to fork a browser(!) where you have to enter your 
credentials. That's a lot of inconvenience for no perceptible gain (if you do 
not trust S3QL with your account, you have already lost at this point, because 
as a local application S3QL could easily capture whatever credentials you enter 
in the browser window).

Or did I misunderstand something about OAuth?

Original comment by Nikolaus@rath.org on 23 Jul 2013 at 3:49

GoogleCodeExporter commented 9 years ago
There are a couple of reasons why OAuth2 may be a useful option. In this context, 
I can only speak about access to Google Storage, as I don't know what other 
backends do for OAuth2 support, if anything.

In general, one reason for supporting OAuth2 is simply that it is not tied to 
Google Storage API v1; Google's own Storage developer documentation encourages 
the use of API v2 
(https://developers.google.com/storage/docs/reference/v1/apiversion1):

"Google Cloud Storage XML API version 1.0 remains available but we recommend 
that you transition to API v2.0 as it is more secure and offers more support 
for working with projects.

API v1.0 currently still supports HMAC authentication and developer keys and 
interoperability with some tools written for other cloud storage systems, but 
if you do not require this, we recommend that you transition to using API v2.0 
instead."

An argument could be made that S3ql can be considered in that category of tools 
written for other cloud storage systems, and if there are no concrete plans by 
Google to drop API v1 yet, there may be no need to support API v2. That's fair 
enough, though of course that situation may change in the future.

About OAuth2 (https://developers.google.com/storage/docs/authentication):

"Google recommends OAuth 2.0 authentication for interacting with the Google 
Cloud Storage API. OAuth 2.0 authentication eliminates the need to provide 
sensitive account information, such as a username and password, to any 
applications that need access to your data. Instead, you can use the OAuth 2.0 
protocol to obtain and give out OAuth tokens. OAuth tokens authenticate tools 
and applications to access Google Cloud Storage API on your behalf and also 
provides the ability to restrict access using scopes. You can authorize 
different applications with separate tokens, and revoke tokens individually, if 
necessary."

The three defined scopes (read-only, read-write, and full-control) may not be 
too useful for S3ql, as I imagine it would mostly (only?) use the read-write 
scope for normal operation. However, the ability to revoke tokens individually 
may become useful in a scenario where S3ql is used from several locations at 
once (never accessing the same bucket/prefix pair of course, but nevertheless 
accessing the same Google Storage project) and it becomes necessary to revoke 
access to one or some of those locations, but not all. The hashed developer 
keys from API v1 work fine, but define an all-or-nothing type of access and 
must be shared among clients, and furthermore Google Storage allows creation of 
only up to five key pairs 
(https://developers.google.com/storage/docs/reference/v1/getting-startedv1#keys 
-- as opposed to, I believe, any number of access tokens for OAuth2 
applications -- citation needed, will try to find out if there's a cap).

As I understand it, there is still a certain degree of initial setup needed when 
using legacy hashed keys -- one needs to log in as an owner to the Google 
Storage web console to generate suitable keys and copy them to the S3ql 
authfile as backend-login and backend-passphrase. When switching to OAuth2, it 
is my understanding things would be similar: one would have to do an initial 
login and register an application with Google Storage, obtaining an access 
token at the end of the process which allows scoped access. In addition to 
that, an application may also obtain a refresh token, which is longer-lived and 
can be stored to request more access tokens later; this refresh token would be 
what gets stored in the S3ql authfile. Google's gsutil tool does something 
similar via its 'config' command and an oauth2-enabled version of the boto 
library (https://developers.google.com/storage/docs/gsutil_install#oauth2), but 
instead of spawning a browser it generates a URL, inviting the user to visit it 
to register the application and afterwards input the resulting authorization 
code at the waiting console prompt. I agree it may be a bit cumbersome, but 
nevertheless, once the refresh token is stored in the boto configuration file 
(or the S3ql authfile), this process need not be repeated for that application 
(i.e., that instance of S3ql).
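
To sketch what the gsutil-style flow boils down to, here is a rough Python 
outline of the three requests involved. The endpoint URLs and the scope are 
Google's published values; the client id/secret are placeholders that a real 
application would obtain by registering in the Google API console, and the 
actual HTTP POSTs are left out -- this only shows how the requests would be 
constructed:

```python
import urllib.parse

# Google's published OAuth2 endpoints and the Google Storage read-write scope.
AUTH_URI = "https://accounts.google.com/o/oauth2/auth"
TOKEN_URI = "https://accounts.google.com/o/oauth2/token"
SCOPE = "https://www.googleapis.com/auth/devstorage.read_write"
# Out-of-band redirect: instead of a callback, Google shows the user an
# authorization code to paste into the waiting console prompt.
REDIRECT_URI = "urn:ietf:wg:oauth:2.0:oob"

def authorization_url(client_id):
    """URL the user is invited to visit in a browser to grant scoped access."""
    return AUTH_URI + "?" + urllib.parse.urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": REDIRECT_URI,
        "scope": SCOPE,
    })

def code_exchange_body(client_id, client_secret, auth_code):
    """Form body (POSTed to TOKEN_URI) exchanging the one-time authorization
    code for an access token plus a long-lived refresh token."""
    return urllib.parse.urlencode({
        "grant_type": "authorization_code",
        "code": auth_code,
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": REDIRECT_URI,
    })

def refresh_body(client_id, client_secret, refresh_token):
    """Form body (POSTed to TOKEN_URI) requesting a fresh access token from
    the stored refresh token -- the step S3ql would repeat at mount time."""
    return urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    })
```

Only the refresh token (plus client id/secret) needs to be persisted; the 
first two steps happen once, at setup time.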

Doing this setup can be left to the user, just as the legacy hashed key setup 
is currently, and using OAuth2 can be an option, not necessarily a requirement. 
I do realize, though, that it would take a bit of effort to implement this, so 
I'm fine with its current status as a wish-list item at the moment, but I would 
like to submit it for consideration.
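
To make the idea concrete, a hypothetical authfile entry (the exact key names 
and format would be entirely up to the implementation) might look something 
like this, with the refresh token taking the place of the legacy key:

```
storage-url: gs://mybucket/myprefix
backend-login: oauth2
backend-password: <stored refresh token>
```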

Original comment by nicola.c...@gmail.com on 24 Jul 2013 at 1:00

GoogleCodeExporter commented 9 years ago
So it seems to me that the only advantage of OAuth is that you are not 
restricted to 5 key pairs and can give finer-grained access (though for 
real security that means you'd have to generate the tokens on a different 
machine than the one that runs S3QL).

On the minus side, you are forced to use SSL, and you need to install gsutil 
(or some other program) to obtain the OAuth token.

The latter could probably be avoided if Google supported Resource Owner Password 
Credentials in its implementation (cf. 
http://tools.ietf.org/html/draft-ietf-oauth-v2-22#section-1.3.3), but I haven't 
found any information about that.
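
For reference, a Resource Owner Password Credentials token request as sketched 
in that draft looks roughly like this (adapted from the example in section 
4.3.2 -- credentials go straight into the token request, no browser step):

```
POST /token HTTP/1.1
Host: server.example.com
Authorization: Basic czZCaGRSa3F0MzpnWDFmQmF0M2JW
Content-Type: application/x-www-form-urlencoded

grant_type=password&username=johndoe&password=A3ddj3w
```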

All in all, I'm not convinced that implementing this would be time well spent. 
But if you want to submit a patch, I'll probably accept it.

Original comment by Nikolaus@rath.org on 24 Jul 2013 at 2:51

GoogleCodeExporter commented 9 years ago
I understand. Time permitting, I'll see what I can do.

Original comment by nicola.c...@gmail.com on 24 Jul 2013 at 5:01

GoogleCodeExporter commented 9 years ago

Original comment by Nikolaus@rath.org on 22 Apr 2014 at 2:55

GoogleCodeExporter commented 9 years ago
Fixed in 
https://bitbucket.org/nikratio/s3ql/commits/47298c473ceb31e5490d0b5cf4982e274937b29e

Original comment by Nikolaus@rath.org on 26 Apr 2014 at 9:45