CartoDB / cartoframes

CARTO Python package for data scientists
BSD 3-Clause "New" or "Revised" License
249 stars 63 forks source link

Feasibility of more widely supporting the Auth API #457

Open andy-esch opened 6 years ago

andy-esch commented 6 years ago

One of the downsides of the current implementation of cartoframes is the reliance on the master API key. With the introduction of the Auth API, there exists the possibility that cartoframes users can publish notebooks with their access token to a small number of resources for reading datasets or visualizing maps. While this works in some cases (e.g., with default_public for public datasets), it doesn't work if you want to provide only access to a table for the Maps API (e.g., there's an "Access Denied" error when the self.sql_client.send('SELECT unnest(current_schemas('f'))') code is run.)

One of the motivating factors for me is the recent introduction of the cartoframes.contrib.vector module which allows users to create CARTO VL maps.

I'd love to understand better what steps, if any, we can make to bring more authentication flexibility within cartoframes so user can make maps and share their access token if they wish.

@juanignaciosl, what's your view of where this stands?

juanignaciosl commented 6 years ago

First of all, I don't know what your specific problem with that query is. It works for me even without an API key:

~ $ curl -G https://juanignaciosl.carto.com/api/v2/sql --data-urlencode "q=SELECT unnest(current_schemas('f'))" --data-urlencode "api_key=XXXXXXXXXXX"
{"rows":[{"unnest":"juanignaciosl"},{"unnest":"cartodb"},{"unnest":"cdb_dataservices_client"},{"unnest":"public"}],"time":0.001,"fields":{"unnest":{"type":"name"}},"total_rows":4}

~ $ curl -G https://juanignaciosl.carto.com/api/v2/sql --data-urlencode "q=SELECT unnest(current_schemas('f'))" --data-urlencode "api_key=default_public"
{"rows":[{"unnest":"juanignaciosl"},{"unnest":"cartodb"},{"unnest":"cdb_dataservices_client"},{"unnest":"public"}],"time":0.004,"fields":{"unnest":{"type":"name"}},"total_rows":4} 

~ $ curl -G https://juanignaciosl.carto.com/api/v2/sql --data-urlencode "q=SELECT unnest(current_schemas('f'))"
{"rows":[{"unnest":"juanignaciosl"},{"unnest":"cartodb"},{"unnest":"cdb_dataservices_client"},{"unnest":"public"}],"time":0.003,"fields":{"unnest":{"type":"name"}},"total_rows":4}

That said, @luisbosque and @jorgesancha might want to talk about this, because I think that they're currently thinking about Engine authentication improvements.

andy-esch commented 6 years ago

The problem I'm experiencing: if I create an API key that's only for use with the Maps API, then I get the "Access Denied" error when running that SQL query (SELECT unnest(current_schemas('f'))).

juanignaciosl commented 6 years ago

Oh, I see. Might that be a bug, @javitonino ? I think that having keys with fewer permissions than default_ public is unexpected.

javitonino commented 6 years ago

If you create a key only for Maps API (aka, without SQL API), it won't let you access to the SQL API at all. That is expected. Yes, that means even less permissions that default_public, but it's a feature. I don't know if there are many use cases where you want access to Maps API and not SQL API, since it's hard to do anything without SQL API, but it's technically supported (aka: we might want to delete this feature xD)

andy-esch commented 6 years ago

👍 it works if I select SQL + Maps, thanks @javitonino

andy-esch commented 6 years ago

I'm going to add a section to the README about cartoframes + auth api since users will get different experiences depending on the permissions they set (including no access to Import API/CartoContext.write).

andy-esch commented 6 years ago

@javitonino, can you shed any light on this one?

I'm getting the following error when trying to use the DO with an auth token:

{"error":["Username is a mandatory argument, check it out"]}

Here's the call https://eschbacher.carto.com/api/v2/sql?q=%0AWITH+envelope+AS+%28%0ASELECT%0AST_SetSRID%28ST_Extent%28the_geom%29%3A%3Ageometry%2C+4326%29+AS+env%2C%0Acount%28%2A%29%3A%3Aint+AS+cnt%0AFROM+all_month_3%0A%29%0ASELECT+%2A%0AFROM+json_to_recordset%28%0A%28SELECT+OBS_GetMeta%28%0Aenvelope.env%2C%0A%28%27%5B%7B%22numer_id%22%3A+%22us.census.acs.B01003001%22%7D%2C+%7B%22numer_id%22%3A+%22us.census.acs.B15003017%22%2C+%22normalization%22%3A+%22predenominated%22%7D%2C+%7B%22numer_id%22%3A+%22us.census.acs.B19013001%22%7D%2C+%7B%22numer_id%22%3A+%22us.census.acs.B17001002%22%2C+%22normalization%22%3A+%22predenominated%22%7D%5D%27%29%3A%3Ajson%2C%0A10%2C+1%2C+envelope.cnt%0A%29+AS+meta%0AFROM+envelope%0AGROUP+BY+env%2C+cnt%29%29+as+data%28%0Adenom_aggregate+text%2C+denom_colname+text%2C%0Adenom_description+text%2C+denom_geomref_colname+text%2C%0Adenom_id+text%2C+denom_name+text%2C+denom_reltype+text%2C%0Adenom_t_description+text%2C+denom_tablename+text%2C%0Adenom_type+text%2C+geom_colname+text%2C%0Ageom_description+text%2Cgeom_geomref_colname+text%2C%0Ageom_id+text%2C+geom_name+text%2C+geom_t_description+text%2C%0Ageom_tablename+text%2C+geom_timespan+text%2C%0Ageom_type+text%2C+id+numeric%2C+max_score_rank+text%2C%0Amax_timespan_rank+text%2C+normalization+text%2C+num_geoms%0Anumeric%2Cnumer_aggregate+text%2C+numer_colname+text%2C%0Anumer_description+text%2C+numer_geomref_colname+text%2C%0Anumer_id+text%2C+numer_name+text%2C+numer_t_description%0Atext%2C+numer_tablename+text%2C+numer_timespan+text%2C%0Anumer_type+text%2C+score+numeric%2C+score_rank+numeric%2C%0Ascore_rownum+numeric%2C+suggested_name+text%2C%0Atarget_area+text%2C+target_geoms+text%2C+timespan_rank%0Anumeric%2C+timespan_rownum+numeric%29%0A&api_key=9BOX4GJow_sC6jlIcT4nCA

I'm not surprised it doesn't work, but I'm trying to consistently catch cases where the Auth API won't work so users aren't confused by cartoframes' behavior.

javitonino commented 6 years ago

So, the error comes from dataservices-api. So it's normal that it doesn't work in the sense that we probably don't want non-master API Keys running dataservices queries (although this is probably something we want to allow as an option).

If you run this same thing with the public API Key, it shows a better error: The api_key must be provided. The code has a match for public users in order to do this, here: https://github.com/CartoDB/dataservices-api/blob/6180b0052550b28fae678c68418ddeba48286c1b/client/renderer/templates/20_public_functions.erb#L13-L20. But if the role is unknown (like a api key role), it throws the error you see. So, couple of things:

But anyway, for cartoframes, I think you can treat this error in the same way as the no-api-key case, and show an error that observatory is only supported with master api keys.

andy-esch commented 6 years ago

❤️ thank you :)

Good idea on the error handling of this. One problem I see on my side is the HTTP status code comes up as 400 instead of 'not authorized' or something similar. What's the reasoning behind returning a 400 code here?

We may want to improve the error messaging in this case

Happy to help here if you'd like help

Do we want to allow non-master keys to run dataservices stuff?

It could be a cool feature. I could imagine setting up a key to give someone besides the account owner access to your data services quota as well as write access to a table, so the second party could do some work on a table in that person's account. This would give someone access to do some work on an account while allow the rest of the account's data to remain private.

javitonino commented 6 years ago

What's the reasoning behind returning a 400 code here?

Probably because when this was done, we never thought we've ever implement something like Auth API, haha. @ethervoid might know more, but I really don't think there is a good reason here. Maybe we can just change the HTTP return code, or be a bit clever and try to detect api key roles (format is {username}_role_{random_hex})

ethervoid commented 6 years ago

The problem is that the user doesn't have cdb_conf->>entity data or access to the cdb_conf table where the username is stored and the flag to know if it belongs to an organization or not.

The error code is correct because you're passing wrong values to the function, not passing/having the username, is not a problem of the API Key is a problem of permissions or faulty data.