Closed bendnorman closed 1 month ago
Also, in the com dev meeting we decided that doing Superset demos during the required interviews doesn’t make a ton of sense, so our July 17th deadline is moot. Do we still want to aim to have a public beta by the end of the month?
Superset provides a few predefined roles: Admin, Alpha, Gamma, Public and sql_lab. We want people who register to assume the Gamma role because it has the fewest permissions. By default, Gamma users don't have access to databases or SQL Lab, so we need to create a new role. To do this I exported the existing roles as JSON and created a new role called GammaSQLLab with the Gamma and sql_lab permissions. I also needed to add the all_database_access permission so it could access the database. This role can query the duckdb database and create charts and dashboards, but can't edit or add new data sources.
You can export and import roles using these commands in the superset container:
superset fab export-roles --path {path to write role to}.json
superset fab import-roles --path {path to edited roles}.json
Once the GammaSQLLab role is created, we can set it as the default registration role in config_supserset.py:
AUTH_USER_REGISTRATION_ROLE = "GammaSQLLab"
This is our desired registration and permissions workflow:
The correct way to do this is to assign users to oauth groups and map these groups to superset roles. I can't figure out how to create and manage groups using auth0. I tried using this Authorization extension but didn't get very far. I'm sure there is a way to make this work but it's above my pay grade.
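For reference, the Superset side of that mapping would look roughly like the config below. This is a sketch, not something we have running: the group names on the left are hypothetical, and as far as I understand Flask-AppBuilder only applies the mapping if the security manager's user info includes the group names under a `role_keys` entry. Getting those group names out of auth0 is the part that is still unsolved.

```python
# Sketch of Flask-AppBuilder settings for mapping OAuth groups to roles.
# The group names (dict keys) are hypothetical placeholders; the role names
# (dict values) are the Superset roles discussed in this thread.
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = "GammaSQLLab"  # default role for new registrations

AUTH_ROLES_MAPPING = {
    "superset-admins": ["Admin"],
    "superset-alpha": ["Alpha"],
}

# Re-evaluate the mapping on every login so group changes take effect.
AUTH_ROLES_SYNC_AT_LOGIN = True
```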
I figured out a workaround for now. We can create a superset-admin@catalyst.coop user in the auth0 User Management tab. Then, we can create a new superset admin user inside the container with this command:
superset fab create-admin \
--username 'auth0_auth0|{user_id generated by auth0}' \
--firstname {Superset} \
--lastname {Admin} \
--email {email} \
--password {password created in auth0}
Now we can log into superset with this admin account and give catalyst accounts admin or alpha permissions.
Now that the auth0 and permissions stuff is mostly working, I'm going to move on to the hosting infrastructure.
@bendnorman it sounds reasonable to me to extend the timeline. Seems like a good thing to discuss during inframundo sprint planning today.
http://data.catalyst.coop for Datasette because it was too generic -- like anything could be at that destination. PUDL is all data. https://superset.catalyst.coop would be more obviously specific to this project.
Some notes on usage data we have access to:
The most valuable information tracked in the superset database is the queries and the registered users. The query table in the superset database records every user query along with the user id. We can use this to see what types of queries people are running and which tables they are accessing. Here is a query to access this information:
SELECT
users.first_name,
users.last_name,
queries.*
FROM
"public"."query" AS queries
LEFT JOIN
"public"."ab_user" AS users
ON
users.id = queries.user_id
ORDER BY start_time DESC
LIMIT 1000;
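To get from those rows to "which tables people are accessing", one option is to tally the table names that appear in the query text. Below is a rough sketch, assuming the query text lives in a `sql` column of the query table and has already been pulled into a list of strings. The function name is made up, and the regex is only a heuristic: it picks up the identifier after FROM or JOIN, so it will count CTE names as tables and skip subqueries.

```python
import re
from collections import Counter

# Heuristic: capture the identifier that directly follows FROM or JOIN.
# Handles bare names and double-quoted, dot-separated names.
TABLE_REF = re.compile(r'\b(?:FROM|JOIN)\s+([\w."]+)', re.IGNORECASE)


def count_table_references(sql_texts):
    """Count how often each table name appears after FROM/JOIN."""
    counts = Counter()
    for sql in sql_texts:
        for ref in TABLE_REF.findall(sql):
            # Normalize "schema"."table" and mixed case to schema.table.
            counts[ref.replace('"', "").lower()] += 1
    return counts
```

Passing the `sql` values from the query above into `count_table_references(...)` and calling `.most_common(20)` on the result would give a first look at the most-used tables.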
I'm not sure if it's possible to track how long people spend on the site. Any thoughts @jdangerx?
We should also probably save all the Cloud Run logs to BigQuery or Cloud Storage using a Cloud Logging sink. By default, logs are only saved for 30 days. I think these logs will be helpful for debugging Cloud Run failures and for tracking Cloud Run resource use and total downloads.
Also @jdangerx, the auth0 screen has a warning about using development OAuth keys in production. I wonder if this is related to the HTTP redirect issue we're having.