unicef / iogt

BSD 2-Clause "Simplified" License
25 stars 36 forks source link

Suggestion: add Jinja for Superset #1387

Open cbunicef opened 2 years ago

cbunicef commented 2 years ago

As an ADMINISTRATOR I want the access of a user to data in Superset to match the access of the user to data in Wagtail. I don't want to maintain two separate sets of permissions to the data.

As an ADMINISTRATOR I want my Superset datasets to be as well-organized as possible.

Suggested solution:

Through Jinja we can use SQL templates to accomplish both of these user stories. https://superset.apache.org/docs/installation/sql-templating/#available-macros

{{ current_username() }} will allow us to user the Superset user's username (email address) in our queries, match it to the email address of their Wagtail account, and use JOIN and WHERE statements to limit their access to data. Example:

SELECT questionnaires_usersubmission_w_path.* /* cannot find a solution to drop the page_path column */
FROM (
  SELECT auth_group.name AS auth_group_name, auth_group.id AS auth_group_id, wagtailcore_grouppagepermission.permission_type AS auth_permission_type, wagtailcore_page.path AS auth_permission_path /* normal SELECT statement */
  /*SELECT auth_group.*, wagtailcore_grouppagepermission.*, wagtailcore_page.*, iogt_users_user.* for testing */
  FROM auth_group /* collect auth group id */
  INNER JOIN wagtailcore_grouppagepermission /* join in  group page permissions, which is based on a selected page id eg "1" or "568" */
  ON auth_group.id = wagtailcore_grouppagepermission.group_id
  INNER JOIN wagtailcore_page /* join in page path, which describes the tree structure to a page eg "000100010001" */
  ON wagtailcore_grouppagepermission.page_id = wagtailcore_page.id
  LEFT JOIN iogt_users_user_groups /* join in which users are members of which groups */
  ON auth_group.id = iogt_users_user_groups.group_id
  LEFT JOIN ( /* join in user information. we include only users that are active. this is done in a subclause so that password isn't passed into the main clause, even though the passwords are hashed.

  1) is this user active - implemented
  3) is the user a superuser - not yet implemented, if they are then they should have an added record showing edit permission at page id "1" or page path "0"

  we do this in a subclause so that we never pass the password field further in the main clause

*/  
    SELECT iogt_users_user.id,  iogt_users_user.is_superuser, iogt_users_user.email, iogt_users_user.is_active
    FROM iogt_users_user
    WHERE iogt_users_user.is_active = 'true'
  ) iogt_users_user
  ON iogt_users_user_groups.id = iogt_users_user.id
  WHERE iogt_users_user.email = '{{ current_username() }}' AND wagtailcore_grouppagepermission.permission_type = 'edit'
  /*
  here we filter permissions based on the individual user. we would use Jinja to bring in the Superset username for the current user, 
  which would need to match the email address in their Wagtail user account. Superset considers the current_username() when checking 
  cache for the dataset, so there should be no issues with User A receiving a cached dataset with User B's permissions.

  a few things that need to be tested or considered here -
  1) what happens if there are multiple Groups giving the user edit permissions?
  2) is there any permission beyond "edit" that should be included, eg "publish"?
  3) is there any way this could be exploited by non-permissioned users who could register in Wagtail with any given email address? my initial thought is no, because the user has to have Superset credentials as wagtailcore_locale 
  4) as mentioned above, possibly we need a way to give superusers permission to all data even if they don't have it through Group permissions

  info on Jinja - https://preset.io/blog/intro-jinja-templating-apache-superset/#pre-defined-jinja-macros-in-superset
*/
) apache_user_permission_path
LEFT JOIN ( /* now that we have collected the page path permissions for the logged in Superset user, join in the questionnaires_usersubmission dataset */
  SELECT questionnaires_usersubmission.*,wagtailcore_page.path AS page_path /* in this subclause, we first add the page path for each submission based on page id */
  FROM questionnaires_usersubmission 
  INNER JOIN wagtailcore_page 
  ON questionnaires_usersubmission.page_id = wagtailcore_page.id
) questionnaires_usersubmission_w_path
ON questionnaires_usersubmission_w_path.page_path LIKE CONCAT(apache_user_permission_path.auth_permission_path, '%')/*
  we're joining on the two page paths, which are the same format but signify different things 
  the auth_permission path is the highest level of the hierarchy the user has access to 
  the page_path is the actual hierarchy location of the page where the submission was made 
  adding % to the end of auth_permission_path and joining on LIKE allows for matching all pages BELOW the auth_permission_path in the hierarchy
  unlike '%matchterm%', this is a very quick LIKE operation, according to Wagtail documentation. unfortunately i can't locate that page at the moment
*/

{{ dataset() }} will allow us to reference a physical or virtual dataset within Superset, rather than always having to reference a Database. This would allow us to have all datasets that are created by automated dashboarding reference back to a single dataset which contains all questionnaires and includes the {{ username() }} filter for access limits.

cbunicef commented 2 years ago

Unless we have a separate solution, Jinja is required to complete by-questionnaire permissions in #861