cagov / caldata-mdsa-caltrans-pems

CalData's MDSA project with Caltrans on Performance Measurement System (PeMS) data
https://cagov.github.io/caldata-mdsa-caltrans-pems/
MIT License
3 stars 0 forks source link

Read-Only Service Account(s) for BI Tools #300

Closed kengodleskidot closed 1 month ago

kengodleskidot commented 1 month ago

@ian-r-rose and @britt-allen as we get ready to start connecting BI tools (PowerBI and ArcGIS) to data in Snowflake and an S3 bucket, what is the best way to manage the accounts that access these data sources? In the past we have created read-only service accounts that staff could use to connect to data using BI tools. This reduced the need to manage individual user accounts and the need to update credentials assocaited with various data connections when staff moved around. Are there best practices we should follow regarding the management of accounts used to access data for reporting and dashboard tools?

ian-r-rose commented 1 month ago

Great question @kengodleskidot.

We've been discussing two different access patterns for the data here: S3 and Snowflake. For S3, we should be able to use direct URLs without authentication, because we intend for that bucket to be public.

For Snowflake, it's typically considered best practices for users to use their own Snowflake credentials if they have them. For users who don't have Snowflake credentials, yes, we usually use service accounts. The pattern for data access in Snowflake is already baked in to our project architecture: the idea is that BI-ready marts are built in the ANALYTICS database (rather than TRANSFORM, where we have been building our intermediate models). We already have a REPORTER role that has read but not write access to tables in the ANALYTICS database.

I'll give a tour of this architecture for everyone on Thursday, but you can read about it more detail in some of our team's public-facing docs.

jkarpen commented 1 month ago

Closing this as the discussion is complete.