open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
https://open-metadata.org
Apache License 2.0

Enable reading global profiler settings not only by admins and bots #17042

Open mgorsk1 opened 1 month ago

mgorsk1 commented 1 month ago

Is your feature request related to a problem? Please describe.

We have a feature on our platform that lets users execute profiling jobs themselves. We provide a curated workflow template; users update the table and schema names and execute ad-hoc jobs for their tables. This worked fine until we upgraded to OM 1.4.0.0, where https://github.com/open-metadata/OpenMetadata/pull/15889 was introduced. Now regular users (not admins or bots) cannot execute profiling jobs because they get a 403 error when fetching the global profiler config.

Describe the solution you'd like

Requesting the global profiler config (GET /api/v1/system/settings/profilerConfiguration) should not be restricted to admins and bots. Proposed approaches:

What's particularly interesting about the aforementioned implementation is that the authorizeAdminOrBot method is used only once throughout the whole OM service: in said endpoint.
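For anyone who wants to reproduce the behaviour from a notebook, here is a minimal sketch of the call in question. Only the endpoint path comes from this issue; the helper names (`build_profiler_config_request`, `fetch_status`), the base URL, and the use of a Bearer `Authorization` header for the personal JWT are assumptions for illustration, not the OM client's actual API.

```python
import urllib.error
import urllib.request

# Endpoint path as quoted in this issue; everything else below is assumed.
PROFILER_CONFIG_PATH = "/api/v1/system/settings/profilerConfiguration"


def build_profiler_config_request(base_url: str, jwt_token: str) -> urllib.request.Request:
    """Build a GET request for the global profiler config (no network I/O)."""
    url = base_url.rstrip("/") + PROFILER_CONFIG_PATH
    return urllib.request.Request(
        url,
        headers={
            # Assumption: the personal JWT is sent as a Bearer token.
            "Authorization": f"Bearer {jwt_token}",
            "Accept": "application/json",
        },
        method="GET",
    )


def fetch_status(request: urllib.request.Request, opener=urllib.request.urlopen) -> int:
    """Send the request and return the HTTP status code.

    HTTP error responses (for example 403) are returned as codes
    instead of being raised, so the caller can branch on them.
    """
    try:
        with opener(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `fetch_status(build_profiler_config_request(base_url, token))` with a regular user's token, where `base_url` is a placeholder for your own OM instance, should surface the 403 described above if the endpoint is still restricted to admins and bots.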

Describe alternatives you've considered

We group permissions using Teams/Groups: we have Teams A, B, and C, assign users to their respective teams, and then assign a team as the DatabaseSchema owner. For example, the schema transactions from the Trino service is owned by team A, and all members of team A can edit metadata for tables in the transactions schema. We therefore considered extending OpenMetadata with scoped bot users, so that we could create bot X as a member of team A; this bot would inherit the team's permissions but would be treated as a bot. See https://github.com/open-metadata/OpenMetadata/issues/15891.

Additional context

We follow the shift-left paradigm: instead of running profiling jobs within OM Airflow, we allow users to run them with their personal credentials and at their own desired cadence. This is very important security-wise, because our OM instance's service connections cannot use accounts (NPAs) that have access to actual data, so moving this responsibility to end users is our only way to get profiling data into OM.

mgorsk1 commented 1 month ago

cc @TeddyCr as you've been working on the global profiler config

TeddyCr commented 1 month ago

Hello @mgorsk1, thanks for the detail. Could you share a bit more about the below? I would love to understand this flow a bit better.

We have a feature on our platform that lets users execute profiling jobs themselves. We provide a curated workflow template; users update the table and schema names and execute ad-hoc jobs for their tables.

mgorsk1 commented 1 month ago

Sure, we enable our users to run OM profiling workflows themselves from within their personal Jupyter notebooks. In that case they connect to the appropriate systems using their personal credentials and authenticate against the OM API using their personal JWT tokens.

TeddyCr commented 1 month ago

OK, I see what you mean. Let's take a look. I'll mark it as 1.6, but we'll try to tackle it for a minor 1.5.x release.