opendatahub-io-contrib / data-mesh-pattern

Data Mesh Pattern
https://opendatahub-io-contrib.github.io/data-mesh-pattern
Apache License 2.0

LDAP trino group provider #47

Open eformat opened 1 year ago

eformat commented 1 year ago

📝 Description

We need an LDAP group provider for Trino. I don't think there is one in upstream Trino (only the file group provider).

Examples of a Java-based one:

https://github.com/eformat/trino-group-provider-ldap-ad

https://github.com/arghya18/trino-group-provider-ldap-ad

Starburst has this out of the box in the product: https://docs.starburst.io/latest/security/ldap-group-provider.html

Perhaps we should write a Quarkus version for ourselves?

eformat commented 1 year ago

Also, two more implementations that could be worked on:

-- an upstream Trino PR that looks like it basically worked but was abandoned prematurely: https://github.com/trinodb/trino/pull/10116/files

-- the os-climate example as well: https://github.com/os-climate/trino-github-group-provider

eformat commented 1 year ago

WIP - https://github.com/trinodb/trino/pull/17518

caldeirav commented 1 year ago

As discussed with the OS-Climate team in the context of the Data Exchange, centralising permission management with an identity provider like Keycloak is a requirement, for consistency with the API Gateway implementation. How would we integrate the group provider with authentication in Keycloak?

eformat commented 1 year ago

[Diagram: trino-ldap]

This represents the scheme that is configured with this change.

-- Identity comes from LDAP, e.g. using FreeIPA (it could of course be external, e.g. Active Directory over LDAP).

-- OpenShift auth is configured against LDAP, with Group Sync configured and enabled.

-- Keycloak acts as an identity broker to OpenShift (see here) using OIDC/OAuth2.

-- Trino uses Keycloak for web access using OIDC/OAuth2 (brokers to the OpenShift login).

-- Trino uses LDAP directly for API/CLI access.

The group provider in Trino (see PR) syncs groups from LDAP. Trino uses ACLs in the form of a JSON file to control catalog/schema access based on these groups (see here).
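To make the group-based ACL idea concrete, here is a rough Python sketch of how catalog rules keyed on groups could be evaluated. It is not Trino's code. It assumes the rule shape used by Trino's file-based access control (a `catalogs` list whose entries carry optional `group` and `catalog` regexes plus an `allow` value, with the first matching rule winning). The group and catalog names (`data-engineers`, `analysts`, `hive`) and the function `catalog_access` are made up for illustration.

```python
import json
import re

# Hypothetical rules file content; group and catalog names are invented.
RULES_JSON = """
{
  "catalogs": [
    {"group": "data-engineers", "catalog": "hive", "allow": "all"},
    {"group": "analysts", "catalog": "hive", "allow": "read-only"},
    {"catalog": "system", "allow": "none"}
  ]
}
"""


def catalog_access(rules, user_groups, catalog):
    """Return the 'allow' value of the first matching catalog rule, else 'none'."""
    for rule in rules["catalogs"]:
        group_pattern = rule.get("group")
        # A rule without "group" applies to every user; otherwise at least
        # one of the user's groups must fully match the regex.
        if group_pattern is not None and not any(
            re.fullmatch(group_pattern, group) for group in user_groups
        ):
            continue
        # A rule without "catalog" applies to every catalog.
        if not re.fullmatch(rule.get("catalog", ".*"), catalog):
            continue
        return rule["allow"]
    # No rule matched: deny.
    return "none"


rules = json.loads(RULES_JSON)
```

With these rules, a user whose LDAP groups include only `analysts` gets `read-only` on `hive`, while a user with no groups is denied. This is why the set of groups returned by the group provider matters so much.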

Depending on the configuration of both OpenShift group sync and the Trino group provider, you may have the same groups, different groups, or a subset of groups configured. For simplicity, it's likely you want to keep them consistently configured to avoid confusion in your LDAP scheme.
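A quick way to spot that kind of mismatch is to diff the two group lists. The sketch below is only an illustration. It assumes you have already exported the group names from each side by some means, and the function name `compare_group_sets` and the sample group names are hypothetical.

```python
def compare_group_sets(openshift_groups, trino_groups):
    """Report which group names are shared and which exist on one side only."""
    openshift_groups = set(openshift_groups)
    trino_groups = set(trino_groups)
    return {
        "shared": sorted(openshift_groups & trino_groups),
        "only_openshift": sorted(openshift_groups - trino_groups),
        "only_trino": sorted(trino_groups - openshift_groups),
    }


# Hypothetical group names, for illustration only.
report = compare_group_sets(
    ["data-engineers", "analysts", "cluster-admins"],
    ["data-engineers", "analysts", "trino-admins"],
)
```

If `only_openshift` and `only_trino` are both empty, the two sides are configured consistently.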

As a side note, os-climate uses OIDC/OAuth2 from GitHub in place of LDAP for identity. So if LDAP is not for you, that scheme will also work as long as you deploy the custom trino-github-group-provider.