amundsen-io / amundsen

Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
https://www.amundsen.io/amundsen/
Apache License 2.0

Make a connection to Active Directory to get user details #35

Closed verdan closed 1 year ago

verdan commented 5 years ago

To get user details from Active Directory (and not from Neo4j), we need a connection to AD and the corresponding configuration. At the moment, user details are tightly coupled to the Neo4j proxy.

feng-tao commented 5 years ago

Hey @verdan, what is Active Directory? Currently Lyft has an internal service that exposes user details. We built an extractor (in a private repo, since it calls that internal service's API) and use this model (https://github.com/lyft/amundsendatabuilder/blob/master/databuilder/models/user.py) to push user entity information into Neo4j.

verdan commented 5 years ago

@feng-tao all of our users, groups, etc. are stored in AD, which we connect to through LDAP. We're planning NOT to store users' information in Atlas directly, and will use AD as the source of truth for user information. This is what we are aiming for:

Would love to hear your thoughts as well.

feng-tao commented 5 years ago

cc @jinhyukchang. I don't think I have a good idea of how to support this. Ideally, Amundsen should have as few external dependencies as possible (hence we pull user metadata into Neo4j at Lyft).

verdan commented 5 years ago

@feng-tao the plan is to have pluggable support for AD, so people have the option to enable it if they want.

amitasthana commented 4 years ago

I have to work on LDAP integration in Amundsen. Can someone suggest how to proceed, with a rough idea of what I need in order to set up LDAP with Amundsen? The goal is that, within an organisation, different permissions and permission groups can be set and roles assigned to users, so that users who log in see the categorised data in Amundsen and can access only the data they are authorised for.

verdan commented 4 years ago

@amitasthana we are not using LDAP for access control at the moment, and will use Apache Ranger for that. However, I have implemented the LDAP connection to get user details from AD. You can use the same approach to inject user groups, etc. into the user details, and then use those groups/policies to fine-tune access. There is still a lot of work to do in this domain, I'd say.

This is the method I implemented to get details from LDAP: https://github.com/lyft/amundsenmetadatalibrary/blob/master/docs/configurations.md#user_detail_method-optional
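To illustrate the idea of injecting groups into the user details, here is a minimal sketch, assuming the directory returns group membership in the `memberOf` attribute as a list of group DNs (as AD does). The helper names `group_names` and `with_groups` and the `groups` key are hypothetical, not part of Amundsen, and the User model may not accept extra fields without changes. The DN handling is deliberately naive: it only copes with backslash-escaped commas.

```python
import re

# Split on commas that are not escaped with a backslash.
_UNESCAPED_COMMA = re.compile(r'(?<!\\),')


def group_names(member_of):
    """Return the CN of each group DN in an LDAP memberOf value list."""
    names = []
    for raw_dn in member_of:
        # python-ldap returns attribute values as bytes; accept str too.
        dn = raw_dn.decode('utf-8') if isinstance(raw_dn, bytes) else raw_dn
        first_rdn = _UNESCAPED_COMMA.split(dn, maxsplit=1)[0]
        key, _, value = first_rdn.partition('=')
        if key.strip().upper() == 'CN' and value:
            names.append(value.replace('\\,', ','))
    return names


def with_groups(user_details, ldap_attrs):
    """Return a copy of user_details with a hypothetical 'groups' field added."""
    enriched = dict(user_details)
    enriched['groups'] = group_names(ldap_attrs.get('memberOf', []))
    return enriched
```

The resulting group list could then be matched against whatever policy source is in use (Apache Ranger in the setup described above); that part is not shown here.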

ibnipun10 commented 4 years ago

@feng-tao: Ideally, user information is kept in a separate database to support single sign-on. My company has its users in Office 365 and would love to have OAuth enabled in Amundsen so that users can sign in with their Office 365 accounts. This way, if a user leaves the org, they can no longer log in to the catalog either.

tiago-cruz-movile commented 3 years ago

Do you have an example of how to configure LDAP authentication in config.py?

I mean, after setting USER_DETAIL_METHOD = get_user_details, where should I put the connection string, user, port, etc.?

verdan commented 3 years ago

@tiago-cruz-movile the way I did it was to define all of these things in the get_user_details method.

So my config.py was something like this:


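import os


def get_user_details(user_id):
    import ldap
    # Credentials come from the environment rather than being hard-coded.
    ldap_user = os.environ.get('LDAP_USER')
    ldap_password = os.environ.get('LDAP_PASSWORD')

    connection = ldap.initialize(...)  # arguments elided in the original
    ...  # remainder elided in the original


class Config:
    USER_DETAIL_METHOD = get_user_details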
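For readers who want to see the elided parts filled in, the following is a minimal sketch of one way such a method could look with python-ldap. It is not the code used above. Assumptions: the environment variables `LDAP_URI` and `LDAP_BASE_DN` are hypothetical names, `user_id` is taken to be the AD `sAMAccountName`, the returned keys are a guess at the fields the User model expects, and the helper `ldap_entry_to_user` is introduced only for illustration.

```python
import os


def ldap_entry_to_user(user_id, attrs):
    """Map raw LDAP attributes (name -> list of bytes) to a flat dict."""
    def first(name):
        values = attrs.get(name) or [b'']
        return values[0].decode('utf-8')

    # Key names are assumptions about the User model's fields.
    return {
        'user_id': user_id,
        'email': first('mail'),
        'first_name': first('givenName'),
        'last_name': first('sn'),
        'full_name': first('displayName'),
    }


def get_user_details(user_id):
    # Imported lazily so the config module loads without python-ldap installed.
    import ldap
    from ldap.filter import escape_filter_chars

    # LDAP_URI and LDAP_BASE_DN are hypothetical variable names.
    connection = ldap.initialize(os.environ['LDAP_URI'])
    connection.simple_bind_s(os.environ.get('LDAP_USER'),
                             os.environ.get('LDAP_PASSWORD'))
    try:
        results = connection.search_s(
            os.environ['LDAP_BASE_DN'],
            ldap.SCOPE_SUBTREE,
            # Assumes user_id is the sAMAccountName; escape it for the filter.
            '(sAMAccountName={})'.format(escape_filter_chars(user_id)),
            ['mail', 'givenName', 'sn', 'displayName'],
        )
    finally:
        connection.unbind_s()

    # AD can return referral entries whose DN is None; skip those.
    entries = [attrs for dn, attrs in results if dn is not None]
    return ldap_entry_to_user(user_id, entries[0]) if entries else {}


class Config:
    USER_DETAIL_METHOD = get_user_details
```

Keeping the attribute mapping in a separate function makes it testable without a directory server; the connection string, bind user, and base DN all stay in environment variables rather than in config.py itself.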

Agustin1913 commented 3 years ago

Hi guys, do we have any updates on AD or AAD integration?

We are currently working on an Ubuntu machine where Amundsen is deployed using Docker. The Ubuntu server is joined to the domain (AD on Windows Server 2019). Do you know how we can integrate it?