Adambean / gitlab-ce-ldap-sync

Synchronise users and groups (including group members) from an LDAP instance with Gitlab CE (and EE in free tier) self-hosted instance(s).
Apache License 2.0

Please support mixing with non-LDAP groups and users #1

Open Natureshadow opened 5 years ago

Natureshadow commented 5 years ago

On EduGit.org, users and groups can come from different sources. One is our LDAP directory, others are omniauth identities and manually created namespaces (groups). It seems that this tool does not expect any users or groups to be there that do not come from the LDAP directory. I do not know whether it just complains loudly but would otherwise work and simply ignore the other users and groups, or whether it would cause other errors if I ran it without the dry run option.

It would be good if this mode of operation could be supported correctly. It works out in general, as GitLab handles duplicate usernames from different sources quite nicely (it adds a number to the namespace if a newly authenticating user would cause the creation of an already existing namespace). It would maybe help if this tool had an option to keep all the synced LDAP groups as subgroups of a defined parent group, to start with.

Adambean commented 5 years ago

Hi @Natureshadow,

Thank you for your interest in this tool.

As you've noticed already this tool is largely designed around LDAP being the primary source of users and groups. I built it this way as it's a fair expectation that if Gitlab is authenticating against LDAP at all it's going to be for reasons along these lines. (It would seem pretty rare to have LDAP authentication used at all in an application broader than for internal use.)

Some fairly large modifications would need to be made to have this tool work in a secondary manner as per your requirements. If you tried to use this tool in the way you like you'd encounter at least the following unexpected destructive behaviour:

Therefore definitely do not use this tool on your Gitlab instances. You can see exactly what the tool would do if it were not in dry run mode -d by also using very verbose mode -vv.

Getting this tool to work in the less destructive way you suggested would pose some challenges.

The main one is that this tool matches users solely by the LDAP user object's "uid" attribute (customisable) against a Gitlab user's username, not the Gitlab user's namespace. This is because there is not enough information available for this tool to know that this would be a mistake. LDAP user objects don't typically have a namespace in the "slug" style Gitlab has, they typically work with the UID attribute, CN attribute, or full object DN. (In your case you could easily find that existing users get linked to different LDAP users unintentionally.)
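The matching problem described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the tool's actual code; the `GitlabUser` class, its `source` field, and the `match_ldap_to_gitlab` function are hypothetical names, and exact string comparison is an assumption.

```python
from dataclasses import dataclass


@dataclass
class GitlabUser:
    id: int
    username: str
    source: str  # hypothetical label: where the account originally came from


def match_ldap_to_gitlab(ldap_uids, gitlab_users):
    """Link each LDAP uid to the Gitlab user whose username is identical.

    Nothing here can tell whether the Gitlab account was created from LDAP,
    so any account that happens to share the name is treated as a match.
    """
    by_username = {user.username: user for user in gitlab_users}
    return {uid: by_username[uid] for uid in ldap_uids if uid in by_username}


gitlab_users = [
    GitlabUser(1, "jsmith", "omniauth"),  # signed up independently of LDAP
    GitlabUser(2, "adoe", "ldap"),
]
ldap_uids = ["jsmith", "adoe", "bnew"]

matches = match_ldap_to_gitlab(ldap_uids, gitlab_users)
unmatched = [uid for uid in ldap_uids if uid not in matches]

for uid, user in matches.items():
    print(f"LDAP uid {uid!r} -> Gitlab user #{user.id} (source: {user.source})")
print("Would be created as new Gitlab users:", unmatched)
```

In this example the LDAP user "jsmith" is linked to an omniauth account that merely shares the name, which is the kind of unintended link a mixed instance could produce.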

This is the same reason that if an LDAP user's "uid" is changed, a new user would be made on Gitlab rather than the intended existing user being renamed accordingly. This couldn't be worked around unless there was an attribute added to the LDAP schema to keep track of the Gitlab user object ID, so the tool would have something permanent to match with. -- Obviously when you rename the LDAP user their UID and DN both change, so the linked external ID on the Gitlab user becomes useless. Namespaces can also be changed via Gitlab admin, making that also unsuitable to match with.

I appreciate your use case, but honestly it's quite beyond the scope of what I have enough free time for. However, if someone else wanted to fork the project to implement it, I'd happily accept a well built merge, provided the original purpose and mode of operation were preserved in the same (or exceeding) quality as now. (Secondary mode would have to be opt-in rather than the default behaviour.)

willmmiles commented 5 years ago

Hi,

I'm taking a stab at this for my company's GitLab server, although I can't promise it'll solve all of @Natureshadow's use case. So far I'm aiming at the following approaches:

WIP is here: https://github.com/willmmiles/gitlab-ce-ldap-sync/tree/partial-sync

Any suggestions or feedback would be greatly appreciated - if you think any of this would be of general utility, I'm happy to submit a pull request.

BOW-el commented 5 years ago

Suggestion: Expanding the tool so that userNamesToIgnore and groupNamesToIgnore accept regular expressions could probably be done fairly easily. If your non-LDAP and your LDAP users and groups can each be uniquely separated using a regex, the tool might well work in such side-by-side configurations... @Adambean, what do you think?

Adambean commented 5 years ago

Regex shouldn't be hard to implement. I would have suggested that any strings in the array wrapped with / characters could be checked with preg_match() instead of a simple comparison, but / is a valid character for DNs and UIDs. (/ specifically doesn't even require escaping by \.)

I'd therefore suggest that if we're to implement Regex ignores they should be spun off to userNamesToIgnoreMatching and groupNamesToIgnoreMatching to contain an array of Regex pattern strings. -- These strings must be 100% compatible with PHP's preg_match() string $pattern parameter, so I'd expect the tool user to include / either side with the optional case-insensitive indicator "i" at the end.
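A rough sketch of how such separate pattern lists might be checked follows. It is written in Python purely for illustration, since the tool itself would call PHP's preg_match() directly. The helper names `compile_php_style` and `is_ignored` are hypothetical, only the "i" modifier is handled, and exact comparison for the plain list is an assumption.

```python
import re


def compile_php_style(pattern: str) -> re.Pattern:
    """Turn a '/body/flags' string into a compiled regex.

    Only '/' delimiters and the 'i' modifier are supported in this sketch.
    """
    if not pattern.startswith("/"):
        raise ValueError(f"pattern must start with '/': {pattern!r}")
    # Split on the LAST '/', so an escaped '\/' inside the body survives.
    body, sep, flags = pattern[1:].rpartition("/")
    if not sep:
        raise ValueError(f"pattern has no closing '/': {pattern!r}")
    unsupported = set(flags) - {"i"}
    if unsupported:
        raise ValueError(f"unsupported modifiers: {sorted(unsupported)}")
    return re.compile(body, re.IGNORECASE if "i" in flags else 0)


def is_ignored(name, exact_names, patterns):
    """True if name is listed exactly or matches any compiled pattern."""
    if name in exact_names:
        return True
    # search() is unanchored, like preg_match(); anchor with ^ and $ if needed.
    return any(p.search(name) for p in patterns)


user_names_to_ignore = ["root"]                          # existing exact list
user_names_to_ignore_matching = ["/^svc-/i", "/-bot$/"]  # proposed regex list
compiled = [compile_php_style(p) for p in user_names_to_ignore_matching]

for name in ["root", "SVC-backup", "deploy-bot", "jsmith"]:
    print(name, is_ignored(name, user_names_to_ignore, compiled))
```

Keeping the patterns in their own list means a plain name containing "/" in the existing list is never mistaken for a regex.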

brgsousa commented 1 year ago

I would also like that feature to be implemented. Just like the parameter "groupNamesToIgnore", there should be an option like "syncOnlyTheseGroupNames". With this option, the tool would start from the named groups in GitLab and then only look up their members on the LDAP server. No other group in GitLab would be checked.
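To make the idea concrete, here is a minimal sketch of how such an allow-list might drive the sync. It is written in Python purely for illustration; "syncOnlyTheseGroupNames" is only a proposed option, the `plan_group_sync` function and the dictionary-based data are hypothetical, and whether members missing from LDAP should be removed is an open design question that the sketch merely reports.

```python
def plan_group_sync(gitlab_groups, ldap_members, sync_only):
    """Plan membership changes for allow-listed groups only.

    gitlab_groups: dict of group name -> set of current member usernames
    ldap_members:  dict of group name -> set of member uids found in LDAP
    sync_only:     list of group names the sync is allowed to touch
    """
    plan = {}
    for name in sync_only:
        # Start from GitLab: a listed group that does not exist there is skipped.
        if name not in gitlab_groups:
            continue
        # Only then consult LDAP; with no LDAP counterpart, leave it untouched.
        if name not in ldap_members:
            continue
        current = gitlab_groups[name]
        wanted = ldap_members[name]
        plan[name] = {
            "add": sorted(wanted - current),
            "remove": sorted(current - wanted),
        }
    return plan


gitlab_groups = {
    "developers": {"adoe"},
    "ops": {"root"},
    "community": {"guest1"},  # manually created, not in LDAP
}
ldap_members = {
    "developers": {"adoe", "bnew"},
    "ops": {"cops"},
}

print(plan_group_sync(gitlab_groups, ldap_members, ["developers"]))
```

Groups that are not in the allow-list, such as "ops" and "community" here, never appear in the plan, so they would not be modified.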