Incorrect handling for LDAP responses with Search Reference Results.

franciscaestecker commented 3 weeks ago

The recent change where response handling is based on the length of the LDAP responses, doesn't take into account alternative responses, such as Search Reference Result (https://docs.ldap.com/ldap-sdk/docs/javadoc/com/unboundid/ldap/sdk/SearchResultReference.html)

For example, I have an LDAP server that always returns a SearchRefResult as part of the response. So if no user is found, the response is length 1. If it does find a username, the response is length 2.

I currently don't have a work around for this. This worked fine in previous versions.

manics commented 3 weeks ago

Please can you turn on debug logging in JupyterHub and show us your full logs?

Can you also give us some example responses from your LDAP server including the SearchRefResult?

franciscaestecker commented 3 weeks ago

Bit struggling to, as I don't have access to GitHub in the environment that shows the error. Will see what I can do.

franciscaestecker-bb commented 3 weeks ago

Code to replicate

import asyncio
import getpass

from traitlets.config import Config
from ldapauthenticator import LDAPAuthenticator

c = Config()

c.LDAPAuthenticator.server_address = "***"
c.LDAPAuthenticator.tls_strategy="before_bind"

c.LDAPAuthenticator.user_search_base = "dc=***,dc=***,dc=***"
c.LDAPAuthenticator.user_attribute = 'sAMAccountName'

c.LDAPAuthenticator.lookup_dn = True
c.LDAPAuthenticator.lookup_dn_user_dn_attribute = 'sAMAccountName'
c.LDAPAuthenticator.lookup_dn_search_user = '***'
c.LDAPAuthenticator.lookup_dn_search_password = '***'

# Run test

authenticator = LDAPAuthenticator(config=c)
username = input("Username: ")
password = getpass.getpass()
data = dict(username=username, password=password)
return_value = asyncio.run(authenticator.authenticate(None, data))

Logs

2024-10-30 13:25:23,668 [DEBUG] Attempting to bind ***
2024-10-30 13:25:24,686 [DEBUG] Successfully bound ***
2024-10-30 13:25:24,686 [DEBUG] Looking up user with:
    search_base = 'dc=***,dc=***,dc=***'
    search_filter = '(sAMAccountName=USERNAME)'
    attributes = '[sAMAccountName]'
2024-10-30 13:25:29,677 [WARNING] Failed to lookup a unique DN for username 'USERNAME'

LDAP search_filter response

[{'raw_dn': b'CN=Foo, Bar,OU=Users,OU=***,OU=***,OU=***,DC=***,DC=***,DC=***',
  'dn': 'CN=Foo, Bar,OU=Users,OU=***,OU=***,OU=***,DC=***,DC=***,DC=***',
  'raw_attributes': {'sAMAccountName': [b'USERNAME']}
  'attributes': {},
  'type': 'searchResEntry'},
 {'uri': ['ldap://***/DC=***,DC=***,DC=***,DC=***'],
  'type': 'searchResRef'}]

manics commented 3 weeks ago

Thanks. The error is generated here https://github.com/jupyterhub/ldapauthenticator/blob/7638a6e65754ee6103586c9dd5696b27ba730c9b/ldapauthenticator/ldapauthenticator.py#L464-L467

so we can probably add a configurable filter to select a response from the list:

this could be a fully configurable hook/callback i.e. response = user_defined_hook(responses) where you define user_defined_hook yourself
If there's always going to be a fixed attribute value to identify the required item (in your case 'type': 'searchResEntry') we could maybe have a dict of fields that we match against?

consideRatio commented 3 weeks ago

This is https://github.com/jupyterhub/ldapauthenticator/issues/292#issuecomment-2447175706 but formatted, it could be that this formatting is disabled because it was an email response comment?

Code to replicate

import asyncio
import getpass

from traitlets.config import Config
from ldapauthenticator import LDAPAuthenticator

c = Config()

c.LDAPAuthenticator.server_address = "***"
c.LDAPAuthenticator.tls_strategy="before_bind"

c.LDAPAuthenticator.user_search_base = "dc=***,dc=***,dc=***"
c.LDAPAuthenticator.user_attribute = 'sAMAccountName'

c.LDAPAuthenticator.lookup_dn = True
c.LDAPAuthenticator.lookup_dn_user_dn_attribute = 'sAMAccountName'
c.LDAPAuthenticator.lookup_dn_search_user = '***'
c.LDAPAuthenticator.lookup_dn_search_password = '***'

# Run test

authenticator = LDAPAuthenticator(config=c)
username = input("Username: ")
password = getpass.getpass()
data = dict(username=username, password=password)
return_value = asyncio.run(authenticator.authenticate(None, data))

Logs

2024-10-30 13:25:23,668 [DEBUG] Attempting to bind ***
2024-10-30 13:25:24,686 [DEBUG] Successfully bound ***
2024-10-30 13:25:24,686 [DEBUG] Looking up user with:
    search_base = 'dc=***,dc=***,dc=***'
    search_filter = '(sAMAccountName=USERNAME)'
    attributes = '[sAMAccountName]'
2024-10-30 13:25:29,677 [WARNING] Failed to lookup a unique DN for username 'USERNAME'

LDAP search_filter response

[{'raw_dn': b'CN=Foo, Bar,OU=Users,OU=***,OU=***,OU=***,DC=***,DC=***,DC=***',
  'dn': 'CN=Foo, Bar,OU=Users,OU=***,OU=***,OU=***,DC=***,DC=***,DC=***',
  'raw_attributes': {'sAMAccountName': [b'USERNAME']}
  'attributes': {},
  'type': 'searchResEntry'},
 {'uri': ['ldap://***/DC=***,DC=***,DC=***,DC=***'],
  'type': 'searchResRef'}]

consideRatio commented 3 weeks ago

I think currently we can at least say this project is incorrectly concluding a failure to find a unique DN, because here is an example not about that but about getting multiple types of response entries where one isn't even providing a dn key because its about something else.

I found some details about this in:

Official LDAP docs: https://ldap.com/ldapv3-wire-protocol-reference-search/
A Python library we use: https://ldap3.readthedocs.io/en/latest/searches.html?highlight=searchresref#response

I figure we can look for type, and discard all searchResRef entries for example as one incremental improvement.

problem_causing_response = [
    {
        'raw_dn': b'CN=Foo, Bar,OU=Users,OU=***,OU=***,OU=***,DC=***,DC=***,DC=***',
        'dn': 'CN=Foo, Bar,OU=Users,OU=***,OU=***,OU=***,DC=***,DC=***,DC=***',
        'raw_attributes': { 'sAMAccountName': [b'USERNAME'] },
        'attributes': { 'sAMAccountName': 'USERNAME' },
        'type': 'searchResEntry',
    },
    {
        'uri': ['ldap://***/DC=***,DC=***,DC=***,DC=***'],
        'type': 'searchResRef',
    },
]

test_server_response = [
    {
        'raw_dn': b'cn=Philip J. Fry,ou=people,dc=planetexpress,dc=com',
        'dn': 'cn=Philip J. Fry,ou=people,dc=planetexpress,dc=com',
        'raw_attributes': {
            'cn': [b'Philip J. Fry'],
        },
        'attributes': {'cn': ['Philip J. Fry']},
        'type': 'searchResEntry',
    }
]

consideRatio commented 3 weeks ago

@franciscaestecker looking at the response you got, I see it doesn't include entries under attributes, was that correct? If so, filterng out searchResRef etc will just lead to another error because we will later require that attributes has the key self.lookup_dn_user_dn_attribute (sAMAccountName in your case) within it.

response[0]["attributes"][self.lookup_dn_user_dn_attribute]

franciscaestecker-bb commented 3 weeks ago

Apologies, this is a more accurate response. Commenting the if statements allows it to work for me.

problem_causing_response = [
    {
        'raw_dn': b'CN=Foo, Bar,OU=Users,OU=***,OU=***,OU=***,DC=***,DC=***,DC=***',
        'dn': 'CN=Foo, Bar,OU=Users,OU=***,OU=***,OU=***,DC=***,DC=***,DC=***',
        'raw_attributes': { 'sAMAccountName': [b'USERNAME'] },
        'attributes': { 'sAMAccountName': 'USERNAME' },
        'type': 'searchResEntry',
    },
    {
        'uri': ['ldap://***/DC=***,DC=***,DC=***,DC=***'],
        'type': 'searchResRef',
    },
]

consideRatio commented 3 weeks ago

I think a clear improvement is to filter the responses to those with type = searchResEntry, like that, it should work and avoid situations with multiple "result entry" responses which implies multiple people where one is picked somewhat arbitrarily.

manics commented 3 weeks ago

I'm more concerned about other types of responses that we have to filter/exclude, which is why I suggested something configurable by the admin.

However if we know filtering by type = searchResEntry will always work then it's reasonable to hard-code it with it being configurable.

franciscaestecker-bb commented 3 weeks ago

Ldap3 docs give these possible results:

searchResEntry searchResRef searchResDone intermediateResponse extendResp

Source: https://ldap3.readthedocs.io/en/latest/searches.html (under Response)

franciscaestecker-bb commented 3 weeks ago

Behaviour prior to change assumed first result was valid and had the body structure of a searchResEntry.

I think it's safe to change the check to only check for a unique searchResEntry.

This should achieve the additional desired validation.

consideRatio commented 3 weeks ago

I opened https://github.com/jupyterhub/ldapauthenticator/pull/294 about this for now, looking to work it to completion but I'm not sure exactly on what I think is a suitable path yet.

consideRatio commented 3 weeks ago

@franciscaestecker-bb are you able to adjust the code and try things? I'm curious about if things resolve if you adjust this line of code:

https://github.com/jupyterhub/ldapauthenticator/blob/0af43b87e37d2fd5d4b504bac729bfb617980393/ldapauthenticator/ldapauthenticator.py#L458

To be...

response = conn.entries

jupyterhub / ldapauthenticator