quru / qis

Dynamic image server for web and print
https://quruimageserver.com
GNU Affero General Public License v3.0
89 stars 7 forks source link

Creation of new user accounts from LDAP auth is broken #8

Closed fozcode closed 6 years ago

fozcode commented 6 years ago

The switch to Python 3 and/or the switch from pyldap to python-ldap has caused a regression when creating a new user account via an LDAP login. From the logs:

2018-08-09 12:09:44,514 qis_30863  DEBUG    Authenticating user 'trevor'
2018-08-09 12:09:44,517 qis_30863  DEBUG    Checking LDAP server for unknown user 'trevor'
2018-08-09 12:09:44,545 qis_30863  DEBUG    Identified user 'trevor' on LDAP server, authentication OK, creating new user account
2018-08-09 12:09:44,546 qis_30863  DEBUG    User details: {'gidNumber': [b'1234'], 'cn': [b'Trevor Foo'], 'uid': [b'trevor'], 'givenName': [b'Trevor'], 'uidNumber': [b'1234'], 'sn': [b'Foo'], 'dn': ['uid=trevor,cn=users,cn=accounts,dc=foo,dc=bar,dc=baz']}
2018-08-09 12:09:44,546 qis_30863  ERROR    Error performing login: a bytes-like object is required, not 'str'

The code is trying to process that dictionary as strings whereas the entries now contain bytes (apart from the dn which is a str 🙄). I think python-ldap has a setting for whether it returns strings or bytes so we need to look at that. Alternatively we just need to str() those bytes values in the code.

fozcode commented 6 years ago

python-ldap's bytes_mode is for Python 2 only. In Python 3 the dict with mixed bytes and str (for dn) is actually documented behaviour:

http://www.python-ldap.org/en/latest/bytes_mode.html

What’s text, and what’s bytes

The LDAP protocol states that some fields (distinguished names, relative distinguished names, attribute names, queries) be encoded in UTF-8. In python-ldap, these are represented as text (str on Python 3, unicode on Python 2).

Attribute values, on the other hand, MAY contain any type of data, including text. To know what type of data is represented, python-ldap would need access to the schema, which is not always available (nor always correct). Thus, attribute values are always treated as bytes.

So this looks like a regression from the Python 3 port and needs fixing in code.