noirello / bonsai

Simple Python 3 module for LDAP, using libldap2 and winldap C libraries.
MIT License

Tolerating DNs with whitespace #52

Closed dreness closed 2 years ago

dreness commented 3 years ago

Hi,

I'm willing to believe the assertion that ldapdn.py only deals in valid DNs, and the unstated implication that DNs with whitespace in them may not be technically valid. However, some records at my site have this affliction, and they seem to work fine with (almost) all other LDAP tooling I've used. I don't know if it's worth adding an option to permit this in bonsai, but in case anybody else finds themselves in this situation, one naive solution is to remove spaces from strdn at the top of LDAPDN's __init__ method in ldapdn.py, e.g.:


      def __init__(self, strdn: str) -> None:
+         strdn = strdn.replace(" ", "")
          if strdn != "" and not self._dnregex.match(strdn):
              raise InvalidDN(strdn)
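
Note that strdn.replace(" ", "") also removes spaces inside attribute values (for example cn=John Smith). A more conservative sketch, kept outside the library, drops only the spaces that directly follow an unescaped comma. The helper name strip_spaces_after_commas is hypothetical and not part of bonsai, and it does not handle the legacy double-quoted value form.

```python
def strip_spaces_after_commas(dn: str) -> str:
    """Remove spaces that directly follow an unescaped comma in a DN string.

    Hypothetical helper (not part of bonsai). Backslash escape pairs such as
    "\\," or "\\ " are copied through untouched.
    """
    out = []
    i = 0
    n = len(dn)
    while i < n:
        ch = dn[i]
        if ch == "\\" and i + 1 < n:
            # Keep the backslash and the character it escapes together.
            out.append(dn[i:i + 2])
            i += 2
        elif ch == ",":
            out.append(ch)
            i += 1
            # Skip the spaces between the separator and the next RDN.
            while i < n and dn[i] == " ":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(strip_spaces_after_commas("Id=56585, ou=things, o=company"))
# Id=56585,ou=things,o=company
```

The cleaned string could then be passed to LDAPDN without patching ldapdn.py.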

noirello commented 3 years ago

Hi, could you give me some examples of DNs that bonsai's LDAPDN considers invalid?

dreness commented 3 years ago

Hi - sure, a sanitized example would be:

Id=56585, ou=things, o=company

All of my records with spaces in the DN have them in this same form: after the commas between the DN components.

noirello commented 3 years ago

Well, if I interpret RFC 4514 (and RFC 4512) correctly, then anything between the comma and the equals sign is an attribute type, which should start with a letter ([A-Za-z]).

That said, a fix to accept DNs with spaces should be an easy one:

-   _attrtype = r"[A-Za-z][\w-]*|\d+(?:\.\d+)*"
+   _attrtype = r"[A-Za-z ][\w-]*|\d+(?:\.\d+)*"
    _attrvalue = r'#(?:[\dA-Fa-f]{2})+|(?:[^,=\+<>#;\\"]|\\[,=\+<>#;\\" ]' r'|\\[\dA-Fa-f]{2})*|"(?:[^\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*"'

But I don't know, this patch could have some unforeseen consequences.
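
One way to probe for such consequences is to build both variants of the pattern and try them on a few strings. The sketch below copies _attrtype and _attrvalue from the diff above; how the pieces are combined into a full DN pattern (build_dn_regex) is an assumption made for illustration and may differ from what ldapdn.py actually does.

```python
import re

# Attribute value pattern, copied from the diff above.
ATTR_VALUE = (
    r'#(?:[\dA-Fa-f]{2})+|(?:[^,=\+<>#;\\"]|\\[,=\+<>#;\\" ]'
    r'|\\[\dA-Fa-f]{2})*|"(?:[^\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*"'
)
ATTR_TYPE_OLD = r"[A-Za-z][\w-]*|\d+(?:\.\d+)*"
ATTR_TYPE_NEW = r"[A-Za-z ][\w-]*|\d+(?:\.\d+)*"


def build_dn_regex(attr_type: str) -> re.Pattern:
    # Assumed composition, not taken from bonsai:
    # RDNs separated by ',', type=value pairs within an RDN separated by '+'.
    atv = rf"(?:{attr_type})=(?:{ATTR_VALUE})"
    rdn = rf"{atv}(?:\+{atv})*"
    return re.compile(rf"{rdn}(?:,{rdn})*\Z")


old_dn = build_dn_regex(ATTR_TYPE_OLD)
new_dn = build_dn_regex(ATTR_TYPE_NEW)
spaced = "Id=56585, ou=things, o=company"

print(bool(old_dn.match(spaced)))                  # False
print(bool(new_dn.match(spaced)))                  # True
print(bool(new_dn.match(" Id=56585,ou=things")))   # True: a leading space is accepted too
print(bool(new_dn.match("Id=56585,  ou=things")))  # False: two spaces are still rejected
```

Under this assumed composition, the patched pattern accepts exactly one space in front of any attribute type, including the first one in the DN.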

Could you tell me how you get these DNs (e.g. from a simple OpenLDAP server)? I'm just trying to understand whether there's anything unusual in your environment.

dreness commented 3 years ago

> Well, if I interpret RFC 4514 (and RFC 4512) correctly, then anything between the comma and the equals sign is an attribute type, which should start with a letter ([A-Za-z]).

I agree with your reading of the RFCs, and indeed these 'extra' surrounding spaces aren't part of relativeDistinguishedName. They can't be: a space isn't valid at the start of an attribute type, as per above, nor unescaped at the end of an attribute value, according to RFC 4514 section 2.4, which requires the following to be escaped in an attribute value:

 - a space (' ' U+0020) or number sign ('#' U+0023) occurring at the beginning of the string;
 - a space (' ' U+0020) character occurring at the end of the string;
 - one of the characters '"', '+', ',', ';', '<', '>',  or '\' (U+0022, U+002B, U+002C, U+003B, U+003C, U+003E, or U+005C, respectively);
 - the null (U+0000) character
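
The rules listed above can be sketched as a small escaping function. This is an illustration only: escape_attr_value is a hypothetical name, not a bonsai API, and it always uses the backslash-plus-character form rather than hex pairs (except for the null character).

```python
def escape_attr_value(value: str) -> str:
    """Escape an attribute value following the four rules listed above."""
    always_escaped = '"+,;<>\\'
    last = len(value) - 1
    out = []
    for i, ch in enumerate(value):
        if ch in always_escaped:
            out.append("\\" + ch)
        elif ch == "\x00":
            # The null character is written as a hex pair.
            out.append("\\00")
        elif ch == " " and (i == 0 or i == last):
            # Leading or trailing space.
            out.append("\\ ")
        elif ch == "#" and i == 0:
            # Leading number sign.
            out.append("\\#")
        else:
            out.append(ch)
    return "".join(out)


print(escape_attr_value("Doe, John"))  # Doe\, John
print(escape_attr_value(" padded "))   # \ padded\ 
```

An unescaped space next to a comma therefore cannot belong to a value produced this way, which is why it can only be separator padding.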

The spec for composing a DN from a sequence of RDNs is equally clear: the RDN pairs are delimited by COMMA.

      distinguishedName = [ relativeDistinguishedName
          *( COMMA relativeDistinguishedName ) ]
      relativeDistinguishedName = attributeTypeAndValue
          *( PLUS attributeTypeAndValue )
      attributeTypeAndValue = attributeType EQUALS attributeValue
      attributeType = descr / numericoid
      attributeValue = string / hexstring
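
As a rough illustration of this grammar, the sketch below splits a DN on unescaped COMMA, then on unescaped PLUS, then at the first EQUALS. It tolerates spaces after a comma by stripping them from the start of each RDN. The names split_unescaped and parse_dn are hypothetical, the code does not validate attribute types or values, and spaces before a comma are left in the value.

```python
from typing import List, Tuple


def split_unescaped(text: str, sep: str) -> List[str]:
    """Split text on sep, ignoring any sep that is preceded by a backslash escape."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # Copy the escape pair as-is so an escaped separator is not split on.
            current.append(text[i:i + 2])
            i += 2
        elif ch == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def parse_dn(dn: str) -> List[List[Tuple[str, str]]]:
    """Return one list of (attributeType, attributeValue) pairs per RDN."""
    if dn == "":
        return []
    rdns = []
    for rdn in split_unescaped(dn, ","):
        pairs = []
        # Tolerate padding after the comma by dropping leading spaces.
        for atv in split_unescaped(rdn.lstrip(" "), "+"):
            attr_type, equals, attr_value = atv.partition("=")
            if not equals:
                raise ValueError(f"missing '=' in {atv!r}")
            pairs.append((attr_type, attr_value))
        rdns.append(pairs)
    return rdns


print(parse_dn("Id=56585, ou=things, o=company"))
# [[('Id', '56585')], [('ou', 'things')], [('o', 'company')]]
```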

I would summarize the rationale for tolerating whitespace on either side of the delimiter in a DN as follows (note: my site has DNs with either zero or one space before or after the comma):

Thanks for the consideration :)

noirello commented 2 years ago

Released 1.3.0; spaces after a comma are now accepted in DNs.