noirello / bonsai

Simple Python 3 module for LDAP, using libldap2 and winldap C libraries.
MIT License

Invalid parsing of hostnames #43

Closed giethmon closed 3 years ago

giethmon commented 3 years ago

Hi, I have found some mistakes in the parsing regex inside LDAPURL.__str2url().

The expression to split URLs is (using re.VERBOSE):

r"""
    ^
    (ldap[s|i]?)                    # (1: scheme)
    ://
    (                               # (2:
        ([^:/?]*)?                  #   (3: host)
        (                           #   (4
            [:]                     #       ':'
            ([1-9][0-9]{0,4})       #       (5: port)
        )?                          #   )?

        |                           #   --OR--

        [\[]?                       #   '['?
        ([^/?\]]*)                  #   (6: host)
        (                           #   (7:
            [\]]                    #       ']'
            [:]                     #       ':'
            ([1-9][0-9]{0,4})       #       (8: port)
        )?                          #   )
    )                               # )
    [/]?
    ([^\]/:?]*)?                    # (9: bind DN)
    [\?]?
    ([^\]:?]*)?                     # (10: attributes)
    [\?]?
    ([^\]:?]*)?                     # (11: scope)
    [\?]?
    ([^\]:?]*)?                     # (12: filter)
    [\?]?
    ([^\]:?]*)?                     # (13: extensions)
    $
"""

So the scheme can be "ldap", "ldaps", "ldap|" or "ldapi", because "|" inside a character class is a literal character. I think you meant (ldap[si]?) instead?
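A minimal check of the scheme group, using only the first part of the pattern quoted above. The names `buggy` and `fixed` are made up for this sketch:

```python
import re

# Scheme group as quoted in the issue: '|' is a literal inside [...]
buggy = re.compile(r"^(ldap[s|i]?)://")
# Suggested correction: only 's' or 'i' may follow "ldap"
fixed = re.compile(r"^(ldap[si]?)://")

for url in ("ldap://host", "ldaps://host", "ldapi://host", "ldap|://host"):
    print(url, bool(buggy.match(url)), bool(fixed.match(url)))
```

Only the last URL is treated differently: the original class accepts it and the corrected one rejects it.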

The server part (group 2) matches, for example, on

    (empty string)
    host
    [host         # Error
    host:99
    host]:99      # Error
    [host]:99
    :99
    ]:99          # Error
    []:99         # Error?

But a bracketed "[host]" without a port number can never be parsed correctly, because the closing bracket is only handled as part of group 7.
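The list above can be reproduced with a small sketch that isolates group 2 from the quoted pattern and tests it with `fullmatch`. Extracting the group on its own is an assumption of this sketch (in the full pattern the trailing groups also take part in matching), and `SERVER_PART` and `server_matches` are hypothetical names:

```python
import re

# Group 2 (the server part) of the quoted pattern, isolated for testing.
SERVER_PART = re.compile(
    r"""
    (
        ([^:/?]*)?                        # host
        ( [:] ([1-9][0-9]{0,4}) )?        # ':' port

        |                                 # --OR--

        [\[]?                             # '['?
        ([^/?\]]*)                        # host
        ( [\]] [:] ([1-9][0-9]{0,4}) )?   # ']' ':' port
    )
    """,
    re.VERBOSE,
)


def server_matches(text):
    """Return True if the whole string is accepted as a server part."""
    return SERVER_PART.fullmatch(text) is not None


candidates = ["", "host", "[host", "host:99", "host]:99",
              "[host]:99", ":99", "]:99", "[]:99", "[::1]:389", "[::1]"]
for text in candidates:
    print(repr(text), server_matches(text))
```

In this sketch every entry from the list is accepted, including the malformed ones, while a bracketed IPv6 literal such as "[::1]" is accepted only when a port follows it.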

Additionally, the separator between the server part (group 2) and a following base DN (group 9) is optional. That can't work: without the "/" there is no way to tell where the host ends and the DN begins.

My suggestion is to use urlparse or urlsplit from urllib.parse; see the urllib.parse documentation ("Split URLs into Components").

To do this you need to extend the following module-level lists in urllib.parse with the ldap schemes {'ldap', 'ldaps', 'ldapi'}:

    urllib.parse.uses_netloc
    urllib.parse.uses_query
    urllib.parse.uses_params    # maybe
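A rough sketch of what parsing with `urlsplit` could look like. This is not the code that went into bonsai; `split_ldap_url` and the dictionary keys are hypothetical names. As far as I can tell, `urlsplit` in current CPython splits the netloc and the query regardless of the scheme, so the sketch leaves the lists above untouched; extending them may still matter for functions such as `urljoin`. The field order after the DN (attributes, scope, filter, extensions) follows the LDAP URL format of RFC 4516.

```python
from urllib.parse import unquote, urlsplit

LDAP_SCHEMES = {"ldap", "ldaps", "ldapi"}


def split_ldap_url(url):
    """Split an LDAP URL into its parts (hypothetical helper, not bonsai's API)."""
    parts = urlsplit(url)  # raises ValueError on unbalanced '[' or ']'
    if parts.scheme not in LDAP_SCHEMES:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # Everything after the first '?' is one "query" for urlsplit;
    # LDAP separates its remaining fields with further '?' characters.
    fields = parts.query.split("?") + [""] * 4
    attributes, scope, filter_, extensions = fields[:4]
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,   # brackets around IPv6 literals are stripped
        "port": parts.port,       # int, or None if absent
        "basedn": unquote(parts.path.lstrip("/")),
        "attributes": [a for a in attributes.split(",") if a],
        "scope": scope,
        "filter": unquote(filter_),
        "extensions": extensions,
    }


info = split_ldap_url("ldap://[::1]:389/dc=example,dc=com?cn,sn?sub?(objectClass=*)")
print(info)
```

With this approach a malformed server part such as "ldap://[host" is rejected by `urlsplit` itself with a `ValueError` instead of being silently accepted.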

Btw, regarding is_valid_hostname: to parse or check raw IP addresses, use ipaddress.ip_address. It returns an address object (IPv4Address or IPv6Address) whose version attribute is 4 or 6 respectively, and it raises ValueError if the string is not a valid IP address.
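For illustration, a small sketch of that check. The helper name `ip_version` is made up here and is not part of bonsai:

```python
import ipaddress


def ip_version(text):
    """Return 4 or 6 for a raw IP address, or None if text is not one."""
    try:
        return ipaddress.ip_address(text).version
    except ValueError:
        # Not an IPv4/IPv6 literal, e.g. a DNS name
        return None


for candidate in ("192.0.2.1", "::1", "example.com"):
    print(candidate, ip_version(candidate))
```

A hostname validator could call this first and fall back to a DNS-name check only when it returns None.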

noirello commented 3 years ago

Thank you, I've no idea why I haven't thought about using urllib.parse before.

noirello commented 3 years ago

Changed in 1.2.1.

giethmon commented 3 years ago

Thanks a lot, and have a good new year!

Best greetings,
Thomas Gierloff

   
