cannatag / ldap3

a strictly RFC 4510 conforming LDAP V3 pure Python client. The same codebase works with Python 2. Python 3, PyPy and PyPy3
Other
878 stars 271 forks source link

Remove server from pool in case of TLS error. #771

Open cornelinux opened 4 years ago

cornelinux commented 4 years ago

Problem

We configure a server pool to connect to Microsoft AD servers. We are using startTLS. One of the servers for some reason had a missing certificate. Well, it could also be an expired certificate.

However, this one server - unfortunately it is the first in the pool - creates an TLS error. The complete LDAP connection stops with startTLS failed - unavailable. Although the other servers in the pool are fine and could be queried.

It would be great, if a TLS Error of one server in the pool would simply remove this server from the pool.

Analysis

To my understanding it happens here: https://github.com/cannatag/ldap3/blob/master/ldap3/core/tls.py#L268 An LDAPStartTLSError is raised.

Unfortunately I did not find the location in ldap3/core/connection.py, yet, where it bails out.

Proposal

To my understanding at some points servers are removed from the pool, when they are not responsive. I suggest that at this point we could also catch the LDAPStartTLSError and remove the server from the pool. This way the other servers, that do still have a working TLS config, would still answer the request.

A Warning in the logs would be fine, but a complete bail out is problematic for us. A server can have TLS issues when the certificate expires, when the certificate is missing (imagine a new DC, that did not pull a certificate, yet), when for some reasons the DNS fails... It would be OK for me, to have this configurable, so that the old behaviour is not changed.

cannatag commented 4 years ago

Hi Cornelius, your analysis is correct, I think the behavior you propose should be the default. It makes sense to remove the faulty server from the pool if the connection can’t be established due to some TLS errors.

Will work on this issue in the next release of ldap3.

Let me know if this is an urgent issue for you, because I’ve slowed down the pace of the releases because the ldap3 library is quite stable now.

Bye, Giovanni

Il giorno 12 dic 2019, alle ore 07:54, Cornelius Kölbel notifications@github.com ha scritto:

 Problem

We configure a server pool to connect to Microsoft AD servers. We are using startTLS. One of the servers for some reason had a missing certificate. Well, it could also be an expired certificate.

However, this one server - unfortunately it is the first in the pool - creates an TLS error. The complete LDAP connection stops with startTLS failed - unavailable. Although the other servers in the pool are fine and could be queried.

It would be great, if a TLS Error of one server in the pool would simply remove this server from the pool.

Analysis

To my understanding it happens here: https://github.com/cannatag/ldap3/blob/master/ldap3/core/tls.py#L268 An LDAPStartTLSError is raised.

Unfortunately I did not find the location in ldap3/core/connection.py, yet, where it bails out.

Proposal

To my understanding at some points servers are removed from the pool, when they are not responsive. I suggest that at this point we could also catch the LDAPStartTLSError and remove the server from the pool. This way the other servers, that do still have a working TLS config, would still answer the request.

A Warning in the logs would be fine, but a complete bail out is problematic for us. A server can have TLS issues when the certificate expires, when the certificate is missing (imagine a new DC, that did not pull a certificate, yet), when for some reasons the DNS fails... It would be OK for me, to have this configurable, so that the old behaviour is not changed.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.

cornelinux commented 4 years ago

Hi Giovanni, thanks for the feedback. I think it is not that pressing. The problem with the one installation has been resolved for now. But it could return. Well, basically it should not happen if people took care for their certificates. Unfortunately they not always do.

It is good to know, that you are willing to address it. How is currently your release cycle? If I can assist at any point, I will happily do so. Regards, Cornelius

cannatag commented 4 years ago

Basically the release cycle is based on bug fixing, I think this project is quite mature and I don’t expect any new major feature to be included. The library fits my needs (my main work is on a quite large Identity and Access Management system). I know it lacks features needed for AD, but I’m quite lucky for not working with Microsoft Active Directory :).

In the main phase of development (up to version 2) I had an online labs with ldap servers from various vendors (including Microsoft) for automatic testing, but I dismissed the infrastructure because of its costs. Now I just use some VM when testing (without AD). What I really lack is a lab for testing with Active Directory, but this is a low priority to me, even if I understand that it is widely used. So if you could provide testing on AD this could really help.

Bye, Giovanni

Il giorno 12 dic 2019, alle ore 22:33, Cornelius Kölbel notifications@github.com ha scritto:

 Hi Giovanni, thanks for the feedback. I think it is not that pressing. The problem with the one installation has been resolved for now. But it could return. Well, basically it should not happen if people took care for their certificates. Unfortunately they not always do.

It is good to know, that you are willing to address it. How is currently your release cycle? If I can assist at any point, I will happily do so. Regards, Cornelius

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

cornelinux commented 4 years ago

We are running our system in installations with AD currently with ldap3 version 2.6. We can also test this with development/master branch, when you tell me what to do or how to run (the unittests in tests/?) with an AD setup.