dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Linux: Support tcp keep-alive with openldap #57248

Open RoadTrain opened 3 years ago

RoadTrain commented 3 years ago

Description

On Linux, when RBAC claims resolution is enabled, requests to LDAP fail with "LdapException: The LDAP server is unavailable" after a period of inactivity (~30 minutes).

Configuration

Regression?

Unknown

Other information

Though not definite, our experiments point to the TCP connection being forcibly closed for inactivity by some external mechanism (kernel? load balancer?) as a likely reason. When we implemented a periodic rebind on a timer, the problem went away. There's a TcpKeepAlive option available in LdapSessionOptions, but it seems it's not supported on Linux. OpenLDAP has three options for setting up keep-alive:

LDAP_OPT_X_KEEPALIVE_IDLE
LDAP_OPT_X_KEEPALIVE_PROBES
LDAP_OPT_X_KEEPALIVE_INTERVAL

That would probably do the trick, but further investigation is needed.
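
For reference, here is a minimal sketch of what these options amount to at the socket level. According to the OpenLDAP documentation, they describe the idle time before the first probe, the number of probes, and the interval between probes, which on Linux correspond to the TCP_KEEPIDLE, TCP_KEEPCNT and TCP_KEEPINTVL socket options. The sketch uses a plain Python socket rather than libldap or System.DirectoryServices.Protocols. The helper name enable_keepalive and the timer values are assumptions for illustration, not anything taken from the runtime.

```python
import socket


def enable_keepalive(sock, idle_s=120, interval_s=30, probes=4):
    """Enable TCP keep-alive on sock and tune its timers where the platform allows.

    Hypothetical helper: the name and default values are illustrative only.
    """
    # Turn keep-alive probing on for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Linux exposes these three tunables; other platforms may lack some of
    # them, so each one is looked up defensively before it is set.
    tunables = (
        ("TCP_KEEPIDLE", idle_s),       # seconds idle before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before the connection drops
    )
    for name, value in tunables:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

If an intermediate device drops connections after roughly 30 minutes of silence, an idle time well below that (a couple of minutes here) should keep traffic flowing, assuming the device treats keep-alive probes as activity.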

danmoseley commented 3 years ago

Thanks for the report, @RoadTrain. It may be a while before someone can investigate. Would you be interested in trying those options to see whether they fix it for you?

RoadTrain commented 3 years ago

@danmoseley I think I can try. It might take a while, but I'll keep this issue updated.

As a side note, connectionless LDAP might help as well, need to check with #52904

danmoseley commented 3 years ago

Thank you @RoadTrain. We have seen interesting issues emerge in the S.DS.* libraries in real-world use with various servers, so it is particularly helpful when folks who have such environments, and often better domain knowledge than we have, contribute.

SilvioSodre commented 1 month ago

Hi All!

I think this issue is quite similar to issues 90024, 82430 and 94105. The error "The LDAP server is unavailable" is raised by BeginSendRequest when it tries to rebind an LDAP connection after a short, random time interval (5-30 minutes).

I liked the workaround mentioned by @RoadTrain, but I don't intend to implement it. Surely there is a solution to this issue other than implementing sidecar code to keep the LDAP connection up.

Has anyone solved this error? I have seen many issues on GitHub pointing to the same problem, but none of them has really solved it. Does anyone have a clue how to solve it?