Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org
Other
7.32k stars 1.05k forks source link

Server.log flooded with `ActiveDirectory error com.unboundid.ldap.sdk.LDAPException: An error occurred while attempting to connect to server` #14660

Open drewmiranda-gl opened 1 year ago

drewmiranda-gl commented 1 year ago

After upgrading from Graylog 5.0.2 to 5.0.3, i noticed a steady increase in cpu load on my graylog server:

image (19)

Its important to note that (for better or worse) I am ingesting Graylog's own server.log.

Full error message ``` 2023-02-13T08:45:18.645-06:00 ERROR [ADAuthServiceBackend] ActiveDirectory error com.unboundid.ldap.sdk.LDAPException: An error occurred while attempting to connect to server server.domain.tld:389: IOException(LDAPException(resultCode=91 (connect error), errorMessage='Unable to establish a connection to server server.domain.tld/x.x.x.x:389 within the configured timeout of 2000 milliseconds.', ldapSDKVersion=5.1.1, revision=580fabe31b0752099ccd9a835fe7da96e8251e28)) at com.unboundid.ldap.sdk.LDAPConnection.connect(LDAPConnection.java:915) ~[graylog.jar:?] at com.unboundid.ldap.sdk.LDAPConnection.connect(LDAPConnection.java:802) ~[graylog.jar:?] at com.unboundid.ldap.sdk.LDAPConnection.connect(LDAPConnection.java:740) ~[graylog.jar:?] at com.unboundid.ldap.sdk.LDAPConnection.(LDAPConnection.java:560) ~[graylog.jar:?] at com.unboundid.ldap.sdk.SingleServerSet.getConnection(SingleServerSet.java:329) ~[graylog.jar:?] at com.unboundid.ldap.sdk.FailoverServerSet.getConnection(FailoverServerSet.java:688) ~[graylog.jar:?] at com.unboundid.ldap.sdk.FailoverServerSet.getConnection(FailoverServerSet.java:592) ~[graylog.jar:?] at org.graylog.security.authservice.ldap.UnboundLDAPConnector.connect(UnboundLDAPConnector.java:130) ~[graylog.jar:?] at org.graylog.security.authservice.backend.ADAuthServiceBackend.authenticateAndProvision(ADAuthServiceBackend.java:89) [graylog.jar:?] at org.graylog.security.authservice.AuthServiceAuthenticator.authenticate(AuthServiceAuthenticator.java:102) [graylog.jar:?] at org.graylog.security.authservice.AuthServiceAuthenticator.authenticate(AuthServiceAuthenticator.java:65) [graylog.jar:?] at org.graylog2.security.realm.UsernamePasswordRealm.doGetAuthenticationInfo(UsernamePasswordRealm.java:93) [graylog.jar:?] at org.graylog2.security.realm.UsernamePasswordRealm.doGetAuthenticationInfo(UsernamePasswordRealm.java:71) [graylog.jar:?] at org.apache.shiro.realm.AuthenticatingRealm.getAuthenticationInfo(AuthenticatingRealm.java:571) [graylog.jar:?] at org.apache.shiro.authc.pam.ModularRealmAuthenticator.doMultiRealmAuthentication(ModularRealmAuthenticator.java:225) [graylog.jar:?] at org.apache.shiro.authc.pam.ModularRealmAuthenticator.doAuthenticate(ModularRealmAuthenticator.java:275) [graylog.jar:?] at org.apache.shiro.authc.AbstractAuthenticator.authenticate(AbstractAuthenticator.java:198) [graylog.jar:?] at org.apache.shiro.mgt.AuthenticatingSecurityManager.authenticate(AuthenticatingSecurityManager.java:106) [graylog.jar:?] at org.apache.shiro.mgt.DefaultSecurityManager.login(DefaultSecurityManager.java:275) [graylog.jar:?] at org.apache.shiro.subject.support.DelegatingSubject.login(DelegatingSubject.java:260) [graylog.jar:?] at org.graylog2.shared.security.ShiroSecurityContext.loginSubject(ShiroSecurityContext.java:107) [graylog.jar:?] at org.graylog2.shared.security.ShiroAuthenticationFilter.filter(ShiroAuthenticationFilter.java:48) [graylog.jar:?] at org.glassfish.jersey.server.ContainerFilteringStage.apply(ContainerFilteringStage.java:108) [graylog.jar:?] at org.glassfish.jersey.server.ContainerFilteringStage.apply(ContainerFilteringStage.java:44) [graylog.jar:?] at org.glassfish.jersey.process.internal.Stages.process(Stages.java:173) [graylog.jar:?] at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:247) [graylog.jar:?] at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) [graylog.jar:?] at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) [graylog.jar:?] at org.glassfish.jersey.internal.Errors.process(Errors.java:292) [graylog.jar:?] at org.glassfish.jersey.internal.Errors.process(Errors.java:274) [graylog.jar:?] at org.glassfish.jersey.internal.Errors.process(Errors.java:244) [graylog.jar:?] at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265) [graylog.jar:?] at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:234) [graylog.jar:?] at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:684) [graylog.jar:?] at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:356) [graylog.jar:?] at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:200) [graylog.jar:?] at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180) [graylog.jar:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?] at java.lang.Thread.run(Unknown Source) [?:?] Caused by: java.io.IOException: LDAPException(resultCode=91 (connect error), errorMessage='Unable to establish a connection to server server.domain.tld/x.x.x.x:389 within the configured timeout of 2000 milliseconds.', ldapSDKVersion=5.1.1, revision=580fabe31b0752099ccd9a835fe7da96e8251e28) at com.unboundid.ldap.sdk.LDAPConnectionInternals.(LDAPConnectionInternals.java:204) ~[graylog.jar:?] at com.unboundid.ldap.sdk.LDAPConnection.connect(LDAPConnection.java:904) ~[graylog.jar:?] ... 39 more Caused by: com.unboundid.ldap.sdk.LDAPException: Unable to establish a connection to server server.domain.tld/x.x.x.x:389 within the configured timeout of 2000 milliseconds. at com.unboundid.ldap.sdk.ConnectThread.getConnectedSocket(ConnectThread.java:281) ~[graylog.jar:?] at com.unboundid.ldap.sdk.LDAPConnectionInternals.(LDAPConnectionInternals.java:185) ~[graylog.jar:?] at com.unboundid.ldap.sdk.LDAPConnection.connect(LDAPConnection.java:904) ~[graylog.jar:?] ... 39 more ```

I understand its normal and expected to show this error under this condition (yes the server is truly unreachable), however this message appears about ~14,000 times per hour (233/min, 3.8/sec) which seems excessive.

Expected Behavior

The above error isn't logged as frequently. Not sure what is reasonable.

Current Behavior

Log flooded with the above message.

Possible Solution

I can't be certain this is related to 5.0.3 but i suspect thats when this first started to occur. Not sure if there were any AD backend changes in that release (i don't think there were?)

Steps to Reproduce (for bugs)

  1. Create an AD backend
  2. Test/verify
  3. Power off the server

Context

I understand this is possibly an edge case and obviously no one should be configuring an AD backend and then turning off the server. However, there could be times when the back is truly inaccessible and this could flood the logs on the graylog server.

Your Environment

Please let me know if there are any questions. I understand if this is working as intended. Wanted to be certain there isn't anything else going on here that could potentially cause or lead to issues in the future.

drewmiranda-gl commented 1 year ago

I did a bit more digging, i tested to see if this was caused by the 'Active Directory' authentication service. To test this i reviewed logs for logon events on the domain controller (event_code:4624) and compared logons from the graylog server when the authentication service was activated or deactivated.

image

I unfortunately don't have historical logs that go back very far to see what the baseline was and if that amount of individual logon reqs is normal/expected or has suddenly increased. I wouldn't expect there to be so many logon requests given that there is no one even loggong onto graylog with an AD user (i exclusively use a local graylog account). Its a bit odd.