Open drewmiranda-gl opened 1 year ago
I did a bit more digging and tested whether this was caused by the 'Active Directory' authentication service. To test this, I reviewed logs for logon events on the domain controller (`event_code:4624`) and compared logons from the Graylog server when the authentication service was activated or deactivated.
I unfortunately don't have historical logs that go back far enough to establish a baseline, so I can't tell whether that number of individual logon requests is normal/expected or has suddenly increased. I wouldn't expect so many logon requests given that no one is even logging on to Graylog with an AD user (I exclusively use a local Graylog account). It's a bit odd.
After upgrading from Graylog 5.0.2 to 5.0.3, I noticed a steady increase in CPU load on my Graylog server.
It's important to note that (for better or worse) I am ingesting Graylog's own `server.log`.
Full error message
```
2023-02-13T08:45:18.645-06:00 ERROR [ADAuthServiceBackend] ActiveDirectory error
com.unboundid.ldap.sdk.LDAPException: An error occurred while attempting to connect to server server.domain.tld:389: IOException(LDAPException(resultCode=91 (connect error), errorMessage='Unable to establish a connection to server server.domain.tld/x.x.x.x:389 within the configured timeout of 2000 milliseconds.', ldapSDKVersion=5.1.1, revision=580fabe31b0752099ccd9a835fe7da96e8251e28))
	at com.unboundid.ldap.sdk.LDAPConnection.connect(LDAPConnection.java:915) ~[graylog.jar:?]
	at com.unboundid.ldap.sdk.LDAPConnection.connect(LDAPConnection.java:802) ~[graylog.jar:?]
	at com.unboundid.ldap.sdk.LDAPConnection.connect(LDAPConnection.java:740) ~[graylog.jar:?]
	at com.unboundid.ldap.sdk.LDAPConnection.
```
I understand it's normal and expected to see this error under this condition (yes, the server is truly unreachable). However, this message appears about 14,000 times per hour (roughly 233/min, or 3.9/sec), which seems excessive.
Expected Behavior
The above error isn't logged as frequently. Not sure what is reasonable.
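To make "logged less frequently" concrete, here is a minimal sketch of one common approach: emit the first occurrence of a repeated error, suppress repeats for a fixed interval, and report how many were dropped on the next emit. This is not how Graylog's `ADAuthServiceBackend` works or should work; `ThrottledReporter`, its parameters, and the 60-second interval are hypothetical names and values chosen for illustration only.

```python
import time


class ThrottledReporter:
    """Emit at most one message per key per interval; count the rest.

    Hypothetical helper for illustration, not part of Graylog.
    """

    def __init__(self, interval_seconds, clock=time.monotonic):
        self.interval = interval_seconds
        self.clock = clock
        self._last_emit = {}   # key -> clock reading of the last emitted message
        self._suppressed = {}  # key -> messages dropped since the last emit

    def report(self, key, message):
        """Return the line to log, or None if the message is suppressed."""
        now = self.clock()
        last = self._last_emit.get(key)
        if last is not None and now - last < self.interval:
            # Still inside the quiet period: count the message, log nothing.
            self._suppressed[key] = self._suppressed.get(key, 0) + 1
            return None
        dropped = self._suppressed.pop(key, 0)
        self._last_emit[key] = now
        if dropped:
            return f"{message} ({dropped} similar messages suppressed)"
        return message


if __name__ == "__main__":
    # Fake clock readings (seconds) so the example runs instantly.
    ticks = iter([0.0, 0.3, 0.6, 60.0])
    reporter = ThrottledReporter(60, clock=lambda: next(ticks))
    for _ in range(4):
        line = reporter.report("ad-connect", "ActiveDirectory error: connect error")
        if line is not None:
            print(line)
```

With a 60-second interval, the roughly 233 messages per minute reported here would collapse to about one line per minute, and the suppressed count would still show that the failures are ongoing.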
Current Behavior
Log flooded with the above message.
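For anyone wanting to quantify the flooding on their own instance, a rough sketch for counting these errors per hour is below. It assumes each log entry starts with a timestamp in the same format as the message in this report and contains the text `ERROR [ADAuthServiceBackend]`. The function name `errors_per_hour` is hypothetical, and every sample line except the first timestamp is made up for the demonstration.

```python
from collections import Counter


def errors_per_hour(lines, marker="ERROR [ADAuthServiceBackend]"):
    """Count lines containing `marker`, bucketed by the hour in their timestamp."""
    counts = Counter()
    for line in lines:
        if marker in line:
            # Timestamps look like 2023-02-13T08:45:18.645-06:00, so the
            # first 13 characters ("2023-02-13T08") identify the hour.
            counts[line[:13]] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample data; stack trace lines lack the marker and are skipped.
    sample = [
        "2023-02-13T08:45:18.645-06:00 ERROR [ADAuthServiceBackend] ActiveDirectory error",
        "2023-02-13T08:45:18.901-06:00 ERROR [ADAuthServiceBackend] ActiveDirectory error",
        "2023-02-13T09:00:00.120-06:00 ERROR [ADAuthServiceBackend] ActiveDirectory error",
        "\tat com.unboundid.ldap.sdk.LDAPConnection.connect(LDAPConnection.java:915)",
    ]
    for hour, count in sorted(errors_per_hour(sample).items()):
        print(hour, count, f"{count / 3600:.2f}/sec")
```

Because the function accepts any iterable of lines, an open file handle for `server.log` can be passed in directly; the path to that file depends on the installation.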
Possible Solution
I can't be certain this is related to 5.0.3, but I suspect that's when this first started to occur. Not sure if there were any AD backend changes in that release (I don't think there were?).
Steps to Reproduce (for bugs)
Context
I understand this is possibly an edge case, and obviously no one should be configuring an AD backend and then turning off the server. However, there could be times when the backend is truly inaccessible, and this could flood the logs on the Graylog server.
Your Environment
Please let me know if there are any questions. I understand if this is working as intended. Wanted to be certain there isn't anything else going on here that could potentially cause or lead to issues in the future.