apache / logging-log4j2

Apache Log4j 2 is a versatile, feature-rich, efficient logging API and backend for Java.
https://logging.apache.org/log4j/2.x/
Apache License 2.0
3.37k stars 1.61k forks source link

Reload of key/trustsore when re-establishing a connection #3074

Open MichaelMorrisEst opened 1 week ago

MichaelMorrisEst commented 1 week ago

https://github.com/apache/logging-log4j2/pull/2767 introduces functionality to enable reloading key/trustore when the certs are renewed. However a manual step of triggering a reconfiguration (e.g. by touching the config file) is needed for the key/trust store to be reloaded. While this is a big improvement on having no reload, it is still not ideal to have to trigger a reconfiguration.

The cert renewal has no impact on existing established connections (as the handshake is done when the connection is established) so there is no need for the key/trust store to be reloaded for existing connections to continue working. However, when an error occurs in writing to the socket a retry is attempted which includes the creation of a new socket and connection. Using a no longer valid cert here will prohibit the connection being re-established. If, during the retry, the key/truststore are reloaded, then the latest certs would always be used in re-establishing the connection and would effectively remove the need to trigger the reconfiguration.

Is this something the community would be open accepting a PR on? If so I can work on it and submit

vy commented 1 week ago

If, during the retry, the key/truststore are reloaded, then the latest certs would always be used in re-establishing the connection and would effectively remove the need to trigger the reconfiguration.

@MichaelMorrisEst, this is something crossed my mind while working on #2767, but I am not sure about the implications of trying to reload key/trust store on every new socket creation attempt. [Thinking out loud here...] If the server has indeed gained a new identity, reloading key/trust stores will solve the problem swiftly. But what if the failure is due to something else? If we decide to implement this, shall this be the default, or opt-in via a new attribute?

Another concern I have regarding this approach is we will be introducing an exception to the reconfiguration logic. That is, when a Log4j configuration is loaded, it is treated more or less immutable – obviously, except LoggerConfigs. Now we will introduce a "please reload only this little configuration element" exception. I am not saying this is something bad per se. But exceptions generally bite us in the long run.

@ppkarwasz, any thoughts?

MichaelMorrisEst commented 1 week ago

Hi @vy, thanks for the feedback

@MichaelMorrisEst, this is something crossed my mind while working on #2767, but I am not sure about the implications of trying to reload key/trust store on every new socket creation attempt. [Thinking out loud here...] If the server has indeed gained a new identity, reloading key/trust stores will solve the problem swiftly. But what if the failure is due to something else? If we decide to implement this, shall this be the default, or opt-in via a new attribute?

If reloading at every re-connection is a concern then a slightly different approach could be to do the re-connection as today (i.e. don't reload the stores) but catch SSLHandshakeException and, when encountered then do the reload and try again. This would limit the reloading to only the relevant scenario of a handshake problem. I have no strong opinion on whether it should be default behaviour or opt in, and am happy to go with whichever the community prefers (if there is agreement on proceeding)

Another concern I have regarding this approach is we will be introducing an exception to the reconfiguration logic. That is, when a Log4j configuration is loaded, it is treated more or less immutable – obviously, except LoggerConfigs. Now we will introduce a "please reload only this little configuration element" exception. I am not saying this is something bad per se. But exceptions generally bite us in the long run.

The key/truststore are already not handled the same as other configuration elements though (as changes to them do not trigger a reconfig). As they are excluded from the normal reconfiguration logic, then the only way to support changes to them is through something other than the normal reconfiguration logic.