apache / logging-log4j2

Apache Log4j 2 is a versatile, feature-rich, efficient logging API and backend for Java.
https://logging.apache.org/log4j/2.x/
Apache License 2.0
3.35k stars 1.59k forks source link

PatternLayout unsuitable for production purposes #2518

Open marcelhoelscher opened 4 months ago

marcelhoelscher commented 4 months ago

Hello,

I came across the following statement on the Log4J security page (https://logging.apache.org/security.html#reporting):

When using an unstructured layout such as PatternLayout, no guarantees can be made about the output format. This layout is mainly useful for development purposes and should not be relied on in production applications. For example, if a log message contains new lines, these are not escaped or encoded specially unless the configured pattern uses the %encode{pattern}{CRLF} wrapper pattern converter (which will encode a carriage return as the string \rand a line feed as the string \n) or some other %encode option. Note that %xEx is appended to the pattern unless already present. Similarly, other encoding options are available for other formats, but pattern layouts cannot make assumptions about the entire output. As such, when using unstructured layouts, no user-controlled input should be included in logs. It is strongly recommended that a structured layout (e.g., JsonTemplateLayout) is used instead for these situations. Note that StrLookup plugins (those referenced by ${…​} templates in configuration files) that contain user-provided input should not be referenced by layouts.

This text suggests that PatternLayout in Log4j should not be utilized in production environments. This information was both surprising and new to myself and my team members.

Beyond the aforementioned link, there appears to be no indication within the official Log4J documentation that PatternLayout is unsuitable for production use. In particular, such a warning is not present where a developer would typically look for information regarding PatternLayout (for example, here: https://logging.apache.org/log4j/2.x/manual/layouts.html).

Additionally, the Log4j documentation for PatternLayout asserts that log injection attacks can be mitigated by employing the wrapper pattern converter "%enc{%m}{CRLF}" within PatternLayout. However, it fails to mention that this mechanism does not work under all circumstances.

Consequently, developers may opt to use PatternLayout in production, under the false assumption that they are safeguarded against log injection scenarios by using %enc{%m}{CRLF}, even when this protection is not guaranteed.

Could you please address and document these aspects of PatternLayout more explicitly to prevent any potential misunderstandings?

Kind regards

ppkarwasz commented 4 months ago

Hi @marcelhoelscher,

Sure, we should probably change the wording on the security page:

Even a simple:

<PatternLayout pattern="%d [%t] %-5p %c - %m%n"/>

can contain a CR or LF character basically everywhere:

You can configure PatternLayout to output one log event per line using:

<PatternLayout pattern="%enc{%d [%t] %-5p %c - %m%notEmpty{%n%xEx}}%n"
               alwaysWriteExceptions="false"/>

but splitting the line into its components is extremely error prone.

marcelhoelscher commented 4 months ago

Hey @ppkarwasz,

thanks for your reply!

We're not concerned about developers putting CR or LF into log messages, we're concerned about attackers who (for example) manipulate query parameters in URLs.

So i think the relevant places in the log pattern are the message and the stacktrace. The message is safe when using the %encode{%m}{CRLF} conversion pattern. But the stacktraces are problematic.

I understand, that it is no option to simply escape any CR / LF because the stacktrace then gets unreadable and there is no way to differentiate between "desired" and "undesired" control characters. But developers searching for information about this topic may think that it's sufficient to wrap the log message into the conversion pattern. It may be not very obvious, that there still may be undesired control characters in the stacktraces. It may be worth mentioning, that there also exists the implicit %xEx-Pattern that may contain control characters.