apache / logging-log4j2

Apache Log4j 2 is a versatile, feature-rich, efficient logging API and backend for Java.
https://logging.apache.org/log4j/2.x/
Apache License 2.0
3.33k stars 1.58k forks source link

`StrSubstitutor` cannot parse nested braces #2679

Open vy opened 2 months ago

vy commented 2 months ago

As reported in #2666,

<PatternLayout pattern="${env:TZ:-%d{yyyy-MM-dd HH:mm:ss}{GMT+00}}"/>

leaves behind a trailing } while formatting date. That is, StrSubstitutor doesn't correctly parse nested braces.

ppkarwasz commented 1 month ago

To be more precise, the StrLookup parses the string ${env:TZ:-%d{yyyy-MM-dd HH:mm:ss}{GMT+00}} as:

  1. A lookup expression (${env:TZ:-%d{yyyy-MM-dd HH:mm:ss}), where env:TZ is the key used for the lookup and %d{yyyy-MM-dd HH:mm:ss} is the default value.
  2. The literal {GMT+00}}.

The main problem here is to have a coherent set of escape rules for lookup expressions and pattern expressions. Right now we have the following escaping rules:

This resembles more the way cmd.exe parses a command line than the simpler bash expansion.

ngocnhan-tran1996 commented 1 month ago

Hi @vy , @ppkarwasz

I have a PR but I think this has more edge case. Please take a look and tell me if I miss anything

https://github.com/apache/logging-log4j2/pull/2712

ppkarwasz commented 1 month ago

Looking at bash parameter expansion, I would propose to just introduce a simple escape mechanism that allows users to escape the closing } as \}. The initial example would then need to be written as:

<PatternLayout pattern="${env:TZ:-%d{yyyy-MM-dd HH:mm:ss\}{GMT+00\}}"/>

This does not make the escaping rules coherent, but at least they are simpler:

Using \ as a generic escaping mechanism is not possible for backward compatibility, since the default replacement could be a Windows path.

Note: Currently StrLookup allows nested lookups anywhere between ${ and the closing }. I would prefer to limit them to the default value part. So ${foo:-${bar}} would be allowed, but ${${foo}${bar}} wouldn't. I really don't see a reason why the names to lookup should be compose by other lookups.

ppkarwasz commented 1 month ago

After discussing in with @vy on Slack, I believe we should coherently use $ as escape character, with the following rules:

  1. The escape character $ only applies to the $, :, - and } characters.
  2. An escaped character loses its special meaning. So:
    • $${...} is not expanded,
    • ${sys$:-foo} and ${sys:$-foo} are replaced with the value of the system property -foo,
    • ${foo:-$}} is expanded as } if foo is not defined.
ngocnhan-tran1996 commented 1 month ago

I am looking forward to log4j2 new version, so many things have to done :D