smallrye / smallrye-common

Common utilities for SmallRye
Apache License 2.0
21 stars 24 forks source link

Expression NO_$$ mode #344

Closed radcortez closed 2 weeks ago

radcortez commented 1 month ago

Proposal of a new escape expressions mode (NO_$$) to help with:

Activating NO_$$ will effectively disable the $$ escape used for expressions.

This also allows the removal of the workaround for MP Config: https://github.com/smallrye/smallrye-config/blob/d447b0c346e39a09c6d07c7d5ce8186bd4154037/implementation/src/main/java/io/smallrye/config/ExpressionConfigSourceInterceptor.java#L102-L123

dmlloyd commented 1 month ago

There are some subtle problems with this approach, but before I get into it, I want to ask: is there a reason why we don't change this line in ExpressionConfigSourceInterceptor to add the ESCAPES flag?

        Expression expression = Expression.compile(escapeDollarIfExists(configValue.getValue()), LENIENT_SYNTAX, NO_TRIM,
                NO_SMART_BRACES, DOUBLE_COLON);
radcortez commented 1 month ago

One of the reasons was that it doesn't play well with the comma escape for multiple elements in arrays, like array=cat,dog,${mouse},sea\\,turtle

dmlloyd commented 1 month ago

Sure, I can see that. However it is not possible to unambiguously discriminate in this case, so I think we may just have to "eat it" and require double-escaping for lists in order to truly fix the problem, or else come up with a single expression escaping scheme that also covers the list case (which is much harder than it would seem, since lists of lists are a thing).

radcortez commented 1 month ago

That is something we can explore.

Still, since we mandate the brackets to define the expression, I think the $ escape should not be applied when a random $$ is found in the middle of a nonexpression. From my understanding, the $ escape is there for cases where you have single letter expressions with MINI_EXPRS, but that is not our case, so it feels weird.

On another note, what are the subtle issues with this approach? :)

dmlloyd commented 4 weeks ago

Basically, this is a state machine (a DFA), and each state is well-defined (basically meaning, every character path is evaluated and tested for each state). By adding a boolean flag which sometimes changes the code path, the number of states has doubled. Unless every new state is tested (or proven not to be enterable) then there is a lacking of test coverage.

dmlloyd commented 4 weeks ago

I think you're right about $$ - that should be refactored to be a part of MINI_EXPRS, and having it be separate along with the lack of ESCAPES really causes confusion and problems. It'll break some things but I think the end result will be more consistent and easier to explain.

dmlloyd commented 4 weeks ago

Of course then I'll have to find all the places where I said that $$ is the official correct way to escape $ and update that. 😢

radcortez commented 4 weeks ago

Basically, this is a state machine (a DFA), and each state is well-defined (basically meaning, every character path is evaluated and tested for each state). By adding a boolean flag which sometimes changes the code path, the number of states has doubled. Unless every new state is tested (or proven not to be enterable) then there is a lacking of test coverage.

I added the flag make the checks easiers, but I think it could be writtent without it.

I think you're right about $$ - that should be refactored to be a part of MINI_EXPRS, and having it be separate along with the lack of ESCAPES really causes confusion and problems. It'll break some things but I think the end result will be more consistent and easier to explain.

Ok, so we have two options:

Either we go with the new expression mode and cover our current expected cases

or

We turn on ESCAPES and make $$ escape only for MINI_EXPRS. We know it breaks escaping commas for lists, which probably has a shallow usage (if any). We need to check if there are other cases in which it breaks.

radcortez commented 4 weeks ago

Of course then I'll have to find all the places where I said that $$ is the official correct way to escape $ and update that. 😢

Or we could still use it if $ is immediately followed by a { with an ending }, and ignore it in all the other cases.

radcortez commented 3 weeks ago

I've made a small change to remove the flag.

radcortez commented 2 weeks ago

All good! Thank you for your patience! :)