Dotted names in CDI conflicts with EL specification

jeanouii commented 1 year ago

CDI relies on EL but at the same time goes against what's the EL specification says. For instance, CDI makes it possible to have names with "." (dot) but it's forbidden in EL because the dot is used to navigate the tree and find sub nodes.

An implementation now has to create a bean manager based Resolver that has specific rules for CDI beans. The trick is that the EL Context can be populated but other means than just CDI, so "a.b" can lead to ambiguous resolutions.

Shouldn't we get ride of this edge case in CDI only?

rmannibucau commented 1 year ago

+1, this part never worked in the EE ecosystem and does not bring anything to CDI alone so better to deprecate and prune on the long run IMHO.

Azquelt commented 1 year ago

I'm not familiar with EL so I did some reading to try to understand this one. It does look like the EL spec says a.b is treated as a["b"] and then has rules for evaluating a[b]. I thought therefore that resolving a.b would result in trying to resolve a which would return null so the whole expression would evaluate to null or PropertyNotFoundException.

Nevertheless, I see we have a TCK test for resolving a name with dots in it. Can someone explain why that's expected to work?

jeanouii commented 1 year ago

https://jakarta.ee/specifications/cdi/4.0/jakarta-cdi-spec-4.0.html#names

manovotn commented 1 year ago

Notably, CDI bean names are not always linked to EL (but can be). As stated in https://jakarta.ee/specifications/cdi/4.0/jakarta-cdi-spec-4.0.html#bean_discovery_steps_full:

A bean may have a bean name. A bean with a name may be referred to by its name when used in a non typesafe environment (like the Unified Expression Language). A valid bean name is a period-separated list of valid EL identifiers.

Which points out that each part of the list is a valid EL identifier, but the whole string as such isn't said to be an EL identifier. We should probably look into how exactly we pass these expressions to EL.

jeanouii commented 1 year ago

https://jakarta.ee/specifications/expression-language/5.0/jakarta-expression-language-spec-5.0.html#evaluating-identifiers --> short but may help in the discussion

Maybe pulling in someone from EL specification would be helpful

Azquelt commented 1 year ago

Following on from the discussion on the call, we also noted 2.4.3.1. Ambiguous names where it says:

Suppose two beans are both available for injection in a certain module, and either:

the two beans have the same bean name and the name is not resolvable, or

the bean name of one bean is of the form x.y, where y is a valid bean name, and x is the bean name of the other bean,

the container automatically detects the problem and treats it as a deployment problem.

Coupled with this part from above:

A valid bean name is a period-separated list of valid EL identifiers.

It sounds like the intention is that if a bean is registered with the name magic.golden.fish then when the name is looked up as an EL expression:

the CDI ELResolver should be called to resolve "magic" and return some object (let's call it objMagic),
then called again to resolve the name "golden" with the base objMagic and return another object (objGolden),
then called again to resolve the name "fish" with the base objGolden and return the reference to the bean.

That would mean that the bean resolves as the TCK test expects it to, but also that if some other technology wants to make something available in EL with the name magic, then it would conflict with a bean named magic.golden.fish.

Ladicek commented 1 year ago

I'm no EL expert, but it seems to me that the EL specification actually allows this. The relevant evaluation mechanisms all have multiple options how something can be evaluated, and these are ordered. See e.g. here for simple identifiers or here for the . operator (as well as the [] operator). There's typically some "built-in" way of resolving something, and if that doesn't succeed, then an ELResolver is supposed to be consulted. The EL specification even directly says this about the ELResolver:

The EL API provides a generalized mechanism, an ELResolver, implemented by the underlying technology and which defines the rules that govern the resolution of model object names and their associated properties.

The CDI specification says that CDI implementations must provide an EL resolver that follows the CDI name resolution rules. It seems to me that this EL resolver just has to understand that bean names may be "hierarchical", and resolving an identifier may result in a "namespace" (in Weld parlance) for the "inner" components of a bean name, or in a bean itself for the last component.

I don't think this itself leads to conflicts any more than any other possible conflicts. For example, a variable named a may exist, in which case evaluating an EL expression ${a} will result in the value of variable a without consulting an ELResolver, even if it is capable of resolving a.

rmannibucau commented 1 year ago

@Ladicek think the key point is even if there is a SPI the integration points should be consistent and there CDI uses a hack assuming it will be alone as other ELResolver than the built in ones. Overall EL enables to name without dots beans and use dots to go down in the object graph, anything else is broken at some point until using a forbidden namespace (we could say jakarta is but it is not today)

manovotn commented 1 year ago

@Ladicek think the key point is even if there is a SPI the integration points should be consistent and there CDI uses a hack assuming it will be alone as other ELResolver than the built in ones.

From my understanding of the discussion we had during the meeting, it's not quite as easy. The same issue as Jean-Louis described might occur even without CDI in place as there can be >1 ELResolvers knowing how to resolve any given name foo . Whichever gets asked to resolve it first will win meaning the ordering can matter even now and without CDI in place. Basically what Ladislav describes in his last paragraph.

rmannibucau commented 1 year ago

@manovotn this is true but also why CDI is wrong, the dot has a meaning which is hierarchical and allowing dots in names breaks that design which is likely why dotted names should be deprecated in ELResolver ASAP and pruned following jakarta rules. The key difference is resolving an instance and navigating through it. Conflicting on names is resolved by the order but here it is not that, it is about conflicting between bean instances and subgraphs without enabling the user to control anything in the order (platform level).

That said, don't think pruning would be a huge deal due to the JSF usafe which is without dot in most cases and the pretty much not usage outside so maybe making EL integration an EE platform thing and no more a CDI itself integration can be an option too.

mkouba commented 1 year ago

I'd like to point out that bean name != EL value expression.

It's a period-separated list of valid EL identifiers; which implies that the provided ELResolver must be able to resolve some hierarchical structure, e.g. Weld has a notion of "namespaces". In other words, if magic.golden.fish represents an EL value expression then the ELResolver must be aware of the fact, that magic represents an object that resolves the identifier golden to an object that resolves the identifier fish (I believe that this is the same output as https://github.com/jakartaee/cdi/issues/669#issuecomment-1521954109).

For the names magic.golden.fish and magic.golden.dog we would need something like:

└─ magic
      └─ golden
             └─ fish
             └─ dog

So the object resolved for golden needs to know how to resolve fish and dog and return the corresponding contextual instances.

BTW The spec is clear that if there are two beans where the first has the name magic.golden and the other has name magic.golden.fish then a deployment problem occurs.

Furthermore, the bean name can be used in other non typesafe environments where a dot in the name does not need to be relevant at all.

rmannibucau commented 1 year ago

@mkouba and if magic.golden has a property fish spec is clear that it must work (EL) but it will not behave as expected when you will resolve your produced string named magic.golden.fish.

So current status is:

EL + CDI -> does not work together reliably on a spec standpoint
CDI names + anything else: no issue

So I think we should mention that CDI EL resolution of dotted name is not portable and can lead to unpredictable behavior (don't think the namespace impl is better than the opposite so let's keep it unspecified and mention it is better to use not dotted names for EL).

mkouba commented 1 year ago

and if magic.golden has a property fish spec is clear that it must work (EL) but it will not behave as expected when you will resolve your produced string named magic.golden.fish.

Not sure what you mean. It should work. If there are beans magic.golden and magic.golden.fish => deployment problem. If there is only magic.golden bean and this bean has a property fish it will be handled by a different ELResolver (BeanELResolver most probably).

In any case, I do agree that using dots in a bean name is not a good idea and should be avoided.

rmannibucau commented 1 year ago

Hmm, I can understand the intent but not as the spec is written today @mkouba I think:

the bean name of one bean is of the form x.y, where y is a valid bean name, and x is the bean name of the other bean

so means it would be wrongly written and intend would be "y is a valid name" - but valid against which rule, here again it is quite ambiguous - cause currently it means x and y are bean names which is not previous example.

anyway, even this rule does not make sense cause outside EL there is no reason to not allow it so I think we should revise the naming, either forbid dots or have a mapping in EL integration (like supporting escaping or any specific marker CDI could interpret as being a name, even using a function instead of using the name directly would be more robust : jakarta-cdi.lookup("my.bean.name")).

Ladicek commented 1 year ago

EL + CDI -> does not work together reliably on a spec standpoint

But that's just wrong. The CDI specification demands that CDI implementations provide an ELResolver that follows the CDI rules. Among others, this includes the ability to use hierarchical bean names. If there's a CDI implementation that doesn't provide the ELResolver that supports hierarchical bean names properly, that's a fault of this implementation, not the specification.

I wasn't there when CDI 1.0 was written, but I believe it's fairly safe to assume that the people who were there actually knew what they were doing.

(For the record, I'm not a huge fan of how bean names work, and hierarchical bean names are just weird. That doesn't mean there's something wrong with the specification.)

rmannibucau commented 1 year ago

@Ladicek you lost me, @mkouba just said the intent is to not support hierarchical names (even if I agree this is not written like that in the spec) and the fact it does not work together is not about CDI ELResolver but the EL integration with CDI. Don't get me wrong, CDI ELResolver used alone is fine but used with other ELResolver (including common ones) it is not as explained before.

Was already the case in CDI 1.0 but got worse in 2 (or 1.1 no more sure) with the javax conversation naming which got some misintegration with other javax names (jakarta now).

I'm sure we can add validation rules which would enable it to work rephrasing clearly the spec but then we would also impact users so if we do I think we should maybe rethink it differently.

Side note: this still means the related code and TCK are not valid.

mkouba commented 1 year ago

@mkouba just said the intent is to not support hierarchical names

FTR I did not say this, I merely said that I think that dots should not be used in bean names (my personal opinion, has nothing to do with the spec rules and EG intentions).

rmannibucau commented 1 year ago

ok ok, the "x.y" example says it then (but I didn't find it in the spec so clearly).

jakartaee / cdi

Dotted names in CDI conflicts with EL specification #669