eclipse-passage / passage

Define and control license checks and usage constraints for OSGi / RCP / IDE
https://www.eclipse.org/passage/
Eclipse Public License 2.0
6 stars 8 forks source link

passage.loc-based product fails on startup #1364

Closed HannesWell closed 1 month ago

HannesWell commented 1 month ago

In the current master build starting an Eclipse application that contains the bundle org.eclipse.passage.loc.licenses.core or org.eclipse.passage.loc.products.core fails with the error below and the LicenseDomainRegistry, ProductDomainRegistry and ProductOperatorServiceImpl cannot be obtained as OSGi service. As the error indicates, their DS component description files need to reference scr-v1.3.0 instead of only 1.1.0.

org.osgi.service.component.ComponentException: Component org.eclipse.passage.loc.internal.licenses.core.LicenseDomainRegistry validation failed: Field reference requires DS >= 1.3
    at org.apache.felix.scr.impl.metadata.ComponentMetadata.validationFailure(ComponentMetadata.java:1101)
    at org.apache.felix.scr.impl.metadata.ReferenceMetadata.validate(ReferenceMetadata.java:729)
    at org.apache.felix.scr.impl.metadata.ComponentMetadata.validate(ComponentMetadata.java:994)
    at org.apache.felix.scr.impl.BundleComponentActivator.validateAndRegister(BundleComponentActivator.java:445)
    at org.apache.felix.scr.impl.BundleComponentActivator.loadDescriptor(BundleComponentActivator.java:407)
    at org.apache.felix.scr.impl.BundleComponentActivator.initialize(BundleComponentActivator.java:283)
    at org.apache.felix.scr.impl.BundleComponentActivator.<init>(BundleComponentActivator.java:218)
    at org.apache.felix.scr.impl.Activator.loadComponents(Activator.java:612)
    at org.apache.felix.scr.impl.Activator.access$200(Activator.java:75)
    at org.apache.felix.scr.impl.Activator$ScrExtension.start(Activator.java:480)
    at org.apache.felix.scr.impl.AbstractExtender.createExtension(AbstractExtender.java:196)
    at org.apache.felix.scr.impl.AbstractExtender.modifiedBundle(AbstractExtender.java:169)
    at org.apache.felix.scr.impl.AbstractExtender.addingBundle(AbstractExtender.java:139)
    at org.apache.felix.scr.impl.AbstractExtender.addingBundle(AbstractExtender.java:49)
    at org.osgi.util.tracker.BundleTracker$Tracked.customizerAdding(BundleTracker.java:477)
...
eparovyshnaya commented 1 month ago

Indeed, when Passage Licensing Operator starts, it reports

org.osgi.service.component.ComponentException: Component org.eclipse.passage.loc.internal.licenses.core.LicenseDomainRegistry validation failed: Field reference requires DS >= 1.3

and

org.osgi.service.component.ComponentException: Component org.eclipse.passage.loc.internal.products.core.ProductOperatorServiceImpl validation failed: Field reference requires DS >= 1.3

which causes majour disfunction of the product.

Moving these two components' xml declarations xmlns:scr namespace to v1.3.0 indeed fixes the failure. That was quite an illustrative experiment, which leaves us with several open questions, which must be addressed before proper solution can be proposed:

Investigation is in progress.

HannesWell commented 1 month ago
  • what exactly causes these failures

The change from https://github.com/eclipse-passage/passage/pull/1356 e.g. in ProductOperatorServiceImpl requires SCR 1.3 because the Field Strategy or Field Injection, i.e. using @Reference at fields, described in chapter 112.3.3 is only available since 1.3 of the OSGi DS spec (which has a different version than the overall OSGi release):

In version DS version 1.2 chapter 112.3 References to Services does not support that: https://docs.osgi.org/download/r5/osgi.cmpn-5.0.0.pdf

  • why only these two components failed to load

I assume @Reference is not on fields used anywhere else.

Btw. if you contemplate to revert the change mentioned above, my suggestion would be to just continue with #1362 to fix the SCR files since 1.3 is supported by the Felix SCR (which is used by Eclipse) since 2015: https://issues.apache.org/jira/browse/FELIX-4406 So requiring SRC 1.3 isn't a strong requirement.

On the long run I would furthermore continue with https://github.com/eclipse-passage/passage/issues/1365 to avoid such synchronization errors in total in the future.

ruspl-afed commented 1 month ago

Thank you for the detailed explanation @HannesWell I don't have such a detailed knowledge of OSGi spec and that's why it took me a lot longer to figure out why this happened.

For now I prefer to keep v1.1.0 for all the manifests for consistency, may be later we will switch to a less verbose v1.3.0

All such synchronization errors were introduced by me manually, because I was scared by the diff that has no "activate" method anymore. And I was not alone, there are many developers in many countries who were scared by this as well.

Moreover I do not think that removing OSGi component manifests is what I would like to have, I agree with @castortech : these are still sources.

HannesWell commented 1 month ago

For now I prefer to keep v1.1.0 for all the manifests for consistency, may be later we will switch to a less verbose v1.3.0

Its not about verbosity of the XML file, but about using what's available because you now had to rework the code of the implementation to be way more verbose than it has to be.

All such synchronization errors were introduced by me manually, because I was scared by the diff that has no "activate" method anymore. And I was not alone, there are many developers in many countries who were scared by this as well.

Moreover I do not think that removing OSGi component manifests is what I would like to have, I agree with @castortech : these are still sources.

But didn't not relying on the tools and trying to do/control it manually lead to this entire problem? If https://github.com/eclipse-passage/passage/issues/1365 were implemented already, probably nobody had noticed that SCR 1.3 is required now because everything would just have worked and everyone would be happy. The cause of a problem is usually not also its solution and I think it definitely isn't in this case.

The statement about submitting byte-code in the referenced issue might be exaggerated but in it's core it is right: One should trust your tools, usually they are right. Of course, like every other software they are not bug-free, but usually humans error far more often. And if you really encounter a bug it should be fixed, so it is better to use the tools more also to find bugs faster.

Anyway in the end what matters to me is that it works, how it is achieved is your decision.

ruspl-afed commented 1 month ago

I'm all for tools, but for me transition to a tool version that started to remove "activate" still does not look friendly enough. Because I have no idea regarding how this "default" will be interpreted by older Platform versions, for example. May be a comprehensive spec for this default and other defaults does exist somewhere, I don't know,

The same for removing component manifests. If I do so, I need to say for every bundle in some formal way (i.e., in a way recognizable for other tools) that it requires Tycho version X++ to be compiled, otherwise component manifests will not appear and things will not work on runtime.

HannesWell commented 1 month ago

I'm all for tools, but for me transition to a tool version that started to remove "activate" still does not look friendly enough. Because I have no idea regarding how this "default" will be interpreted by older Platform versions, for example. May be a comprehensive spec for this default and other defaults does exist somewhere, I don't know,

The default is specified as described in

If another platform behaves differently it does not comply with the spec and that platform should be fixed or one should worry what else is different there.

The same for removing component manifests. If I do so, I need to say for every bundle in some formal way (i.e., in a way recognizable for other tools) that it requires Tycho version X++ to be compiled, otherwise component manifests will not appear and things will not work on runtime.

I want to emphasize that I didn't suggest to remove the component xml files from the built jars. In https://github.com/eclipse-passage/passage/issues/1365 I suggest to remove them from SCM/git so that they are always generated in the IDE and the build. So for consumers no specific tool versions are required.

eparovyshnaya commented 1 month ago

Should be fixed with #1377 (as well as #1370 is fixed with it)

ruspl-afed commented 1 month ago

The default is specified as described in

Yes, I understand, now implementation is closer to an OSGi spec and this is definitely good. My concern is about older Eclipse Platform versions. Will they interpret an activate method as "activate" if it was not mentioned in manifest? Perhaps only experiment can answer. Another corner case introduced with removal of explicit "activate" is that I may have method with name activate in a component class [visible enough for whatever reason] and this method was not intended to be used for component activation. Will such a method be used anyway? If yes, how can I avoid this?

I want to emphasize that I didn't suggest to remove the component xml files from the built jars.

Yes, I understand this. I'm talking about other use cases we have in our practice, when people want to build Passage bundles with another valid "target". For now we have required bundles ( + imported packages) and Java source level. In the case of removal we will need to specify a kind of "target tycho version" as well. The question is how to formalize this new build requirement.

HannesWell commented 1 month ago

The default is specified as described in

Yes, I understand, now implementation is closer to an OSGi spec and this is definitely good. My concern is about older Eclipse Platform versions. Will they interpret an activate method as "activate" if it was not mentioned in manifest? Perhaps only experiment can answer.

Yes, that is required by the spec for quite some time see section 112.5.8 Activate Method for example in OSGi Service Platform Service Compendium Release 4.1. Of course in the current versions it still behaves the same if no activate attribute is specified.

Another corner case introduced with removal of explicit "activate" is that I may have method with name activate in a component class [visible enough for whatever reason] and this method was not intended to be used for component activation. Will such a method be used anyway? If yes, how can I avoid this?

Yes it will be used and AFICT there is no way to avoid this. But from the link above you can see that this was the behavior since (probably) ever and only later the possibility to specify another activate method was only added later. But if you think the spec is incomplete you can file an issue at https://github.com/osgi/osgi

I want to emphasize that I didn't suggest to remove the component xml files from the built jars.

Yes, I understand this. I'm talking about other use cases we have in our practice, when people want to build Passage bundles with another valid "target". For now we have required bundles ( + imported packages) and Java source level. In the case of removal we will need to specify a kind of "target tycho version" as well. The question is how to formalize this new build requirement.

You mean build from source? Then the pom.xml of passage already defines the Tycho version. I wouldn't expect that the build works with an older version (it can, but does not have to). And if you mean if somebody builds against passage bundles from a TP I can only repeat that the built bundles still contain the generated component's XML file. If you mean something else I didn't understand your case.