adobe / aem-project-archetype

Maven template to create best-practice websites on AEM.
https://experienceleague.adobe.com/docs/experience-manager-core-components/using/developing/archetype/overview.html
Apache License 2.0

AEM instance repository getting corrupted after archetype 39 code deployment #997

Closed shakuntshri closed 1 year ago

shakuntshri commented 1 year ago

Expected Behaviour

The AEM instance should restart without any error.

Actual Behaviour

Getting error - AuthenticationSupport service missing. Cannot authenticate request.

Reproduce Scenario (including but not limited to)

Steps to Reproduce

Platform and Version

AEM 6.5

Sample Code that illustrates the problem

AEM archetype-39,

Note: this issue also occurs with other archetype versions, such as archetype-34.

Logs taken while reproducing problem

krystian-panek-vmltech commented 1 year ago

I am also observing this. The instance is broken after a machine reboot when I have deployed a fresh app just generated from the AEM archetype. When I reboot the machine without installing the app from the archetype, everything works fine.

AEM version: aem-sdk-2022.10.9398

Please share how to fix the problem. It is pretty hard to diagnose, as even /system/console is not accessible, so it is hard to identify which bundles/components are not up.

PS. Latest WKND app works fine.

Greetings, Krystian

krystian-panek-vmltech commented 1 year ago

After generating a project from the archetype using the command copy-pasted from README.md:

mvn -B org.apache.maven.plugins:maven-archetype-plugin:3.2.1:generate \
 -D archetypeGroupId=com.adobe.aem \
 -D archetypeArtifactId=aem-project-archetype \
 -D archetypeVersion=39 \
 -D appTitle="My Site" \
 -D appId="mysite" \
 -D groupId="com.mysite"

it is necessary to manually remove or correct this line:

https://github.com/adobe/aem-project-archetype/blob/develop/src/main/archetype/ui.config/src/main/content/jcr_root/apps/__appId__/osgiconfig/config/org.apache.sling.jcr.repoinit.RepositoryInitializer~__appId__.cfg.json#L5

"set properties on /content/dam/${appId}/jcr:content\n  set cq:conf{String} to /conf/${appId}\n  set jcr:title{String} to \"${appTitle}\"\nend"

I just confirmed that with that line removed, the AEM instance works as expected again after installing the app and rebooting.

I am still not sure what the root cause is. Maybe the set properties clause is not yet supported on the recent SDK, or it is something else. Could somebody who knows the potential cause help me figure out how this should be fixed properly (instead of removing a line)? I am aware that it was added for some reason...
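The JSON-escaped statement quoted above is easier to read once it is unescaped. The following is only a reading aid, a minimal Python sketch that is not part of the archetype; it decodes the string exactly as copied from this comment and prints the repoinit statement line by line:

```python
import json

# The repoinit statement as quoted in the comment above, kept as a JSON
# string literal (raw Python string so the backslashes stay untouched).
escaped = r'"set properties on /content/dam/${appId}/jcr:content\n  set cq:conf{String} to /conf/${appId}\n  set jcr:title{String} to \"${appTitle}\"\nend"'

# json.loads turns the \n and \" escapes into real newlines and quotes.
decoded = json.loads(escaped)

# Each printed line is one line of the repoinit "set properties" block.
print(decoded)
```

Decoded, the statement sets two properties (cq:conf and jcr:title) on the jcr:content node of the DAM folder, which is the operation that later fails in the stack trace.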

krystian-panek-vmltech commented 1 year ago

I reckon this issue is very harmful for AEM devs and hard to diagnose so it should be relatively quickly fixed.

so maybe... @vladbailescu WDYT?

jamiecounsell commented 1 year ago

This cost me a few hours yesterday: all my CloudManager builds were failing, and I think this is why. The CM builds had no logs, but the instance logs had the error @krystian-panek-wttech seems to be experiencing:

Caused by: javax.jcr.nodetype.ConstraintViolationException: No matching property definition: cq:conf = /conf/my-site

jamiecounsell commented 1 year ago

Just confirmed removing this line resolved the CloudManager builds. Here is the full stack trace from the instance logs when it was failing:

13.12.2022 23:52:04.903 [cm-pxxxxx-exxxxxx-aem-author-xxxxxxxxxx-xxxx] *ERROR* [Apache Sling Repository Startup Thread #1] com.adobe.granite.repository.impl.SlingRepositoryManager Exception in a SlingRepositoryInitializer, SlingRepository service registration aborted
javax.jcr.RepositoryException: Applying repoinit operation failed despite retry; set loglevel to DEBUG to see all exceptions. Last exception message was: Unable to set properties on path [/content/dam/my-site/jcr:content]:javax.jcr.nodetype.ConstraintViolationException: No matching property definition: cq:conf = /conf/my-site
    at org.apache.sling.jcr.repoinit.impl.RepositoryInitializerFactory.applyOperations(RepositoryInitializerFactory.java:154) [org.apache.sling.jcr.repoinit:1.1.39.T20220426093723-a4cd7db]
    at org.apache.sling.jcr.repoinit.impl.RepositoryInitializerFactory.processRepository(RepositoryInitializerFactory.java:130) [org.apache.sling.jcr.repoinit:1.1.39.T20220426093723-a4cd7db]
    at org.apache.sling.jcr.base.AbstractSlingRepositoryManager.executeRepositoryInitializers(AbstractSlingRepositoryManager.java:627) [org.apache.sling.jcr.base:3.1.10]
    at org.apache.sling.jcr.base.AbstractSlingRepositoryManager.initializeAndRegisterRepositoryService(AbstractSlingRepositoryManager.java:575) [org.apache.sling.jcr.base:3.1.10]
    at org.apache.sling.jcr.base.AbstractSlingRepositoryManager.access$300(AbstractSlingRepositoryManager.java:96) [org.apache.sling.jcr.base:3.1.10]
    at org.apache.sling.jcr.base.AbstractSlingRepositoryManager$4.run(AbstractSlingRepositoryManager.java:544) [org.apache.sling.jcr.base:3.1.10]
Caused by: org.apache.sling.jcr.repoinit.impl.RepoInitException: Unable to set properties on path [/content/dam/my-site/jcr:content]:javax.jcr.nodetype.ConstraintViolationException: No matching property definition: cq:conf = /conf/my-site
    at org.apache.sling.jcr.repoinit.impl.DoNothingVisitor.report(DoNothingVisitor.java:66) [org.apache.sling.jcr.repoinit:1.1.39.T20220426093723-a4cd7db]
    at org.apache.sling.jcr.repoinit.impl.NodePropertiesVisitor.visitSetProperties(NodePropertiesVisitor.java:217) [org.apache.sling.jcr.repoinit:1.1.39.T20220426093723-a4cd7db]
    at org.apache.sling.repoinit.parser.operations.SetProperties.accept(SetProperties.java:44) [org.apache.sling.repoinit.parser:1.6.14]
    at org.apache.sling.jcr.repoinit.impl.JcrRepoInitOpsProcessorImpl.apply(JcrRepoInitOpsProcessorImpl.java:56) [org.apache.sling.jcr.repoinit:1.1.39.T20220426093723-a4cd7db]
    at org.apache.sling.jcr.repoinit.impl.RepositoryInitializerFactory.lambda$applyOperationInternal$0(RepositoryInitializerFactory.java:170) [org.apache.sling.jcr.repoinit:1.1.39.T20220426093723-a4cd7db]
    at org.apache.sling.jcr.repoinit.impl.RetryableOperation.apply(RetryableOperation.java:62) [org.apache.sling.jcr.repoinit:1.1.39.T20220426093723-a4cd7db]
    at org.apache.sling.jcr.repoinit.impl.RepositoryInitializerFactory.applyOperationInternal(RepositoryInitializerFactory.java:168) [org.apache.sling.jcr.repoinit:1.1.39.T20220426093723-a4cd7db]
    at org.apache.sling.jcr.repoinit.impl.RepositoryInitializerFactory.applyOperations(RepositoryInitializerFactory.java:150) [org.apache.sling.jcr.repoinit:1.1.39.T20220426093723-a4cd7db]
    ... 5 common frames omitted
Caused by: javax.jcr.nodetype.ConstraintViolationException: No matching property definition: cq:conf = /conf/my-site
    at org.apache.jackrabbit.oak.jcr.delegate.NodeDelegate.setProperty(NodeDelegate.java:514) [org.apache.jackrabbit.oak-jcr:1.44.0.T20220909141113-2457ffc]
    at org.apache.jackrabbit.oak.jcr.session.NodeImpl$36.perform(NodeImpl.java:1404) [org.apache.jackrabbit.oak-jcr:1.44.0.T20220909141113-2457ffc]
    at org.apache.jackrabbit.oak.jcr.session.NodeImpl$36.perform(NodeImpl.java:1391) [org.apache.jackrabbit.oak-jcr:1.44.0.T20220909141113-2457ffc]
    at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:210) [org.apache.jackrabbit.oak-jcr:1.44.0.T20220909141113-2457ffc]
    at org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112) [org.apache.jackrabbit.oak-jcr:1.44.0.T20220909141113-2457ffc]
    at org.apache.jackrabbit.oak.jcr.session.NodeImpl.internalSetProperty(NodeImpl.java:1391) [org.apache.jackrabbit.oak-jcr:1.44.0.T20220909141113-2457ffc]
    at org.apache.jackrabbit.oak.jcr.session.NodeImpl.setProperty(NodeImpl.java:384) [org.apache.jackrabbit.oak-jcr:1.44.0.T20220909141113-2457ffc]
    at org.apache.sling.jcr.repoinit.impl.NodePropertiesVisitor.setNodeProperties(NodePropertiesVisitor.java:197) [org.apache.sling.jcr.repoinit:1.1.39.T20220426093723-a4cd7db]
    at org.apache.sling.jcr.repoinit.impl.NodePropertiesVisitor.visitSetProperties(NodePropertiesVisitor.java:214) [org.apache.sling.jcr.repoinit:1.1.39.T20220426093723-a4cd7db]
    ... 11 common frames omitted

kwin commented 1 year ago

It seems that the node type of /content/dam/my-site/jcr:content doesn't allow that property name. It looks like the node isn't nt:unstructured (which would allow all properties), despite what is configured in https://github.com/adobe/aem-project-archetype/blob/e129dbaa4e3a38daac816f86589dd53b0c3965b0/src/main/archetype/ui.config/src/main/content/jcr_root/apps/__appId__/osgiconfig/config/org.apache.sling.jcr.repoinit.RepositoryInitializer~__appId__.cfg.json#L4. That is because the path creation configured there is a no-op when the path already exists (https://github.com/apache/sling-org-apache-sling-jcr-repoinit/blob/master/src/main/java/org/apache/sling/jcr/repoinit/impl/AclVisitor.java#L191). Please check, before the restart, what node type the node /content/dam/my-site/jcr:content has.
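One way to do that check on a local instance is to read the node's JSON rendering and look at jcr:primaryType. The sketch below is a hedged illustration in Python, not something from the archetype: the function names are hypothetical, and the host, port, and credentials in the usage comment (http://localhost:4502, admin/admin) are assumptions about a default local setup. It also assumes the Sling default JSON rendering (the .json extension) is enabled on the instance.

```python
import base64
import json
import urllib.request


def primary_type(node_json: str) -> str:
    """Return jcr:primaryType from the JSON rendering of a single node."""
    return json.loads(node_json)["jcr:primaryType"]


def fetch_node_json(base_url: str, path: str, user: str, password: str) -> str:
    """GET <base_url><path>.json using HTTP Basic auth and return the body."""
    request = urllib.request.Request(f"{base_url}{path}.json")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


# Usage against a running instance (host and credentials are assumptions):
# body = fetch_node_json("http://localhost:4502",
#                        "/content/dam/my-site/jcr:content", "admin", "admin")
# print(primary_type(body))
```

If the printed type is not nt:unstructured, setting cq:conf on the node would be expected to fail in the way the stack trace shows.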

kwin commented 1 year ago

I created https://issues.apache.org/jira/browse/SLING-11736 to improve the behaviour in the future.

kwin commented 1 year ago

The primary type changes whenever the module created from https://github.com/adobe/aem-project-archetype/tree/develop/src/main/archetype/ui.content is deployed. https://github.com/adobe/aem-project-archetype/blob/develop/src/main/archetype/ui.content/src/main/content/jcr_root/content/dam/__appId__/.content.xml doesn't define an explicit node type for /content/dam/my-site/jcr:content, and because of https://github.com/adobe/aem-project-archetype/tree/develop/src/main/archetype/ui.content/src/main/content/jcr_root/content/dam/__appId__/_jcr_content the node becomes an nt:folder (according to https://jackrabbit.apache.org/filevault/vaultfs.html#folder-aggregates).

kwin commented 1 year ago

In general it is a very bad idea to manage the same nodes via both a content package and repoinit. If you do, you easily run into this issue (in the worst case) or into node types alternating between startup and content deployment (in the best case).

krystian-panek-vmltech commented 1 year ago

thanks, @kwin for clarification.

Still, could we do anything in the archetype itself to mitigate the dangerous issue we have now? I am not sure it is a good idea to wait for a fix to land first in Sling and then in AEM, because in the meantime many users could be affected and waste even more hours on this.

kwin commented 1 year ago

@krystian-panek-wttech Setting the right primaryType for jcr:content in the file created from https://github.com/adobe/aem-project-archetype/blob/develop/src/main/archetype/ui.content/src/main/content/jcr_root/content/dam/__appId__/.content.xml should be sufficient. However I really think that Adobe should come up with a fix here.
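To make that suggestion concrete, here is a hedged Python sketch that checks whether a FileVault .content.xml declares an explicit primary type for its jcr:content child. The sample XML is hypothetical: it is not copied from the archetype or from the eventual fix, and the sling:Folder root type and the nt:unstructured child type are assumptions chosen to match the node type discussed earlier in this thread.

```python
import xml.etree.ElementTree as ET
from typing import Optional

JCR_NS = "http://www.jcp.org/jcr/1.0"

# Hypothetical .content.xml for /content/dam/<appId>: the jcr:content child
# carries an explicit jcr:primaryType instead of relying on the default.
SAMPLE = """<jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0"
          xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
          xmlns:sling="http://sling.apache.org/jcr/sling/1.0"
          jcr:primaryType="sling:Folder">
    <jcr:content jcr:primaryType="nt:unstructured"/>
</jcr:root>
"""


def jcr_content_primary_type(xml_text: str) -> Optional[str]:
    """Return the declared primary type of the jcr:content child, or None
    if the child or its jcr:primaryType attribute is missing."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    content = root.find(f"{{{JCR_NS}}}content")
    if content is None:
        return None
    return content.get(f"{{{JCR_NS}}}primaryType")


print(jcr_content_primary_type(SAMPLE))
```

A None result would point at the situation described above, where the type of jcr:content is left for the package import to decide.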

krystian-panek-vmltech commented 1 year ago

Thanks, @kwin, for all the effort you put into improving the developer experience here.

vladbailescu commented 1 year ago

@adobe export issue to Jira project SITES

vladbailescu commented 1 year ago

Thanks @kwin & @krystian-panek-wttech for confirming where the problem was, please review #1021!