valtech / aem-easy-content-upgrade

AEM Easy Content Upgrade simplifies content migrations in AEM projects

Method on the builder to disable logging #233

Closed pcastelog closed 2 months ago

pcastelog commented 2 months ago

When a huge migration is performed, it should be possible to disable logging to avoid creating one output line per node change.

The issue we found was that after 5 million node changes, the run output was stored under /var/aecu with a size of ~550 MB, which caused:

org.apache.jackrabbit.oak.plugins.index.lucene.LuceneDocumentMaker String length: 555735990 for property: runOutput at Node: /var/aecu/2024/5/27/171680290585741449/1 is greater than configured value 102400

That killed the index update, since the IndexWriter could not handle that amount of data on the node.

pcastelog commented 2 months ago

This only happens when the script is executed via the hook, since that is the only case in which the result is stored in a node.

nhirrle commented 2 months ago

Hi pcastelog

I had the issue as well with plain Groovy when the result gets stored in the history node. For the time being I suggest not printing so much to the console. Instead you could write a method to store the result in a file in the DAM (we did this with CSVs in the past).

As an improvement, when the output exceeds a certain size we could write it to a file (binary data) rather than to a String property, and store just a reference in the String property.

pcastelog commented 2 months ago

Thanks @nhirrle

The problem I see is that if we use the plain AECU methods, there is no option to skip logging the modified nodes unless we write a custom action, e.g.:

    aecu.contentUpgradeBuilder()
        .forResourcesByPropertyQuery("/content/", Collections.singletonMap("sling:resourceType", "<resourceType>"), "nt:unstructured")
        .doDeleteProperty("myProperty")
        .run()

Maybe an option would be to create a disableLog selector that only stores the status (RUNNING, FAIL, SUCCESS) but not the actual changes. For example, when removing old properties we usually just need to know whether the run went well and which resources failed, not the entire trace.
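A minimal sketch of what such a status-only output could look like, written as plain Java so it runs without AEM. The class `SummaryOnlyOutput` and all of its methods are hypothetical names used for illustration; nothing here is existing AECU API, and how it would be wired into the builder is left open.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical collector (not part of AECU): keeps counts and failures, drops per-node success lines. */
public final class SummaryOnlyOutput {

    public enum Status { RUNNING, FAIL, SUCCESS }

    private long changed;
    private boolean finished;
    private final List<String> failures = new ArrayList<>();

    /** Counts a successful change; the path is deliberately not stored. */
    public void recordSuccess(String path) {
        changed++;
    }

    /** Failures are kept in full so they can be investigated later. */
    public void recordFailure(String path, String reason) {
        failures.add(path + ": " + reason);
    }

    /** Marks the run as completed. */
    public void finish() {
        finished = true;
    }

    public Status status() {
        if (!failures.isEmpty()) {
            return Status.FAIL;
        }
        return finished ? Status.SUCCESS : Status.RUNNING;
    }

    /** Renders one summary line plus one line per failure. */
    public String render() {
        StringBuilder sb = new StringBuilder();
        sb.append(status()).append(": ").append(changed).append(" changed, ")
          .append(failures.size()).append(" failed");
        for (String failure : failures) {
            sb.append('\n').append(failure);
        }
        return sb.toString();
    }
}
```

With this shape the stored output grows with the number of failures rather than with the number of changed nodes, which is the property the disableLog idea is after.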

nhirrle commented 2 months ago

I see, @pcastelog. https://github.com/valtech/aem-easy-content-upgrade/blob/master/core/src/main/java/de/valtech/aecu/core/history/HistoryUtil.java#L360 seems to be the issue: the output can be too long. There we can simply add a check before setting it, potentially truncate the data, and store all of it in a binary property.
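The check described above could look roughly like the following sketch. It is plain Java with no JCR dependency; `RunOutputLimiter`, `Split`, and the `maxChars` parameter are hypothetical names, not code from HistoryUtil or from the PR. The caller would write `propertyValue` to the String property and, when `binaryValue` is not null, store it as a binary property next to it.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of AECU): caps the String value and keeps the full output as bytes. */
public final class RunOutputLimiter {

    /** Short text for the String property, plus the full output as bytes when it was truncated. */
    public static final class Split {
        public final String propertyValue;
        public final byte[] binaryValue; // null when the output fits into the property

        Split(String propertyValue, byte[] binaryValue) {
            this.propertyValue = propertyValue;
            this.binaryValue = binaryValue;
        }
    }

    private RunOutputLimiter() {
    }

    /** Returns the output unchanged if it fits, otherwise a truncated head plus a marker. */
    public static Split split(String output, int maxChars) {
        if (output == null || output.length() <= maxChars) {
            return new Split(output, null);
        }
        String marker = "\n... [truncated, " + output.length() + " characters in total]";
        int keep = Math.max(0, maxChars - marker.length());
        // Avoid cutting a surrogate pair in half.
        if (keep > 0 && Character.isHighSurrogate(output.charAt(keep - 1))) {
            keep--;
        }
        String value = output.substring(0, keep) + marker;
        if (value.length() > maxChars) {
            value = value.substring(0, maxChars);
        }
        return new Split(value, output.getBytes(StandardCharsets.UTF_8));
    }
}
```

Picking `maxChars` at or below the configured index limit from the log message (102400) would keep the String property indexable, while the full trace remains available in the binary.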

pcastelog commented 2 months ago

Added PR #234 to solve the issue :)