Open ghenzler opened 9 months ago
PRs are always appreciated in open source!
changing from true
to false
is easy - the question is more if everybody is happy with generating the xls file directly at the end so it does not have to be regenerated on every xlsx-download? (maybe there was reasons it was done the way it is today, before I create a PR for that I want to hear that the "MCP maintainers" are fine with it...)
MCP uses Apache POI to build xlsx which is a memory hog. Generating a report with 80K rows can take quite a bit of time and memory, and even potentially result in OOM). Maybe we should make the output format configurable and allow user to choose between xlsx and csv. So there are two potential improvements:
true
=> false
will reduce the execution time
xlsx
=> csv
will reduce the download time
I think it's ok to use xlsx, it's just more convenient to open & distribute (as the reports are often sent to business people from my experience). I think there is little reason to not change from true
to false
to speed up the creation of the JCR nodes significantly.
My question is mostly if somebody objects to create the xlsx report immediately along with the JCR nodes, so on download the existing binary can just be quickly spooled out.
My question is mostly if somebody objects to create the xlsx report immediately along with the JCR nodes, so on download the existing binary can just be quickly spooled out.
it should be perfectly fine to create the report as a step in the MCP job. Ideally it would be backwards compatible, i.e. if the report is pre-built (new mode) then download it, otherwise create it on the fly (current behavior).
Required Information
Expected Behavior
storeReport() for large reports containing ~80,000 rows should execute in a reasonable timeframe (seconds or at most minutes)
Actual Behavior
Saving a large report takes 90min locally
Steps to Reproduce
Create an MCP instance with ~80,000 rows and wait ;-)
The reason for the slow performance is that https://github.com/Adobe-Consulting-Services/acs-aem-commons/blob/2daabf3d98c9c6c7bcc6b6ffbca5df22ae4d6407/bundle/src/main/java/com/adobe/acs/commons/mcp/model/GenericReport.java#L88 has
true
as last argument (autoCommit).So what I did to solve the problem: I created a local version of
com.adobe.acs.commons.mcp.model.GenericReport.persist(ResourceResolver, String)
and call it fromcom.adobe.acs.commons.mcp.ProcessDefinition.storeReport(ProcessInstance, ResourceResolver)
.Als creating the Xls on the fly can sometimes time out in the cloud service, so I'm currently creating it at
/var/acs-commons/mcp/instances/<id>/jcr:content/mcp-report.xlsx
and I have a filter in place to just redirect to that location instead of generating it on the fly.So at the very minimum ResourceUtil.getOrCreateResource() should not be called with
true
(brings down the execution time from 90min to 1min). I'm happy to create a PR with this change and the XLS-generation done at the end of the report if desired.