jamessimone / apex-rollup

Fast, configurable, elastically scaling custom rollup solution. Apex Invocable action, one-liner Apex trigger/CMDT-driven logic, and scheduled Apex-ready.
MIT License
222 stars 30 forks source link

System.LimitException: Apex CPU time limit exceeded #354

Closed shane-saltbox closed 2 years ago

shane-saltbox commented 2 years ago

Hello @jamessimone! We have been running into a few intermittent issues in production recently, but have had a hard time nailing down where the issues was stemming from. We finally have that I believe, so wanted to raise and issue and see if we can get something figured out.

As a reminder, we are using this to roll up when we create project_forecast records to the parent of project_assignment. We have a component that is creating these two sets of records that we tested the last round. In most cases this seems to be working okay, but when we are dealing with a large amount of records we receive 'System.LimitException: Apex CPU time limit exceeded' on the flow where the rollup. Here is an example of how we'd be filling out 3 assignments for 10 weeks each, resulting in about 183 records being created across the objects, with 150 of them being project_forecast records.

Screen Shot 2022-08-18 at 9 35 26 AM

When we do this with the flow enabled and utilizing the rollup invocables the component spins for 10+ seconds and then eventually we get a flow email error with the above error repeated 150 times in the email. The flow email also states that the error is occurring on this invocable element

Screen Shot 2022-08-18 at 9 29 00 AM

I didn't think this would be a problem with the work we did last time with the bulkification. Any thoughts on what could be causing this?

I've also included the debug log we captured with this. 07L7j00000G4dr9EAB.log

jamessimone commented 2 years ago

@shane-saltbox I will take a look and get back to you!

jamessimone commented 2 years ago

@shane-saltbox In looking at the logs, I don't think that this is an issue with Apex Rollup - the entirety of the rollups being initialized and kicked off occurs in 3 seconds. You can see the summary of all of this around line 163170 of the log file:

06:55:36.186 (127674426978)|METHOD_EXIT|[8]|01p7j000000KSCk|Rollup.processStoredFlowRollups()
06:55:36.186 (127674440528)|SYSTEM_MODE_EXIT|true
06:55:36.674 (127674465396)|CUMULATIVE_LIMIT_USAGE
06:55:36.674 (127674465396)|LIMIT_USAGE_FOR_NS|(default)|
  Number of SOQL queries: 7 out of 100
  Number of query rows: 10 out of 50000
  Number of SOSL queries: 0 out of 20
  Number of DML statements: 6 out of 150
  Number of Publish Immediate DML: 0 out of 150
  Number of DML rows: 709 out of 10000
  Maximum CPU time: 3210 out of 10000
  Maximum heap size: 102849 out of 6000000
  Number of callouts: 0 out of 100
  Number of Email Invocations: 0 out of 10
  Number of future calls: 0 out of 50
  Number of queueable jobs added to the queue: 2 out of 50
  Number of Mobile Apex push calls: 0 out of 10

06:55:36.674 (127674465396)|CUMULATIVE_LIMIT_USAGE_END

Please also note that the snippet above (probably) corresponds to the Perform Deferred Rollup Calc action in your screenshot above, since (I'm assuming) that's the one that calls the action whose limit-usage I've pasted above.

After that, a number of flow interviews look to be getting kicked off on Project Assignment and Project (presumably, lower on in the flow you've screenshotted), which is where the limit exception is actually being thrown. The "rollup" part of the equation could probably be optimized a bit more, but honestly for that number of records that's really not that bad an execution time; I think the the problem is further down in the flow. For your debugging help, it seems the flow that is actually causing the error (possibly a sub-flow?) has the Id of 01I7j0000009phz. Let me know if there's anything else I can do to assist, I'm happy to keep debugging further.

shane-saltbox commented 2 years ago

Hey @jamessimone, thanks for the response. That last node you see in the flow screenshot is the last node in the flow, there aren't any other actions or sub-flows being executed. When I disable the flow everything executes as expected, so it's definitely the flow causing the issue.

To further verify this, I created a new version of the flow without the two rollup nodes and ran the same test and everything executed as expected. (Attached the log as well) apex-07L7j00000G4tMsEAJ.log

Screen Shot 2022-08-22 at 12 08 13 PM

When I re-activate the previous version with those two rollup nodes the above noted issue occurs. Here's another log from when I add those nodes back. apex-07L7j00000G4tOPEAZ.log

jamessimone commented 2 years ago

@shane-saltbox do you have other flows operating on those same records, though? I'll also go back through #321 to ensure I'm fully back up to speed on your object model / the way you're running Apex Rollup, but at least at face value what I'm seeing would suggest that you have 7 seconds being spent in other automations, such that the addition of the ~3s of Apex Rollup into the equation causes failures since the sync limit of 10 seconds for CPU time is being exceeded.

shane-saltbox commented 2 years ago

@jamessimone, nope, this is the only flow attached to the forecast object. Here a screenshot of flow trigger explorer (this is the same for update) Screen Shot 2022-08-22 at 1 09 06 PM

Are you seeing the same 7 seconds in 'other automations' on the last log I sent you without the rollup included? I didn't see that, it was executing really quick without those nodes in the flow.

jamessimone commented 2 years ago

@shane-saltbox in the log you sent me without the rollup included, I see the following (timestamp - description):

I am working on the Apex Rollup side of things to bring the CPU time down a bit from where it stands (I've created a beta package that you can test with which may help, and I will continue to work on optimizing the CPU usage particularly when Apex Rollup is called from Flow).

My point in writing out all of the log excerpts above, though, is to show why adding three seconds of execution time to the processes you already have in place is a problem - you're already consuming over 7 seconds of CPU time prior to calling Apex Rollup, and the vast majority of that time is spent in other flows.

shane-saltbox commented 2 years ago

@jamessimone, thanks again for your help on this. I think we have a solve after several different tests.

What we decided to do was to move the computation we were doing in the flow into the main components themselves, this will require a bit of work for us to make those updates but it will remove the additional transactions needed on all the forecast records after the initial save. Additionally, we moved the rollupapex call into a very simple trigger. This seems to handle the large amount of records much better than the flows. In our testing, the flow errored out due to CPU limits well before the trigger did so.

Thanks again for your help! I think these steps should resolve the issue, but after making them i'll let you know if anything else comes up with this.

jamessimone commented 2 years ago

@shane-saltbox awesome. please let me know if there's anything else I can do to help - and feel free to re-open this issue (or create something more specific) if there's more work here to be done. I can promise that v1.5.20 will feature additional optimizations for some of these code paths, as well.