jamessimone / apex-rollup

Fast, configurable, elastically scaling custom rollup solution. Apex Invocable action, one-liner Apex trigger/CMDT-driven logic, and scheduled Apex-ready.
MIT License
210 stars 30 forks source link

Full Recalculation Rollup not calculating SUM correctly of all child records #500

Closed baobao917 closed 10 months ago

baobao917 commented 11 months ago

I submitted a prior issue #491 which may/may not be similar/related to this bug issue. It is similar in that the rollup calculation is incorrect when performing a full recalculation rollup from the Recalculate Rollup tab. image

I recorded a video here: https://www.loom.com/share/234985f295e34efe80f89d9382a5a4e9?sid=c5d15090-b303-4303-852f-9d7388eea1c3

The EXPECTED results should have been: image

The ACTUAL incorrects results were: image

Here are the debug logs (zip file) and the record Id of the record above with the incorrect results Account Record Id = 0011C00002cfTvJQAU

Debug Logs.zip

Please let me know if you need any additional information.

baobao917 commented 11 months ago

If it'll help, I have read the documentation around Nebula logger and the nebula logger plug-in and plan on installing.

jamessimone commented 11 months ago

Thanks so much for providing all this detail. I will keep you updated.

jamessimone commented 11 months ago

@baobao917 I've managed to reproduce this issue after a lot of testing. The logs - and the video - were absolutely instrumental in helping me to understand where things are currently going wrong when the recalc is started from the app. I will continue to keep you updated.

baobao917 commented 11 months ago

@baobao917 I've managed to reproduce this issue after a lot of testing. The logs - and the video - were absolutely instrumental in helping me to understand where things are currently going wrong when the recalc is started from the app. I will continue to keep you updated.

@jamessimone Thanks. By the way, I've got Nebula logger and the plug in installed for Apex rollup. I see it's created a bunch of logs and log entries (although I still am pretty unfamiliar with the tool and not sure what it all means), I'm more than happy to give you any information from that or if you need access to review them, I can provide that as well.

jamessimone commented 11 months ago

@baobao917 I should have all the info I need, thank you! I was hoping to wrap this one up before the weekend but I'm afraid the fix will have to wait till early next week to be released. Thank you again for all of your help with this, and for your patience as well.

baobao917 commented 11 months ago

@jamessimone I installed the update but it doesn't appear to have corrected the problem. Here is a video. Pls let me know what else you need from me. Also, do I need to open a new issue? This one was closed but I can't seem to reopen.

https://www.loom.com/share/bf8d917af2fc4083bafabc5c56f336e8?sid=874b6bd4-03f3-4025-8262-51f1b76b170c

Also here's the error msg from the 1st time I ran it (which didn't happen the 2nd time in the video).

Sandbox

Apex script unhandled exception by user/organization: 0053w000009MwBm/00D3C0000005XfC Source organization: 00D15000000NUdR (null) Failed to process batch for class 'RollupFullBatchRecalculator' for job id '7073C0000368ohT'

caused by: System.LimitException: Apex CPU time limit exceeded

Class.webm.WebmergeTriggerEvaluator.executeUpdateTriggerForBatch: line 130, column 1 Class.webm.WebmergeTriggerEvaluator.doTrigger: line 32, column 1 Trigger.webm.webmergeAccountTrigger: line 6, column 1

jamessimone commented 11 months ago

@baobao917 so sorry you're still experiencing issues, but it's exciting to hear that you have Nebula installed now, as that should make searching through the logs for this specific parent a lot easier. What I observed from the last few times you sent logs over were these lines in several logs (these are excerpts from the logs you sent over using 1.5.85):

DEBUG|Rollup v1.5.85: lookup record prior to rolling up:
{
  "attributes" : {
    "type" : "Account",
    "url" : "/services/data/v58.0/sobjects/Account/0011C00002cfTvJQAU"
  },
  "Id" : "0011C00002cfTvJQAU"
}

And then:

DEBUG|Rollup v1.5.85: lookup record after rolling up:
{
  "attributes" : {
    "type" : "Account",
    "url" : "/services/data/v58.0/sobjects/Account/0011C00002cfTvJQAU"
  },
  "Id" : "0011C00002cfTvJQAU",
  "Open_A_P_Balance_Aged_90_Days__c" : 0.10
}

Then, because this account has more than 500 children, it appears in another log: (once again using the rollup field from the above log):

DEBUG|Rollup v1.5.85: lookup record after rolling up:
{
  "attributes" : {
    "type" : "Account",
    "url" : "/services/data/v58.0/sobjects/Account/0011C00002cfTvJQAU"
  },
  "Open_A_P_Balance_Aged_90_Days__c" : 0.0,
  "Id" : "0011C00002cfTvJQAU",
  "Open_A_P_Balance_Aged_61_90_Days__c" : 0.00,
  "Total_Open_A_P_Balance__c" : -99.90,
  "Open_A_P_Balance_Aged_0_60_Days__c" : -100.00
}

So, by the time the parent has appeared in two log files, the value being rolled up was already wrong; it had been incorrectly cleared. The fix that I added in #503 should have addressed this by changing the way the field value tracking is performed across multiple batches. Clearly that hasn't proved to be the case. I'd be willing to bet that if you searched for that parent Id, 0011C00002cfTvJQAU within the log entries generated by the full recalculations you performed, you'd see them across three different logs (as that particular parent has > 1000 children; so, three times the batch size).

On the other hand - since now some of the values are calculating correctly, and all of your rollups are SUM-based rollups, it is a bit odd that these inconsistencies are continuing to appear. I'm going to try a full integration tests using a setup more similar to your own to try to track down the issue here - in the meantime, if you do end up finding any of those relevant log entries in Nebula Logger, let me know. Again, I appreciate your patience as we work together to resolve this issue for you.

baobao917 commented 11 months ago

Hi @jamessimone - I'm not sure I understood everything you were saying, but I went ahead and re-ran the process with nebula debugging turned on. I then did a search of that parent Id on the log entries created. This is what I got. image

Would it be easier if I gave you access to my sandbox org?

Also, this time got an apex cpu time limit email image

jamessimone commented 11 months ago

judging by what you showed me from the loom video you uploaded last, those CPU timeouts are being inconsistently caused by the update statements (by the accounts being updated). I would try updating the Max Parent Rows Updated At Once value down to something like 250 (half the current amount of the batch chunk size, in other words), as that should help a bit.

I don't think I'll need sandbox access to fix this issue, but I will keep you updated

jamessimone commented 11 months ago

@baobao917, let's try this again with #506 - the new package version (which will be promoted and available for installation in production orgs momentarily) may address the other issue you were seeing here.

baobao917 commented 11 months ago

@jamessimone That did the trick! Thanks for your assistance! Becoming a big fan of this app.

jamessimone commented 11 months ago

Awesome - glad to hear it. Apologies that took me a while to track down - it was due to a fairly obscure part of the code that it doesn't seem like anybody's had a problem with before! Quite the edge case, but I'm glad to have finally fixed it.

baobao917 commented 11 months ago

@jamessimone I found another example of a bug here where it isn't calculating correctly. Different org but same type of rollup calculation and happening with the full rollup calculation from the Recalculate Rollup tab. Should I open a new ticket? Or reopen this one with the details?

I recorded a video: https://www.loom.com/share/6079bc80d8554cc4a8c9a01a80cc4377?sid=3271f11d-e394-44e0-9cdf-f9310dbfb016

baobao917 commented 11 months ago

@jamessimone I just replied but deleted the reply. If you received in email, you can disregard. I got confused referring to one of your earlier replies. I see you reopened the issue so I'm presuming there's no need to open a new issue for this bug in my latest comment.

jamessimone commented 11 months ago

@baobao917 you're right in saying that the record count shouldn't matter when it comes to performing the full recalculations. Clearly though, there's still some sort of issue. This might actually be a time where it would be easier for me to diagnose in the sandbox so you don't have to continually be waiting for full recalcs to conclude.

baobao917 commented 11 months ago

@jamessimone Sure. I can give you access to the sandbox. How do you want me to do this?

jamessimone commented 11 months ago

Feel free to use my salesforce email: james.simone@salesforce.com

jamessimone commented 10 months ago

@baobao917 I've finally had the chance to re-investigate this issue over the past few days. I really appreciate your patience, as I had a few other in-flight updates I was trying to get out as part of #520. I've validated some changes in your sandbox that appear to be leading to the proper results. I'll be including these changes as well in the hopes that we'll be closing this one out for good

baobao917 commented 10 months ago

Thanks @jamessimone ! I've confirmed that v1.6.0 is calculating correctly the full recalculation from the Recalculate Rollup tab

jamessimone commented 10 months ago

Awesome - glad to hear it!

baobao917 commented 10 months ago

@jamessimone I spoke too soon. I just did a full recalculation from the Recalculate Rollup tab for a grandparent rollup and having inconsistent results where many grandparents do not have the correct SUM of grandchildren records. Do you want me to include details as a continuation of this ticket or do you want me to open a new ticket?

jamessimone commented 10 months ago

Let's open a new item for that. The grandparent side of the system works a bit differently than direct parent parts of the codebase, and given the other issues you have open on that front at the moment I'd expect them all to be related

baobao917 commented 10 months ago

@jamessimone - I had to reopen this issue. When we closed it, we had only validated against one account. I went thru a similar exercise as I did in #523 to validate the full recalculation of the rollup on all parent accounts. I came up with many exceptions.

I tested the full recalculation from the Recalculate Rollup tab on a standard parent rollup. I then ran a report summing the child record amounts and compared them against the parent actual rollup calculated amount. Some did calculate correctly, but a fairly large number did not. See below the details:

Video - https://www.loom.com/share/7bcd529ddcf7474187ec180b584bed75?sid=a3b3e058-b48f-40c9-bef3-f63a07cf3257 From video - Exported report from sandbox (excel) - Account Total Open AR Billing QA calculation.xlsx

I believe you have access to the sandbox org still. I created the same report in sandbox (as shown in video). Link is below https://serviceone--uat.sandbox.lightning.force.com/lightning/r/Report/00OO10000004wtRMAQ/view

jamessimone commented 10 months ago

@baobao917 as somebody who used to work 8 hours a day in Excel - let me just thank you again for the high quality videos you put together. This issue, in all likelihood, is a bit different from #523 (as mentioned previously) due to differences in the grandparent rollup code path. I have a bit of time tonight to look into both of these, thank you again for all of the work you put into documenting these issues.

jamessimone commented 10 months ago

@baobao917 is there any chance that the report and xlsx you linked to are coming from a different org? I just spot checked the first 7 accounts with discrepancies and in each case the value reported by Apex Rollup appears to be correct:

Parent Id | Apex Rollup value | Report value | Diff -- | -- | -- | -- 0014w00003cpHePAAU | 8.25 | 588.50 | (580.25) 0014w00003cpMaCAAU | 23.49 | 46.98 | (23.49) 0014w00003dIfebAAC | 0.00 | 107.62 | (107.62) 0014w00003dIfeiAAC | 59.82 | 179.45 | (119.63) 0014w00003dIfelAAC | 123.92 | 185.87 | (61.95) 0014w00003dIffBAAS | 0.00 | 83.36 | (83.36) 0014w00003dIffEAAS | 0.00 | 136.75 | (136.75)
baobao917 commented 10 months ago

@jamessimone Yes, they are coming from the same org. Pls see video below.

https://www.loom.com/share/68f702fe960543f49332dce2e6cdfcd3?sid=ea0dff8e-7b16-4a31-8ff3-d88fd99a2959

jamessimone commented 10 months ago

Do you have any insight into the records pulled by the report and the records I spot-checked, then?

baobao917 commented 10 months ago

Do you have any insight into the records pulled by the report and the records I spot-checked, then?

@jamessimone I'm not sure I follow. In my video, I showed that values per my report vs the excel file are the same values. (meaning they have the same values that don't agree to each other).

Would it be easier to do a screenshare/call to clarify? I'll be online for another 1/2 hour.

jamessimone commented 10 months ago

Sorry, I had shut down for the night. It makes sense that the values in the report and the excel file are the same, because the excel is just an export of the report. However, when I spot-checked some of those same Accounts (the first seven of which I pasted above), I only found children records that matched the calculated total.

baobao917 commented 10 months ago

I double checked the first two records and the roll-up field value did not SUM the balance of all the child records. I'll check the rest of those 7 sample records tomorrow.

On Wed, Oct 25, 2023, 9:10 PM James Simone @.***> wrote:

Sorry, I had shut down for the night. It makes sense that the values in the report and the excel file are the same, because the excel is just an export of the report. However, when I spot-checked some of those same Accounts (the first seven of which I pasted above), I only found children records that matched the calculated total.

— Reply to this email directly, view it on GitHub https://github.com/jamessimone/apex-rollup/issues/500#issuecomment-1780263782, or unsubscribe https://github.com/notifications/unsubscribe-auth/AY7A4O5XVUSQ6XFRWUZC5FLYBG2BNAVCNFSM6AAAAAA463ZAS6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTOOBQGI3DGNZYGI . You are receiving this because you were mentioned.Message ID: @.***>

jamessimone commented 10 months ago

@baobao917 it sounds like you're looking at the AcctSeed__Total__c field, but the rollup is set to run on the AcctSeed__Balance__c field. All of the discrepancies above are due to the difference between these two fields.

baobao917 commented 10 months ago

@jamessimone Ugh...sorry! Made a mistake with my report...result of multi-tasking and overworked. I reran and it all tied out. I'll reclose this issue. Sorry for the waste of time!

jamessimone commented 10 months ago

No worries whatsoever, we were both working late. I just wanted to make sure we were looking at the same thing! Thank you for the update.