OfficeDev / office-js

A repo and NPM package for Office.js, corresponding to a copy of what gets published to the official "evergreen" Office.js CDN, at https://appsforoffice.microsoft.com/lib/1/hosted/office.js.
https://learn.microsoft.com/javascript/api/overview
Other
670 stars 95 forks source link

CPU usage spikes when including any live custom functions in a computation-heavy sheet #3668

Open paco-sparta opened 12 months ago

paco-sparta commented 12 months ago

We have developed and released an Excel add-in, and our customers have complained that our real-time behavior is choking their heavier spreadsheets.

We have investigated and found a trivial repro with the most basic implementation: just integrating one custom function that updates every second via @streaming or via range#values causes a recalculation of the whole sheet. This is not an immediate problem when developing on smaller sheets, but as soon as we went to production the problem was obvious.

We tested with other add-ins from the store and the problem is endemic. Any attempt at real-time updates puts the runtime and CPU at 75%+ use, reaching 100% and stalling Excel depending on the sheet and formula implementation.

Provide required information needed to triage your issue

Your Environment

Expected behavior

Streaming or updating values from custom functions doesn't cause a whole document to recalculate, just the cells that have been updated. This behavior could be achieved as a default or programmatically.

Current behavior

Whenever a Custom Formula is added to any large document, as long as that formula is streaming values it causes the CPU to spike.

Steps to reproduce

Basic project with test functions: one streaming, one setting via context.

https://github.com/SpartaCommodities/excel-addin-template

  1. Build add-in
  2. Inject add-in into a large document, with many sheets and formulas
  3. Add any sample function provided
  4. See that Recalcs for the whole document happen every second, spiking the CPU and stalling the sheet

Link to live example(s)

https://github.com/SpartaCommodities/excel-addin-template

Provide additional details

  1. We have tested any permutation of fewer formulas, more cells per formula, suspending calculations until sync, calculate only the range, and disabling JS events. While they make the CPU spikes smaller, they don't address the root cause.
  2. Making the update rate slower (3-5 seconds) keeps the spreadsheet usable, but fails our customer's expectations to see real-time financial data.
  3. Making AutoCalculations "Manual" mitigates the problem but also fails our customer's expectations of having live sheets that interop with other add-ins.

Context

Customers are complaining. We have reached out to VBA consultants and they told us the same APIs we used in JS, in VBA don't retrigger doc-wide calculations.

We have tested other add-ins in the store and they suffer the same problem.

Useful logs

IOU a screenshot with 240% CPU on Mac and 100% on Windows.

Thank you for taking the time to report an issue. Our triage team will respond to you in less than 72 hours. Normally, response time is <10 hours Monday through Friday. We do not triage on weekends.

ruijiaMS commented 12 months ago

Hi @paco-sparta, Thanks for letting us know this issue. @streaming function is a kind of volatile function. Trigger one volatile function will cause all the volatile in this worksheet to be calculated. Here is a reference: https://learn.microsoft.com/en-us/office/client-developer/excel/excel-recalculation#volatile-and-non-volatile-functions. So if your custom function don't need to be streaming, please avoid using streaming to declare it. And the re-calc of streaming function will actually not trigger any calculation of common functions, like SUM(). I am not very sure about what do you mean by "range#values", could you please give me more details about it so that we can understand your concern. Thanks!

paco-sparta commented 12 months ago

Thank you for the explanation on Volatile functions! We'll try to work with it :D

We have found that Excel is triggering recalculations of non-volatile functions too, which is what causes our large spreadsheets to fall over.

As for your question, we tried an alternate approach, where a non-@streaming function does updates based off the Excel.run context. One example is in: https://github.com/SpartaCommodities/excel-addin-template/blob/81f36844aecd3afe82487f1bcdd548774ec0085c/src/functions/functions.ts#L28-L44

This approach to updating cells also causes recalculations, which we were not expecting:

/**
 * Increment a value every second using Excel.run
 * @customfunction
 * @requiresAddress
 * @param invocation Custom function handler
 */
export function increment(invocation: CustomFunctions.Invocation): any[][] {
  let result = 0;

  setInterval(() => {
    Excel.run((context) => {
      const [sheetId, cellId] = invocation.address.split("!");
      const cell = context.workbook.worksheets.getItem(sheetId).getRange(cellId);

      cell.values = [[result]];
      result++;

      return context.sync();
    });
  }, 1000);

  return [[0]];
}
ruijiaMS commented 12 months ago

Thanks for sharing the code with us! I create an onCalculated event handler to catch which cell is calculating in the target worksheet, but when I running your increment() function, it seems like other built-in function like =SUM(1,2) does not be recalculated. Could you please have a try with the following code to detect which formula in cell is recalculating?


async function run() {
  await Excel.run(async (context) => {
    const worksheet = context.workbook.worksheets.getItem("Sheet1");
    worksheet.onCalculated.add(handleChange);

    await context.sync();
    console.log("Event handler successfully registered for onChanged event in the worksheet.");
  }).catch(Error);
}

async function handleChange(event) {
  await Excel.run(async (context) => {
    await context.sync();
    console.log("Address of event: " + event.address);
  }).catch(Error);
}
paco-sparta commented 12 months ago

Okay, does updating function this way is considered volatile? Is there any way of disabling it?

ruijiaMS commented 11 months ago

The setInterval() is a very common function to deploy a regular task in custom function, you can use it to update values. Volatile is a concept for some special functions in Excel and if you would like to disable them, you can implement a similar function to avoid invoking them directly.

paco-sparta commented 11 months ago

What I'm trying to ask is why when updating a function the way the example I sent you and you sent back causes all volatile functions to recalculate. I've been able to repro that one consistently with the example above.

I feel like we're addressing things around the main point but not the main point: frequent updates of volatile functions or updates like the setInterval example above can cause sheets with many volatile functions to spike CPU usage and stall.

Is there any way of preventing all these CPU-hungry updates from happening that are not setting autocalc to manual for the whole Excel, or rewriting the sheet? As add-in developers releasing on the store we're not allowed to do that.

ruijiaMS commented 11 months ago

Thank you for the explanation on Volatile functions! We'll try to work with it :D

We have found that Excel is triggering recalculations of non-volatile functions too, which is what causes our large spreadsheets to fall over.

As for your question, we tried an alternate approach, where a non-@streaming function does updates based off the Excel.run context. One example is in: https://github.com/SpartaCommodities/excel-addin-template/blob/81f36844aecd3afe82487f1bcdd548774ec0085c/src/functions/functions.ts#L28-L44

This approach to updating cells also causes recalculations, which we were not expecting:

/**
 * Increment a value every second using Excel.run
 * @customfunction
 * @requiresAddress
 * @param invocation Custom function handler
 */
export function increment(invocation: CustomFunctions.Invocation): any[][] {
  let result = 0;

  setInterval(() => {
    Excel.run((context) => {
      const [sheetId, cellId] = invocation.address.split("!");
      const cell = context.workbook.worksheets.getItem(sheetId).getRange(cellId);

      cell.values = [[result]];
      result++;

      return context.sync();
    });
  }, 1000);

  return [[0]];
}

There are some issues on this code itself. If you would like to declare a streaming function, please either add a @streaming annotation or use the invocation: CustomFunctions.StreamingInvocation. https://learn.microsoft.com/en-us/office/dev/add-ins/excel/custom-functions-web-reqs#make-a-streaming-function. Otherwise the Excel will not consider this function as a streaming function. And the setInterval() function is a special function living in streaming function. So this code phrase will out of Excel's control, might cause unexpected issues like performance issues and cannot be cancel legally.

If you really would like to implement the task like the code describing. you can: 1, invoke Excel javascript api instead of custom functions. or 2, declare a streaming custom function and please don't set value on the address of original address of input streaming function.

paco-sparta commented 11 months ago

We have shipped an add-in using @streaming functions that real customers use. We don't have problems to make it work. We have read the docs, the examples, and contacted several consultants to go through the code. We have given a minimal repro repository with both types of functions, so you understand the question rather than focusing on whether we've added a @cancellable clause.

We're failing to reach an engineer who can tell us what esoteric reason causes the examples above (once fixed with cancellation and such because those are irrelevant to perf) to make Excel retrigger all formulas (streaming, volatile, tracked, magical, logical, responsible, practical, dependable, clinical, intellectual, cynical or any other possible type) through the spreadsheet.

We only want to trigger all dependents, with the code above or any other type of code, because we're DYING ON THE POOR PERFORMANCE OF SIMPLE STREAMING FUNCTIONS in heavy documents and our customers are calling every day.

So this code phrase will out of Excel's control, might cause unexpected issues like performance issues and cannot be cancel legally.

"Might" cause "unexpected" would need an explanation. What might cause, what might not cause it? What is unexpected, what is expected? What is the scheduling algorithm for excel, how does it resolve javascript formulas? How do different JS engines work differently here?

Is there any way to escalate this? We're willing to pay for support but all the phone numbers are for non-engineering queries.

ruijiaMS commented 11 months ago

We strongly recommend to use the standard ways to declare and use the streaming functions including special defined functions in streaming. Otherwise these non-standard functions could only be consider as general custom functions and hardly be stopped in right way.

Triggering recalculation for custom function basically will not trigger the recalculation of general built-in functions. If you find this happen in your worksheet, please provide the related scenario and the way to generate the worksheet, then we can raise an internal bug to investigate.

pakoito commented 11 months ago

For anyone else showing up in this thread in the future:

4tti commented 11 months ago

Similar to #3332 @pakoito curious question - why do you think all OfficeJs functions are considered volatile? Thanks in advance for the answer!

paco-sparta commented 11 months ago

Similar to #3332 @pakoito curious question - why do you think all OfficeJs functions are considered volatile? Thanks in advance for the answer!

It's by design, as per the comment above:

https://github.com/OfficeDev/office-js/issues/3668#issuecomment-1718957827

4tti commented 11 months ago

In that comment only streaming functions are mentioned. I am pretty sure the non-streaming/non-volatile functions work kind of fine (with some hickups mentioned in the link wrote above).

sk1136 commented 9 months ago

our problem seems somewhat related. our excel freezes when we blend in our custom functions with 3rd part COM addins because 3rd party addins are for some reason triggering custom functions refresh. Custom function is getting called over 1000 times in a second when there is no need to refresh that many times.