OfficeDev / office-js

A repo and NPM package for Office.js, corresponding to a copy of what gets published to the official "evergreen" Office.js CDN, at https://appsforoffice.microsoft.com/lib/1/hosted/office.js.
https://learn.microsoft.com/javascript/api/overview
Other
671 stars 95 forks source link

Excel unnecessarily caches custom function metadata JSON, not respecting cache-control header sent by the server #2777

Open wh1t3cAt1k opened 2 years ago

wh1t3cAt1k commented 2 years ago

I feel like we need to ask it here because we really look bad in front of our customers every time we update the add-in code to a new version.

Our add-in is distributed through AppSource.

We would like to get to the bottom of how exactly add-in assets are:

I was not able to find this in open documentation, but this now seems a crucial piece of knowledge given the severity of the problems we're seeing.

The flow we have been using over the last couple of years was like this:

This flow, historically, worked out pretty well for us so that we didn't need to go through lengthy re-review process every time we delivered a hotfix to our code.

Frankly, it makes sense to expect it to work because our server sends "must-revalidate" headers with all JS/HTML/JSON assets and we assumed Excel would respect them.

Lately, however, (in the past few months) we have started experiencing crazy issues every time we pushed out a new version of add-in to production without resubmitting the manifest.

I have a gut feeling that all those are linked to the fact that the flow we're using has some hidden pitfalls and we would like to understand what they are, and why.

We definitely would like to avoid the process of having to maintain several parallel versions of assets and go through store resubmission on every small change we make, but if this is unavoidable, I want to at least confirm this.

Velixo internal tracking item

JHJ-MS commented 2 years ago

Hi @wh1t3cAt1k ,

1. The following steps you mentioned to update a Custom Functions Addin are exactly what we suggested.

If just the JS / custom function metadata JSON were changed, we deliver the update directly to our production server where the add-in is served from. If manifest was changed in any way, we resubmit the manifest through the partner centre.

2. Regarding to these two questions:

Cached by the host - do they respect the Cache-Control headers sent by our server? If not, why not?.. Are there any additional pitfalls?

Yes, Excel would respect them.

How the update process works when a new version is available in the store?

It will fetch the latest JSON metadata file and Js file according to the latest manifest.

3. Regarding this issue:

Lately, however, (in the past few months) we have started experiencing crazy issues every time we pushed out a new version of add-in to production without resubmitting the manifest.

Infinite update loops in Excel Online, where one would see a prompt like "A newer version of the add-in is available" - one would click "Update" and nothing would happen (the add-in then becomes completely unusable because of repeated prompts popping up, unless removed and reinstalled from the office store)

Only the manifest updated will cause the "A newer version of the add-in is available" prompt to pop up, the other JS/JSON file change will not cause it. Guess it may be related to your early manifest change. It may be a bug and we are connecting our partner team to take a look at this issue.

Thanks.

wh1t3cAt1k commented 2 years ago

@JHJ-MS thanks for the clarifications. This just in, however, could you guys check just in case?:

We have updated the bundle (and the metadata JSON file) because MS provides a new TEXTSPLIT function, and we had another one with the same name, but different signature.

So, we renamed the function to TEXTSPLIT -> VXTEXTSPLIT and updated the JSON + JS bundle without resubmitting the manifest.

The report states that they opened Excel for Mac, where the store version of the add-in was installed, but the change did not come through. Clearing the cache through the personality menu did not help either:

image

As per your message above, clearing the cache and reopening of Excel should not be required, especially since after cache clearing the users lose their add-in settings stored in the local storage :(

We do send all proper cache-control headers for our custom functions metadata JSON and the JS bundle, asking the client to always revalidate.

Maybe it's a Mac-specific issue? But I'd like to get to the bottom of it as it seems linked to other weird post-update problems for us...

JHJ-MS commented 2 years ago

Hi @wh1t3cAt1k, 1.regarding your issue On Excel Online

Infinite update loops in Excel Online, where one would see a prompt like "A newer version of the add-in is available" - one would click "Update" and nothing would happen (the add-in then becomes completely unusable because of repeated prompts popping up

If you can still repro this issue could you please provide a Fiddler trace for us?

2.Regarding the issue you mentioned in your comment, Could I confirm with you whether your addin is using Shared Runtime? If your add-in is using Shared Runtime, would you please first go through the comments of this issue to see whether it is helpful?

wynntee commented 2 years ago

@JHJ-MS , in my experience Excel does NOT respect cache-control for the JSON metadata file. It respects for the other files (HTML and JS) but definitely not JSON.

In fact, I raised this suggestion at Microsoft 365 Developer Platform and some users agree. It needs more attention from MS.

wh1t3cAt1k commented 2 years ago

@JHJ-MS correct, we are using the shared runtime.

However the issue you linked indeed makes an impression that Excel does not respect cache control headers for metadata, as @wynntee mentioned in his last message.

Moreover, from what I understand, the client in the linked issue was eventually forced down the manifest resubmission route on every update. This is very inconvenient to say the least as it will require people

a) to version and store on the servers multiple versions of metadata JSON and potentially other assets b) to resubmit the manifest on every minor change.

I am sure you will agree with me: what's even the point of caching the metadata JSON file? šŸ˜Š Compared to the HTML/JS bundle itself (which does respect cache control headers!), the metadata JSON is a very small asset, with a must-revalidate header it takes a trivial time to check and redownload if needed!

Since above we established that it should not be the case:

image

and since the change on Excel side is probably easy - I suggest to treat this as a bug and fix because it's severely detrimental to our user experience... I.e. once we deliver an update, we would like people to see the result right away after relaunching Excel, and not after some non-deterministic amount of time, in which period people will see inconsistent results and custom functions.

Right now we have to ask them to uninstall and reinstall the add-in from the store, which means they lose all settings from the add-in local storage - and we're getting unwarranted support request volume, too :(

Please help us out with this guys!

JHJ-MS commented 2 years ago

Hi @madhavagrawal17 , from your comment in this link and @wh1t3cAt1k and @wynntee 's comment above, it seems that When using Shared Runtime, Excel does NOT respect "must-revalidate" headers for the JSON metadata file. Could you provide more info on why we do NOT respect cache-control for the JSON metadata file like other files (HTML and JS)? Is it a bug?

Hi @wh1t3cAt1k ,for the repeated prompts popping up the issue on WAC, if you can still repro this issue could you please provide a Fiddler trace for us? Or would you please provide your add-in name so that we can repro it on our machine?

wh1t3cAt1k commented 2 years ago

@JHJ-MS we only reproduced infinite update loops after a JS update having been delivered to the bundle without resubmitting the manifest. Because there are real people using the product and this is our company reputation, we are reluctant to live-test this again.

As a side note, it would be immensely helpful if we could have some sort of alpha-track in AppSource store for our internal team to test / troubleshoot such update issues before rolling this out to users. It seems that sideloading mechanism works quite differently from the store installations in that regard.

madhavagrawal17 commented 2 years ago

@JHJ-MS , there is no special implementation for Shared vs Non-shared Runtime for caching. The only difference is for Non-Shared Runtime we download and cache both JSON metadata file and JS file but for Shared Runtime we just download and cache JSON metadata file as the JS file comes from the HTML page. For caching we have an expiration time of 24 hours. After 24 hours we make a request and do respect the Cache headers and update the new JSON metadata. It doesn't requirement re-installation of the add-in, but it will only pick up the new change on the next restart of Excel. As per the issue above changing an existing Function name should be a breaking change for the user and should be handled gracefully by either submitting the new manifest version so that it will force update the manifest file along with the resources (JSON Metadata) or by creating the new function and gradually deprecating the older function and not renaming it.

wh1t3cAt1k commented 2 years ago

@madhavagrawal17 we're not necessarily talking about a change to the existing function name, it applies to adding a new function in equal measure.

It really feels wasteful to force the whole re-review process upon adding any minor function to the JSON just because the software does not respect the cache-control header (and quite unexpectedly at that).

It's not entirely clear why JSON is "special" in that regard - regardless of whether we update manifest or not, the user doesn't really have a choice not to update the add-in, because otherwise Excel will just keep prompting them for an update until they do, from our experience.

Therefore, I suggest to change the implementation to fully respect the HTML headers that our server sends regarding the HTML/JS/JSON assets.

wynntee commented 2 years ago

@madhavagrawal17 , I agree with @wh1t3cAt1k . In addition to his scenario of adding new functions, I also update the JSON file to: 1) Fix any typos in the description/documentation. 2) Add new parameters to a function in a backward-compatible manner.

Scenario (1) is less urgent, but important nonetheless. In scenario (2), I update the HTML/JS files to match the new parameters. My HTML file includes text informing users about the changes and documentation on how to use the new parameters. My JS file also expects a different number of parameters now, so the last parameter (which is the invocation object) is now in a different position.

So if Excel respects the Cache header for my HTML/JS file, but NOT the JSON file... my users will learn about the extra parameter and want to try it out, but they will be disappointed because the old JSON file is still being cached.

Please also note that the AppSource submission FAQ states that "If you make updates to the web service for your app, you don't have to resubmit it." When Add-in developers update the "web service", they update the HTML/JS/JSON bundle all together.

TLDR; it doesn't make sense to respect the Cache header for only HTML/JS but not for JSON until after a 24-hour delay (which I've never observed in experience).

sshukurov commented 2 years ago

As a software engineer I have to say it is greatly inconvenient to have such a distinguishing experience with development versus production add-ins, which is worsened further by the inexistence of a channel / path for beta-testing in a production-like environment.

The issue raised by @wh1t3cAt1k is putting a significant strain on our development and support teams, while the workaround we currently have with uninstalling / reinstalling the add-in is a highly unwanted inconvenience for the end users.

Looking forward for the folks at MS to pay a proper attention to the importance of this and take necessary measure(s).

madhavagrawal17 commented 2 years ago

@wh1t3cAt1k, for repeated prompts popping up issue on WAC I think @JHJ-MS will help you. For the JS/JSON update and caching issue I still want to understand the issue correctly. As per us if you define a new Custom Function in the JSON metadata file then until the new JSON metadata is fetched the user won't be able to use the Custom Function (currently we have 24 hours caching expiration time on win32 and mac), even if the JS function on the HTML side is up to date it won't get invoked until the new Custom Function is installed. So, I would like to understand the developer/user scenario to understand this issue properly.

wh1t3cAt1k commented 2 years ago

@madhavagrawal17 for the best pain point scenario description, I would refer to the @wynntee 's message above:

In addition to his scenario of adding new functions, I also update the JSON file to:

  • Fix any typos in the description/documentation.
  • Add new parameters to a function in a backward-compatible manner. Scenario (1) is less urgent, but important nonetheless. In scenario (2), I update the HTML/JS files to match the new parameters. My HTML file includes text informing users about the changes and documentation on how to use the new parameters. My JS file also expects a different number of parameters now, so the last parameter (which is the invocation object) is now in a different position.

To be clear, we definitely wouldn't want this to move in the backwards direction (towards arbitrary 24h caching of everything), but rather, towards respecting the cache-control headers for every kind of asset.

JHJ-MS commented 2 years ago

Hi @wh1t3cAt1k, for the repeated prompts popping up the issue on WAC, totally understand the repro will affect your add-in users. Sorry that due to it can not repro now, we would not continue to follow this issue then. Thanks.

wh1t3cAt1k commented 2 years ago

@JHJ-MS regarding those repeated prompts, definitely would be interested in seeing some sort of alpha / beta channel mechanism in AppSource so that these rough edges can be smoothed out internally first without impacting the clients right away.

Who is the best person to talk to about that?

====

Otherwise, @madhavagrawal17 @zlatko-michailov as I understand we're moving forward with JSON cache header compliance? Thanks for the investigation and your help in this matter.

wh1t3cAt1k commented 2 years ago

@JHJ-MS @madhavagrawal17 @zlatko-michailov just pinging you for the update regarding custom function metadata JSON cache header compliance.

wh1t3cAt1k commented 1 year ago

@JinghuiMS @madhavagrawal17 @zlatko-michailov I would like to scope this issue to the last problem remaining on our side: the fact that Microsoft Excel still seems to unnecessarily cache the custom function metadata JSON regardless of the no-cache header sent by the server.

This does not allow us to update function signatures in a non-breaking way on the fly.

Obviously I don't know the architectural details, but from the outside it doesn't seem like a very complex bug to fix: I would be really grateful if this is fixed at last :)

wh1t3cAt1k commented 1 year ago

I also believe that a "product question" label is incorrect for this item, this is a bug where the software (Excel) behaves in a non-compliant way to the browser standards...

@4tti not sure if this is something your team experienced, but tagging you here just in case you have any input :)

zlatko-michailov commented 1 year ago

@wh1t3cAt1k I don't think the assumption that a function signature could be safely changed is a good one. That is because your users may have already saved workbooks with formulas that use the old signature. Changing that signature could break those formulas even if the metadata and the JS match.

To be safe, consider giving the new signatures new names.

P.S. As far as caching of metadata goes, that cannot change. Function signatures must get registered on Excel boot , so the user can see auto-complete in the formula bar right away. This is a fundamental design decision.

wh1t3cAt1k commented 1 year ago

@zlatko-michailov I see. However, changing parameter names/documentation or adding parameters at the end is not really breaking, which we argued above in this thread. Perhaps most importantly, adding new functions which should become visible immediately!

I regret to hear it's a fundamental design decision given the dynamicity and control we have come to respect and expect from the COM/XLL add-ins we're trying to migrate from.

I believe this should be left in control of the add-in author at least, in that the cache-control header should be respected as per standards. Sure, by default cache away, but if he wants to "risk breaking things" by doing an OTA update he should be able to, if he explicitly says he knows what he's doing: "no-cache"

Regarding Intellisense ā€” just like it's not present at all when the add-in is not yet installed, that's not at its core different from a situation where IntelliSense would simply "update" after the new metadata is fetched - so no need for this perk to go away, in my opinion.

wynntee commented 1 year ago

@zlatko-michailov , the existing Add-in process/architecture contradicts your reasoning, because it is only the JSON metadata that is cached. The JS function file that accompanies the JSON is respected.

So if your reasoning were true, then Microsoft should permanently cache BOTH the JSON and JS function files.

The problem is Microsoft is caching one (JSON) but not the other (JS). So which is it going to be?

wynntee commented 1 year ago

In addition to my previous comment, my use case involves adding new functions, not changing existing ones. Right now my older users would not be able to see newer functions, because the JSON is cached... but their Excel would have downloaded the newer JS file... hence the contradiction.

4tti commented 1 year ago

Thanks @wh1t3cAt1k for tagging. Yes, this caching issue is tedious and bothersome even for us. When we release new version with some changes that need JSON file to be replaced, we request customer to do cache clean -> the product then looks really amateurish.

@zlatko-michailov I totally agree that this sounds like a bug rather than "design decision" because of the points mentioned above by @wh1t3cAt1k and @wynntee. Moreover, how does it correspond with ScriptLab's capability to dynamically register custom functions? Microsoft has implemented CustomFunctionsManager where you can provide the 'JSON definition' and have CFs added dynamically... maybe it is not officially supported, but it works. Would be so amazing to have it officially supported, then whoever needs 'dynamic' approach would just use it without complaints about caching.

zlatko-michailov commented 1 year ago

@4tti @wynntee @wh1t3cAt1k Thank you for your passionate dedication to custom functions!

Let me provide some history. Hopefully, it will clarify why things are the way they are.

Originally, CFs were only executed in a JavaScript/ReactNative runtime. That is still the default option. For that, both a JSON metadata and a pure JS file need to be downloaded and cached. They get downloaded at the same cadence. If you use this option, your users will see the latest additions with some delay, but at least the JSON and the JS should be in sync. While the ReactNative runtime is faster and leaner, it is just a JS engine. It is not a full-featured browser - some browser APIs are missing.

For that reason, "shared runtime" was added later. It is a full-featured browser control. You can count on all standard browser APIs and behaviors. However, what happens inside it is out of our control - we only navigate to the HTML URL you've provided. When you opt in for "shared runtime", you gain features at the expense of time and memory plus the risk that browser control and Excel may behave differently, which is what's happening.

A general workaround for users to get the latest resources is the Refresh button (the little circular arrow in the upper-right corner of the Insert Add-ins dialog.) When it is clicked, manifests of installed add-ins are redownloaded along with referenced resources - JSON metadata, JS, etc. This may be subject to some throttling. I apologize I am not very familiar with the throttling policies. As far as I know, this is the recommended option for users to get the latest resources even if there hasn't been a manifest update in the store.

There is a strong requirement that once an add-in with CFs is installed, when a user starts Excel and opens a workbook, IntelliSense for those installed CFs should be ready immediately. That is the reason for the metadata caching. And that is why we cannot change it.

The suggestion to use the CustomFunctionsManager, like ScriptLab does, is interesting. However, it requires the add-in to be already activated (so it can make those calls), which also requires the whole add-in framework to be loaded. That is something that cannot be done automatically on Excel boot (or on workbook open) for performance reasons. The user will have to manually trigger it, e.g. by clicking on a ribbon button that your add-in provides. However, that would be kind of equivalent to asking the user to click on the Refresh button of the Insert Add-ins dialog.

Anyway, if you are interested in the latter approach, I can ask around if it is possible to expose the CustomFunctionsManager API for all add-ins to use. You may be able to figure out an experience that works for your users.

wh1t3cAt1k commented 1 year ago

@zlatko-michailov thanks for the background!

I understand it's a strong requirement for IntelliSense to be available immediately.

Perhaps some sort of hot swap is possible later on though, even if only for shared runtime add-ins - as soon as the latest metadata is downloaded.

BTW, would be curious how many add-ins, percentage-wise, are not yet shared runtime, because it is clearly a superior option...

4tti commented 1 year ago

@zlatko-michailov thanks a lot for clarifications.

One more curious question - would it be possible to combine both approaches, i.e. have JSON/JS caching for booting/workbook open but once done and ready, after startup the addin would have capability register "updated" version of CFs through CustomFunctionsManager.

If you are interested in the background from our side, please check #2878 where I have explained why we would be so extremely happy about dynamic approach for CFs (but also ribbon/shortcuts, etc.). Again thanks a lot.

parched commented 1 year ago

Thanks @zlatko-michailov for the clarifications, that helps explain some of the behaviour I've seen.

We've been using the undocumented CustomFunctionsManager for a while now, so I'll describe our workflow.

An issue we have with this workflow is that

I've thought about preloading the functions JSON referenced from the manifest with the current definitions from prod, as this will be mostly correct all API environments. However, there's another reason I'd like to load it from the shared runtime. Currently, our API and add-in are only accessible on the corporate network, but I would like to make it public and control access using AAD. Our custom functions metadata also includes information that I would like to restrict to authorized users too. I don't know of another of doing this except for requiring sign in via the taskpane first.

4tti commented 1 year ago

@parched do I understand correctly that the combined approach (i.e. using manifest and eventually dynamic (re)registration) works fine? That would be awesome. Having manifest file for "booting" and then update it after opening task pane.

An issue we have with this workflow is that

  • Some (or maybe all) of the time, before the addin is opened, existing functions in the workbook show up as _xlfn....

Actually that might be related to this: #2568 couldn't it?

parched commented 1 year ago

@4tti Yes, from my experience, that works, However, there may be some edge cases I haven't come across. I did find, for it to work reliably, the taskpane had to be loaded, and Office.onReady wasn't sufficient. i.e,. we do this

Office.onReady(async () => {
  if (!Office.context.requirements.isSetSupported('CustomFunctions', '1.3')) {
    console.log('Disabling custom functions because they are not supported on this platform');
    return;
  }

  await Office.addin.showAsTaskpane();

  // For some reason, we can't register custom functions until the add-in is shown as a taskpane
  console.log('Checking custom function status');
  const status = await (Excel as any).CustomFunctionManager.getStatus();
  console.log(`Custom function status: ${JSON.stringify(status)}`);

  await registerCustomFunctions();
});

Actually that might be related to this: https://github.com/OfficeDev/office-js/issues/2568 couldn't it?

Yes, we do use COM add-ins, and it looks like they've recently made a fix, so I'll have to retest.

4tti commented 1 year ago

Thanks @parched !

@zlatko-michailov this would be an options for us (see post above) - just please make the CustomFunctionManager publicly available and supported! It would really help.

zlatko-michailov commented 1 year ago

Folks, thank you again for using custom functions and being so passionate about making it better. While I played a key role in the original design and implementation, my team no longer owns custom functions.

@yaweizhu-henson is from the team that currently owns it. He will take over this issue, He and his team will evaluate what and when can be done.

If there are any design/historic questions, feel free to reference me.

4tti commented 1 year ago

Thanks a lot @zlatko-michailov @yaweizhu-henson can you please update us ASAP?

4tti commented 1 year ago

Any update @yaweizhu-henson ?

wh1t3cAt1k commented 1 year ago

@adrianwu8516 (cc @4tti) I wanted to follow up on this issue with a rather ugly (and unexpected) bug resulting from this behaviour.

Today we added a new optional parameter to one of our functions and deployed an update to both the metadata and the JS bundle.

This was supposed to be a completely non-breaking change, unfortunately we immediately received a support request from the client.

The reason?

The only thing that worked was a complete uninstall and reinstall of the add-in, which I believe should never be normal course of action.

One more time I take an opportunity to point out that this is the sort of experience that makes the UX with modern add-ins so sub-par compared to the completely dynamic COM add-ins of the past...

Also I'd like to emphasize:

wh1t3cAt1k commented 9 months ago

@adrianwu8516 @JinghuiMS @JHJ-MS @yaweizhu-henson

Unfortunately our product suffered a public regression again due to the same problem.

As usual, we added a new optional repeating argument to two of our functions, and deployed an update to both the metadata and the JS bundle.

However, our JS code did not expect that instead of a new repeating argument (which normally always has a shape of an array), Excel would pass us a null (because metadata was still old).

It is extremely dangerous and irksome that our JS code should actively defend against being out of sync with the JSON metadata that can be cached by Excel!

This unconditional caching is so evil! I beg Microsoft to remedy this problem and get rid of any metadata caching in the product - or, at least, please respect the no-cache / must-revalidate headers and check if the actual metadata JSON has changed.

ashern commented 6 months ago

Is there any update on this? @yaweizhu-henson

KishanVaishnani commented 4 months ago

Is there any update on this?

wynntee commented 4 months ago

I doubt there will be any good news coming out of this. I filed the same issue (#3316) in Jan 2022, and after some acknowledgement and chatter, it got closed as Not Planned in Apr 2024.

wh1t3cAt1k commented 3 months ago

Obviously I can't speak on behalf of MSFT but I'd like to believe their team acknowledges this design issue, and IMO it's important that more people beside us speak up!

@adrianwu8516 I don't know if it's the host application team or the Office.js team that would take care of this, but please review/ping if you can, that there are other voices about the same issue.

parched commented 3 months ago

FWIW, to alleviate a few of these issues with custom functions, specifically:

I'm planning to move our add-in to have only 1 custom function called EVALUATE (or similar) which has repeating parameters like

function EVALUATE(fn_name: string, args: any[][][]): any[][] {}

which is called in a cell like =EVALUATE("ADD", 1, 2).

This way we can handle all everything ourselves, like updates to functions, argument validation, etc.

It does loose any intellisense and will require us to create our own function help in the taskpane, but I think the benefits are worth it.

wh1t3cAt1k commented 3 months ago

@parched IMO this is just equivalent of you giving up and choosing (a very UX-poor) workaround...

In my experience, Microsoft has recently been responsive to the ISV voice and I think you should not give up just yet - we are seeing many of the issues we've reported over the last few years finally being fixed, not very fast, but steady.

Don't give up, present and argue your case!

sshukurov commented 3 months ago

Dear OfficeJS team (@adrianwu8516, @JinghuiMS, @JHJ-MS),

The present issue is causing so much inconvenience that folks like @parched are attempting to overcome this by "reinventing the wheel", adding their own infrastructure on top of OfficeJS, essentially mirroring part of OfficeJS functionality. I start feeling unwell when I think about maintenance and other costs of that approach.

The paint points raised in the present and related issues could be treated as separate actionable items and approached in the order of importance. The biggest pain here is the caching of the metadata JSON, and from my perspective it feels like it shouldn't be complex to e.g. expose an API in OfficeJS to reset the cache / reload the metadata.

Can you offer any update?

wh1t3cAt1k commented 2 months ago

@sshukurov the API could be a workaround / additional safety measure, but I believe it would be much better if HTTP no-cache headers were respected in the first place...

rgold5950 commented 3 weeks ago

Also experiencing issues with Custom Function registration breaking all of a sudden.