Open nbevans opened 4 years ago
So an update on this.
We had the customer reboot the machine. We then tried to install the app again but this time the App Installer was producing error code of 0x8000FFFF (which is the same as https://github.com/MicrosoftDocs/msix-docs/issues/186) however it had a different error text which did not say "Catastrophic error". We then tried the Add-AppxPackage workaround and hey-presto it worked, the app was successfully installed.
There are a lot of weird bugs in the App Installer architecture...
I have now witnessed this error on my own machine and/or VM's, several times over the last week or so.
Simply rebooting the machine and trying again resolves it every single time.
I am seeing this error frequently as well, and have found it difficult to diagnose.
Any word on possible solutions for this?
Hi @nbevans, are you still seeing this issue? Is your scenario same as #186?
Closing this. Please reopen if you are still seeing the issue.
@samiranshah Sorry I missed this. Yes I saw this issue ("Unknown error") again just a couple weeks ago. A reboot of the machine fixed it.
We saw this issue today. A reboot did not fix it, unfortunately.
@samiranshah Please reopen as we are also seeing this after changing the version number from 1.0.xx.yyy to 1.1.0.0
A reboot did fix it here.
I've spent a few hours debugging this. Here's what's going wrong and how to work around it.
This looks like a (very serious!!) bug in the MSIX / Delivery Optimization subsystem of Windows. It appears Microsoft already know about it and have fixed it back in May, but haven't yet released the fix, at least not to Windows 10.
@wcheng-msft @samiranshah You guys need to document the bug and publish an official workaround if you aren't going to push out the fix ASAP because this is a nightmare to figure out, yet you clearly already know what the bug is! That's really not a satisfying discovery.
The Delivery Optimization subsystem that is used by App Installer appears to be incorrectly caching the size of the file given a URL, in memory. If an .appinstaller
file changes its size once published, then any machine that already downloaded it will try to download the exact same number of bytes as previously retrieved until the in memory cache is cleared with a reboot. This will result in one of two things happening:
@yowl This is why you see the failure once you changed the version number. The problem is not actually the change of version number but rather that the new version quad is a different size textually to the previous.
The situation may be complicated by HTTP redirects. I saw this problem myself before ever changing anything about the XML, and I think Windows cached the wrong size from the body of a 302 redirect from HTTP to HTTPS. So watch out for that if your server generates a lot of HTML when issuing redirects.
You can see that this bug is happening by dumping the Delivery Optimization subsystem logs: Get-DeliveryOptimizationLog | Set-Content c:\dosvc.log
then look at the end. If you study it very carefully you will notice that the size of the request that is being made is different to the actual size of the file on your web server.
Here's a stupid hack that seems to work:
appinstaller.xml
file with white space to a fixed size large enough that you'll never exceed it. If your XML grows, it'll eat into the white space which the XML parser will ignore. If it shrinks, add more whitespace to re-pad it to the old length. I'm generating the AppInstaller .XML myself rather than using any Microsoft tool so I can do this easily - if you aren't then you may need a custom script.(edited for clarity)
Thanks @mikehearn . Here's some powershell that I'm putting into my pipeline to do as you suggest
$size=(Get-Item "$($args[0])").length
Write-Host $size
$padSize=2000 - $size
Write-Host padding with $padSize
$padString = "".PadRight($padSize)
Add-Content "$($args[0])" $padString -NoNewline
@samiranshah Please reopen this. this has been happening for over 1.5 years. It's a shame the way MS has been dealing with this.
@mikehearn @yowl Thanks for the workaround but unfortunately it didn't work for me! I went ahead and added the powershell script to my pipeline but the issue still remains.
@mvelayati The first update may still fail as the cached xml could be, indeed would likely be, a different size, but after that you are saying it still fails?
@yowl Yeah I uninstalled whatever had installed on my PC, installed the app again with the new fixed size appintaller. After that I ran the pipeline again to upgrade the version number and still no luck!
It does seem to be holding up here. You've checked the size of the deployed file?
yeah I set the file size to be exactly 2000 bytes using the powershell script you posted above and then I FTP it over to my host.
Thanks for this fix idea. I'm going to put it in place a few weeks in advance of our next UWP rollout. That way it should give plenty of time for machine reboots and the cache to get cleared out. So by the time our update goes out - there shouldn't be any machines holding onto the unpadded cached copy.
@mvelayati Not sure what else to suggest. Have you got redirects as @mikehearn mentioned? If you ftp the file back locally and check it, it is 2000 bytes longs?
If it starts failing here, I'll report back that the script doesn't work.
@yowl Sorry I'm not familiar with http redirects and whether I've got that or not but I checked the DeliveryOptimization logs and noticed that the size of the request is indeed 2000 bytes. I also checked the file size on the host which was 2000 bytes as well.
If the DO logs show a request for the exact same size as on the server, and it still fails, you might be facing a different issue then. What is the exact error you get?
I just added 20kb of whitespace tail padding to our .appinstaller files. Expecting this to reliably fix these errors we were getting a lot:
The XML in the .appinstaller file is not valid ... (0xc00cee01)
The XML in the .appinstaller file is not valid ... (0xc00cee23)
It's genuinely crazy that such a critical piece of Windows infrastructure contains such a catastrophic caching bug and that it has gone unfixed for so long. I imagine this bug must affect hundreds of other components aside from just App Installer.
I suspect App Installer is probably the only thing seriously affected. If it affected Windows Update it'd have been caught and fixed long ago, but it's obvious that Microsoft isn't actually using AppInstaller internally.
Hypothesis: Delivery Optimization originally made the assumption that all static assets it'd be asked to download were immutable. When AI integration was added this assumption was violated but nobody noticed. The symptoms probably don't appear if you test by simply incrementing a version a few times, because until you do > 9 releases the version number won't change size, and the errors it generates can vary, and it's a caching bug so it will seem to come and go more or less at random. Overall hard to track down. However from another GitHub thread Microsoft mentioned that they already fixed it back in May and were looking at maybe backporting the fix ... clearly that never happened, which is what's really crazy here. The bug is guaranteed to eventually hit everyone using AppInstaller, yet, MS don't seem to care.
but it's obvious that Microsoft isn't actually using AppInstaller internally.
@mikehearn Priceless, and probably true 😁
If the DO logs show a request for the exact same size as on the server, and it still fails, you might be facing a different issue then. What is the exact error you get?
This is the error I get when I click on the download link to install from web.
If I launch the app from the start menu, the appinstaller checks for updates but does not find any and goes on to open the app.
@samiranshah Can you please reopen this issue? We still see this issue regularly.
@mvelayati Well, that error implies maybe you changed the size of AEther_Beta.msix? You can't do that - remember, the bug is that Delivery Optimization doesn't understand that files can change size. You should put the version number in the file name and never change an MSIX after it's been uploaded.
@mikehearn Yeah that was the problem. I changed the release pipeline so it appends the version number to the msix file name for each run and it works now! Thanks for the help!
Hi all, we are looking into this issue.
Looks like the issue has resolved itself with the following update: https://support.microsoft.com/en-us/topic/september-30-2021-kb5005611-os-builds-19041-1266-19042-1266-and-19043-1266-preview-a37f5409-f320-4175-9a66-c2682fc11c07
Can you all verify that this is the case? If not, please let us know so we can investigate.
Don't believe this is fixed as I get
Appinstaller operation failed with error code 0x80070490. Detail: Element not found.
Edition Windows 10 Pro Version 21H1 Installed on 20/10/2020 OS build 19043.1288 Experience Windows Feature Experience Pack 120.2212.3920.0
@yowl - Thanks for reporting the issue. We will need more logs to diagnose correctly. Could you please log more data by using one of the following via Feedback App if the machine is connected to the internet and uploads feedback. Please select the following Category: "Developer Platform" -> "App Deployment"
@dianmsft thanks for reviewing. I've submitted feedback through the app.
Original thread was here: https://github.com/MicrosoftDocs/msix-docs/issues/191#issuecomment-954833601
Perhaps @ranm-msft could comment?
@yowl - we couldn't find your feedback. Can you please let me know the title that you used? I want to double check a couple of places. Also to make sure that it is under: "Developer Platform" -> "App Deployment"
@dianmsft The title was "App Installer does not automatically update when appinstaller changes size" When I share a link it comes with https://aka.ms/AAesqha
I have a similar problem, submitted it via feedback hub https://aka.ms/AAetpms Hope it helps you @dianmsft
Thank you all for the help! I was able to find the feedback that @yowl submitted and we are looking at the logs now. I will provide more updates as they come.
It does seem to be holding up here. You've checked the size of the deployed file?
Unfortunately this doesn't seem to be the solution as we have this failing again even with padding the .appinstaller
Same error it seems in the event log:
Appinstaller operation failed with error code 0x80070490. Detail: Element not found.
and
AppX Deployment operation failed for package b90d235c-ba69-42d6-8f61-7ec79035b0f2_1.1.56.0_x86__g5440y1v6mj70 with error 0x80070490. The specific error text for this failure is: Appinstaller operation failed with error code 0x80070490. Detail: Element not found.
There's no updates available, this is the result of checking, status from package.CheckUpdateAvailabilityAsync
is Unknown
. And this is the same when an update is made available.
I do have WIndows Updates pending, so will install those and report back
Installing the updates and rebooting made no difference: same error. Downloading the appinstaller file locally and running Add-AppxPackage -appinstallerfile ....
even though there was no update fixed it and the next update did install automatically, after that however back to "Element not found". No change in version number size, and padding anyway to 2000 characters.
So maybe there is another workaround, Detect the failure in code using package.CheckUpdateAvailabilityAsync
, when it errors, download the appinstaller manually and process exec powershell with Add-AppxPackage.
@yowl sorry to hear that. Can you attach your appinstaller.xml file in a form that preserves it bytewise (i.e. as a raw file or inside a zip)? I'd like to take a look if you don't mind, maybe there's a way to spot what's going wrong. Also your DO logs would be helpful.
@mikehearn Sorry, but I can't post the appinstaller publically. I don't see anything at all in the DO logs when starting the application or requesting an update, i.e. nothing with a recent timestamp at all using get-deliveryoptimizationlog
. The last thing is from 10 minutes before starting the app:
TimeCreated : 17/11/2021 12:32:00
ProcessId : 4792
ThreadId : 16788
Level : 4
LevelName : Info
Message : Swarm: 95D2EE60-C9D3-45E4-876D-BAE16D758A87_16_0_14527_20276_STREAM_X64_X_NONE, announce complete, event: 3, hr: 0, server: [20.54.24.169]:443 (https://array613.prod.do.dsp.mp.microsoft.com/join/), nPeers: 0, response: {"FailureReason":null,"NextJoi
nTimeIntervalInMs":1006171,"Complete":0,"Incomplete":0,"Rediscover":false,"KVVersion":"99DCDFC4E43CE828743C620D845168FC8534879E6DCDACD58CDD25D9BBCFA430","GeoVersion":"A1B4408488D89015552A70A33860DA34F8FB14BE561530B7ADAC2F49C95A7A8B","Peers":[],"Leav
e":false}
Function : CAnnounceSequencer::OnComplete
LineNumber : 187
ErrorCode :
Appx-DeploymentServer continues to report:
Appinstaller operation failed with error code 0x80070490. Detail: Element not found.
Clicking the appinstaller link from the web page, detects the update successfully and installs successfully. I don't suppose there's any pdb files for this code?
OK, no problem. The "Element not found" error is subtly different to the error codes discussed earlier in the thread. I'm wondering if the client is receiving any XML at all. The server in this case is nothing fancy, just an Apache or IIS or something? The web server logs definitely show a normal HTTP 200 response to the client?
@mikehearn I think you are right, there is no activity in NetMon when checking for an update as though the previous update has corrupted its ability to check for the next update. add-appxpackage -appinstaller https:.....
is enough to allow package.CheckUpdateAvailabilityAsync
to succeed again.
Just had this with App Installer. Another "Catastrophic failure (0x8000ffff)"... tried everything, several reboots. In the end Add-AppxPackage
was the only way to get the app to install.
Honestly thought these issues were starting to be behind us now with the newer builds of App Installer.
@mikehearn We had also discovered the very same bug several month ago, and it seems the participants of this github discussion are the only ones who tries to adopt App Installer in production 🤣
The situation becomes even more comic in the light of resentment of Microsoft Store by the average Windows customer: it seems we have no proven and bug-free technology to distribute and autoupdate MSIX packages...
@Nipheris Welcome to the wonderful world of dealing with M$. Because of this insanely stupid bug, I ended up abandoning the appinstaller, and having my users do manual updates. It's really sad how much MS really cares about this.
participants of this github discussion are the only ones who tries to adopt App Installer in production 🤣
That's not very nice. There is no other choice than to use it.
From a quick look around I suspect E_UNEXPECTED ("Catastrophic Failure") is returned when an exception is thrown internally, i.e. an assertion failure or a crash of some sort. The only way to figure out what that might be is to find logs that contain more useful error messages. I don't know what logs might contain that if the DO logs don't.
The fact that these problems are resolvable by using PowerShell and/or specifying the AppInstaller URI to the command directly, means that the errors are not in the files themselves but rather the codepaths used for downloading them. Perhaps, the Add-AppxPackage
cmdlet downloads the AppInstaller file without using Delivery Optimization, in that case.
DO is a very complicated thing. If I were the AppInstaller team I would be getting way more defensive about this subsystem, for instance by failing over to the WinHTTP stack and doing a normal download via the more mature codepaths if anything goes wrong the update checking/downloads codepath. At any rate I hope MS is paying attention here. Just constantly asking for Feedback Hub logs doesn't cut it; once you reach that point it's just way too hard to diagnose problems especially given the patchy logging that seems to exist (c.f. "catastrophic failure"). If you look at a mature and battle tested update system like Google Omaha then it has a variety of fallback codepaths that are used if anything goes wrong, because losing the ability to keep users up to date is a major problem.
One robustness move that might be worth exploring is to take update checking responsibility into the app itself. At startup it would download the .appinstaller file and parse it. If the user has fallen too far behind then it can invoke the cmdlet via PowerShell directly. If that doesn't work, it can try downloading the MSIX and passing the local file to AppInstaller instead.
Similar to https://github.com/MicrosoftDocs/msix-docs/issues/186 - but a different error code. Also, with this error code - the workaround to use Add-AppxPackage via PowerShell does not work. So the .appinstaller cannot be installed at all on machines that exhibit this issue - we are unsure currently what to advise to the customer short of "try reinstalling your machine from scratch".
Here is a Feedback Hub link to captured diagnostic data when attempting to install via App Installer: https://aka.ms/AAa0oga