microsoft / navcontainerhelper

Official Microsoft repository for BcContainerHelper, a PowerShell module, which makes it easier to work with Business Central Containers on Docker.
MIT License

Dependent Assembly Microsoft.Windows.Common-Controls.Resources,language="*",processorArchitecture="x86",publicKeyToken="6595b64144ccf1df",type="win32",version="6.0.17763.5936" could not be found. #3588

Closed DanielGoehler closed 2 months ago

DanielGoehler commented 2 months ago

Describe the issue

Did something change in the Docker image creation process for Windows Server 2019 between 2024-06-17 and 2024-06-27? In addition to issue #3585, BC15 to BC18 images created at the beginning of last month worked fine. Now, although the creation process produces no errors, the BC15 to BC18 Docker containers do not become healthy and cannot be accessed through Traefik. When I check the event log, I see this error 10-20 times:

Activation context generation failed for "C:\Windows\SysWOW64\Msi.dll".
Dependent Assembly Microsoft.Windows.Common-Controls.Resources,language="*",processorArchitecture="x86",publicKeyToken="6595b64144ccf1df",type="win32",version="6.0.17763.5936" could not be found.

We rolled out BCContainerHelper 6.0.18 on 2024-06-03. The last working BC16 container was created on 2024-06-17. We noticed the first problems 4 days ago, on 2024-06-27.

Scripts used to create container and cause the issue

New-BcContainer `
  -accept_eula `
  -containername test-bc18 `
  -artifactUrl "https://bcartifacts-exdbf9fwegejdqak.b02.azurefd.net/onprem/18.18.46920.0/de" `
  -PublicDnsName dockerhub.somecompany.com `
  -useTraefik `
  -Credential $credential

Full output of scripts

BcContainerHelper version 6.0.18
Setting usePwshForBc24 = False
BC.HelperFunctions emits usage statistics telemetry to Microsoft
Running on Windows, PowerShell 5.1.17763.5933
BcContainerHelper is version 6.0.18
BcContainerHelper is running as administrator
HyperV is Enabled
Host is Microsoft Windows Server 2019 Standard - 10.0.17763.5936
UsePsSession is True
UsePwshForBc24 is False
UseWinRmSession is allow
UseSslForWinRmSession is True
Docker Client Version is 19.03.5
Docker Server Version is 19.03.5
Removing Desktop shortcuts
Fetching all docker images
Fetching all docker volumes
Enabling SSL as otherwise all clients will see mixed HTTP / HTTPS request, which will cause problems e.g. on the mobile and modern windows clients
Using image mcr.microsoft.com/businesscentral:ltsc2019
PublicDnsName is dockerhub.somecompany.com
Creating Container test-bc18
Style: onprem
Multitenant: No
Version: 18.18.46920.0
Platform: 18.0.46905.0
Generic Tag: 1.0.2.38
Container OS Version: 10.0.17763.5936 (ltsc2019)
Host OS Version: 10.0.17763.5936 (ltsc2019)
Using process isolation
Using locale de-DE
Adding special CheckHealth.ps1 to enable Traefik support
Disabling the standard eventlog dump to container log every 2 seconds (use -dumpEventLog to enable)
Patching container to install ASP.NET Core 1.1
Downloading C:\ProgramData\BcContainerHelper\Extensions\test-bc18\my\dotnetcore.exe
Additional Parameters:
--expose 5986
-e webserverinstance=test-bc18
-e publicdnsname=dockerhub.somecompany.com
-l "traefik.protocol=https"
-l "traefik.web.frontend.rule=PathPrefix:/test-bc18"
-l "traefik.web.port=443"
-l "traefik.soap.frontend.rule=PathPrefix:/test-bc18soap;ReplacePathRegex: ^/test-bc18soap(.*) /BC$1"
-l "traefik.soap.port=7047"
-l "traefik.rest.frontend.rule=PathPrefix:/test-bc18rest;ReplacePathRegex: ^/test-bc18rest(.*) /BC$1"
-l "traefik.rest.port=7048"
-l "traefik.dev.frontend.rule=PathPrefix:/test-bc18dev;ReplacePathRegex: ^/test-bc18dev(.*) /BC$1"
-l "traefik.dev.port=7049"
-l "traefik.snap.frontend.rule=PathPrefix:/test-bc18snap;ReplacePathRegex: ^/test-bc18snap(.*) /BC$1"
-l "traefik.snap.port=7083"
-l "traefik.dl.frontend.rule=PathPrefixStrip:/test-bc18dl"
-l "traefik.dl.port=8080"
-l "traefik.dl.protocol=http"
-l "traefik.enable=true"
-l "traefik.frontend.entryPoints=https"
--env customNavSettings=PublicODataBaseUrl=https://dockerhub.somecompany.com/test-bc18rest/odata,PublicSOAPBaseUrl=https://dockerhub.somecompany.com/test-bc18soap/ws,PublicWebBaseUrl=https://dockerhub.somecompany.com/test-bc18
Files in C:\ProgramData\BcContainerHelper\Extensions\test-bc18\my:
- AdditionalOutput.ps1
- AdditionalSetup.ps1
- CheckHealth.ps1
- dotnetcore.exe
- HelperFunctions.ps1
- MainLoop.ps1
- SetupVariables.ps1
- updatecontainerhosts.ps1
Creating container test-bc18 from image mcr.microsoft.com/businesscentral:ltsc2019
6d72644195f4933a9949826075da33dac6ff4e57b2ecf8a8301b426b9f73fcf4
Waiting for container test-bc18 to be ready
Installing ASP.NET Core 1.1
Using artifactUrl https://bcartifacts-exdbf9fwegejdqak.b02.azurefd.net/onprem/18.18.46920.0/de
Using installer from C:\Run\150-new
Installing Business Central: multitenant=False, installOnly=False, filesOnly=False, includeTestToolkit=False, includeTestLibrariesOnly=False, includeTestFrameworkOnly=False, includePerformanceToolkit=False, appArtifactPath=c:\dl\onprem\18.18.46920.0\de, platformArtifactPath=c:\dl\onprem\18.18.46920.0\platform, databasePath=c:\dl\onprem\18.18.46920.0\de\database\Demo Database NAV (18-0).bak, licenseFilePath=c:\dl\onprem\18.18.46920.0\de\database\Cronus.flf, rebootContainer=True
Installing ASP.NET Core 1.1
Installing from artifacts
Starting Local SQL Server
Starting Internet Information Server
Copying Service Tier Files
c:\dl\onprem\18.18.46920.0\platform\ServiceTier\Program Files
c:\dl\onprem\18.18.46920.0\platform\ServiceTier\System64Folder
Copying PowerShell Scripts
c:\dl\onprem\18.18.46920.0\platform\WindowsPowerShellScripts\Cloud\NAVAdministration
c:\dl\onprem\18.18.46920.0\platform\WindowsPowerShellScripts\WebSearch
Copying Web Client Files
c:\dl\onprem\18.18.46920.0\platform\WebClient\Microsoft Dynamics NAV
Copying Client Files
c:\dl\onprem\18.18.46920.0\platform\LegacyDlls\program files\Microsoft Dynamics NAV
c:\dl\onprem\18.18.46920.0\platform\LegacyDlls\program files\Microsoft Dynamics NAV
c:\dl\onprem\18.18.46920.0\platform\LegacyDlls\systemFolder
Copying ModernDev Files
c:\dl\onprem\18.18.46920.0\platform
c:\dl\onprem\18.18.46920.0\platform\ModernDev\program files\Microsoft Dynamics NAV
Copying additional files
Copying ConfigurationPackages
C:\dl\onprem\18.18.46920.0\de\ConfigurationPackages
Copying Test Assemblies
C:\dl\onprem\18.18.46920.0\platform\Test Assemblies
Copying Applications
C:\dl\onprem\18.18.46920.0\de\Applications
Copying dependencies
Copying ReportBuilder
Importing PowerShell Modules
Restoring CRONUS Demo Database
Setting CompatibilityLevel for CRONUS on localhost\SQLEXPRESS
Modifying Business Central Service Tier Config File for Docker
Creating Business Central Service Tier
Installing SIP crypto provider: 'C:\Windows\System32\NavSip.dll'
Starting Business Central Service Tier
Importing license file
Stopping Business Central Service Tier
Installation took 145 seconds
Installation complete
Initializing...
Installing ASP.NET Core 1.1
Setting host.containerhelper.internal to 172.29.224.1 in container hosts file
Starting Container
Hostname is test-bc18
PublicDnsName is dockerhub.somecompany.com
Using Windows Authentication
Creating Self Signed Certificate
Self Signed Certificate Thumbprint 70F5BF13FC337542E58BF41F0C0180C06C3AA04F
DNS identity dockerhub.somecompany.com
Modifying Service Tier Config File with Instance Specific Settings
Modifying Service Tier Config File with settings from environment variable
Setting PublicODataBaseUrl to https://dockerhub.somecompany.com/test-bc18rest/odata
Setting PublicSOAPBaseUrl to https://dockerhub.somecompany.com/test-bc18soap/ws
Setting PublicWebBaseUrl to https://dockerhub.somecompany.com/test-bc18
Starting Service Tier
CertificateThumprint 70F5BF13FC337542E58BF41F0C0180C06C3AA04F
Registering event sources
Creating DotNetCore Web Server Instance
Using application pool name: test-bc18
Using default container name: NavWebApplicationContainer
Copy files to WWW root C:\inetpub\wwwroot\test-bc18
Create the application pool test-bc18
Create website: NavWebApplicationContainer with SSL
Update configuration: navsettings.json
Done Configuring Web Client
Creating http download site
Creating Windows user admin
Setting SA Password and enabling SA
Creating SUPER user
Enable PSRemoting and setup user for winrm
Creating self-signed certificate for winrm
Container IP Address: 172.29.237.22
Container Hostname  : test-bc18
Container Dns Name  : dockerhub.somecompany.com
Web Client          : https://dockerhub.somecompany.com/test-bc18/
Dev. Server         : https://dockerhub.somecompany.com
Dev. ServerInstance : BC

Files:
http://dockerhub.somecompany.com:8080/ALLanguage.vsix
http://dockerhub.somecompany.com:8080/certificate.cer

Container Total Physical Memory is 511.9Gb
Container Free Physical Memory is 277.0Gb

Initialization took 74 seconds
Ready for connections!
Installing ASP.NET Core 1.1
Reading CustomSettings.config from test-bc18
Creating Desktop Shortcuts for test-bc18
Container test-bc18 successfully created
Because of Traefik, the following URLs need to be used when accessing the container from outside your Docker host:
Web Client:        https://dockerhub.somecompany.com/test-bc18
SOAP WebServices:  https://dockerhub.somecompany.com/test-bc18soap
OData WebServices: https://dockerhub.somecompany.com/test-bc18rest
Dev Service:       https://dockerhub.somecompany.com/test-bc18dev
Snapshot Service:  https://dockerhub.somecompany.com/test-bc18snap
File downloads:    https://dockerhub.somecompany.com/test-bc18dl
Health check returns False, restarting container
Installing ASP.NET Core 1.1
Removing Session test-bc18
test-bc18
Waiting for container test-bc18 to be ready

Installing ASP.NET Core 1.1
Initializing...
Installing ASP.NET Core 1.1
Setting host.containerhelper.internal to 172.29.224.1 in container hosts file
Restarting Container
PublicDnsName unchanged
Hostname is test-bc18
PublicDnsName is dockerhub.somecompany.com
Using Windows Authentication
Starting Local SQL Server
Starting Internet Information Server
Starting Service Tier
Container IP Address: 172.29.233.48
Container Hostname  : test-bc18
Container Dns Name  : dockerhub.somecompany.com
Web Client          : https://dockerhub.somecompany.com/test-bc18
Dev. Server         : https://dockerhub.somecompany.com
Dev. ServerInstance : BC

Files:
http://dockerhub.somecompany.com:8080/ALLanguage.vsix
http://dockerhub.somecompany.com:8080/certificate.cer

Container Total Physical Memory is 511.9Gb
Container Free Physical Memory is 277.6Gb

Initialization took 26 seconds
Ready for connections!
Installing ASP.NET Core 1.1

Use:
Get-BcContainerEventLog -containerName test-bc18 to retrieve a snapshot of the event log from the container
Get-BcContainerDebugInfo -containerName test-bc18 to get debug information about the container
Enter-BcContainer -containerName test-bc18 to open a PowerShell prompt inside the container
Remove-BcContainer -containerName test-bc18 to remove the container again
docker logs test-bc18 to retrieve information about URL's again
...

freddydk commented 2 months ago

Every month, new versions of Windows Server, dotnet, PowerShell, SQL etc. are installed in new images. New versions of Windows also remove old versions of things.

You could try to use an old version by specifying -useGenericImage 'mcr.microsoft.com/businesscentral:1.0.2.xx', where xx is the build that worked for you - and let me know whether that solves the issue.

freddydk commented 2 months ago

This seems very much related to #3585 - looking at the logs, it tries to install ASP.NET Core 1.1 again and again. The line that does this is:

if (!(dotnet --list-runtimes | Where-Object { $_ -like "Microsoft.NetCore.App 1.1.*" })) { Write-Host "Installing ASP.NET Core 1.1"; start-process -Wait -FilePath "c:\run\my\dotnetcore.exe" -ArgumentList /quiet }

This means that the install never succeeds - probably because of a missing dependency. My assumption is that if you run with Hyper-V isolation, this will work (it does on ltsc2022 - trying ltsc2019 now).
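To see why the install never sticks, the same check can be split into steps that also report the installer's exit code. This is only a sketch: the function names Test-NetCore11Installed and Install-NetCore11WithDiagnostics are hypothetical and not part of BcContainerHelper, and the second function is meant to be run inside the container.

```powershell
# Returns $true when the `dotnet --list-runtimes` output contains a
# Microsoft.NETCore.App 1.1.x entry (same -like pattern as the startup line).
function Test-NetCore11Installed {
    param([string[]] $RuntimeLines)
    return [bool]($RuntimeLines | Where-Object { $_ -like "Microsoft.NetCore.App 1.1.*" })
}

# Hypothetical helper: runs the installer once and prints its exit code
# instead of discarding it, then re-checks the runtime list.
function Install-NetCore11WithDiagnostics {
    param([string] $InstallerPath = "c:\run\my\dotnetcore.exe")

    if (Test-NetCore11Installed -RuntimeLines (dotnet --list-runtimes)) {
        Write-Host "ASP.NET Core 1.1 runtime already present"
        return
    }

    Write-Host "Installing ASP.NET Core 1.1"
    # -PassThru returns the process object so the exit code can be inspected
    $process = Start-Process -Wait -PassThru -FilePath $InstallerPath -ArgumentList "/quiet"
    Write-Host "Installer exit code: $($process.ExitCode)"

    if (-not (Test-NetCore11Installed -RuntimeLines (dotnet --list-runtimes))) {
        Write-Warning "Runtime 1.1 still missing after install; check the container event log"
    }
}
```

A non-zero exit code, or a zero exit code followed by the warning, would at least narrow down whether the installer itself fails or whether the runtime is not registered afterwards.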

freddydk commented 2 months ago

ltsc2019 also works with hyperv (on my host)
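For reference, a minimal sketch of the original call with Hyper-V isolation switched on. It reuses the values from the report; the 8G memory limit is only an illustrative value, and $credential is assumed to exist already.

```powershell
# Parameters from the original report, plus Hyper-V isolation.
# -isolation and -memoryLimit are New-BcContainer parameters; the 8G value
# is an example, not a recommendation.
$parameters = @{
    accept_eula   = $true
    containerName = "test-bc18"
    artifactUrl   = "https://bcartifacts-exdbf9fwegejdqak.b02.azurefd.net/onprem/18.18.46920.0/de"
    PublicDnsName = "dockerhub.somecompany.com"
    useTraefik    = $true
    isolation     = "hyperv"
    memoryLimit   = "8G"
}

# Only call the cmdlet when BcContainerHelper is loaded in this session.
if (Get-Command New-BcContainer -ErrorAction SilentlyContinue) {
    New-BcContainer @parameters -Credential $credential
}
```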

DanielGoehler commented 2 months ago

Every month, new versions of Windows Server, dotnet, PowerShell, SQL etc. are installed in new images. New versions of Windows also remove old versions of things.

Regarding Windows Server, as you pointed out, only security fixes are provided, including for the .NET Framework. The last major release with new features for the .NET Framework was version 4.8 on April 18, 2019, and .NET Framework 4.8.1 was released on August 9, 2022, to add ARM support. PowerShell does have new versions, but in many cases Windows PowerShell (5.1) is still used inside Windows Server 2019 and should remain stable. SQL Server continues to receive new versions.

The main issue seems to be the prerequisite components that are tied to the Business Central version and the DVD. In my experience, these versions were installed and everything worked fine.

You could try to use an old version by specifying -useGenericImage 'mcr.microsoft.com/businesscentral:1.0.2.xx', where xx is the build that worked for you - and let me know whether that solves the issue.

How can I view the GenericImage version? In the image section, I only see ltsc2019 and sha256 hashes.

freddydk commented 2 months ago

Run docker inspect on the container and find the Labels section - it will have a tag label - that is the one.
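A small sketch of pulling those labels out of the docker inspect output instead of reading the JSON by eye. Get-GenericImageInfo is a hypothetical helper name; it assumes the labels sit under Config.Labels, which is where docker inspect reports them for containers and images.

```powershell
# Extracts the generic image tag and OS version labels from the JSON text
# that `docker inspect` prints.
function Get-GenericImageInfo {
    param([string] $InspectJson)

    # docker inspect returns a JSON array with one element per inspected object
    $parsed = ConvertFrom-Json -InputObject $InspectJson
    $labels = @($parsed)[0].Config.Labels

    return [PSCustomObject]@{
        Tag       = $labels.tag
        OsVersion = $labels.osversion
    }
}

# Usage against a real container (requires Docker):
# Get-GenericImageInfo -InspectJson ((docker inspect test-bc18) -join "`n")
```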

The pre-requisites should still be downloaded and applied (I think).

If the pre-requisites weren't applied, HyperV wouldn't fix the problem.

I think the problem is incompatibilities between host and container - maybe the container comes with newer versions of things than the host has, and under process isolation it will try to use things from the host that aren't there.

IMO - this is where you need to search.
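As a first step in that search, the host and container OS versions can be compared directly. This is a sketch only; Compare-ContainerOsVersion is a hypothetical helper, and a build-number comparison says nothing about individual components inside the image.

```powershell
# Compares the container OS version with the host OS version and returns
# "container-newer", "container-older" or "match".
function Compare-ContainerOsVersion {
    param(
        [System.Version] $HostOsVersion,
        [System.Version] $ContainerOsVersion
    )

    if ($ContainerOsVersion -gt $HostOsVersion) { return "container-newer" }
    if ($ContainerOsVersion -lt $HostOsVersion) { return "container-older" }
    return "match"
}

# Versions as reported in the logs of this issue
Compare-ContainerOsVersion -HostOsVersion "10.0.17763.5936" -ContainerOsVersion "10.0.17763.5936"
Compare-ContainerOsVersion -HostOsVersion "10.0.17763.5936" -ContainerOsVersion "10.0.17763.5820"
```

With the values from this thread, the failing container (10.0.17763.5936) yields "match" and the previously working image (10.0.17763.5820) yields "container-older".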

DanielGoehler commented 2 months ago

When using docker inspect, the only relevant tag found is 1.0.2.24:

[..]
            "Image": "mcr.microsoft.com/businesscentral:ltsc2019",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {
                "country": "de",
                "created": "202406101110",
                "eula": "https://go.microsoft.com/fwlink/?linkid=861843",
                "maintainer": "Dynamics SMB",
                "nav": "",
                "osversion": "10.0.17763.5820",
                "platform": "16.0.35120.0",
                "tag": "1.0.2.24",
                [..]
                "version": "16.19.35126.0"
            }
[..]

However, when attempting to use this tag, the following error occurs:

Error response from daemon: manifest for mcr.microsoft.com/businesscentral:1.0.2.24 not found: manifest unknown: manifest tagged by "1.0.2.24" is not found

This error happens during the execution of the New-BcContainer command:

New-BcContainer `
  -accept_eula `
  -containername test-bc16-1-0-2-24 `
  -artifactUrl "https://bcartifacts-exdbf9fwegejdqak.b02.azurefd.net/onprem/16.19.35126.0/de" `
  -PublicDnsName dockerhub.somecompany.com `
  -useSSL `
  -useTraefik `
  -Credential $credential  `
  -useGenericImage 'mcr.microsoft.com/businesscentral:1.0.2.24'

Here is the full log of the error:

WARNING: Container name should not exceed 15 characters
BcContainerHelper is version 6.0.18
BcContainerHelper is running as administrator
HyperV is Enabled
Host is Microsoft Windows Server 2019 Standard - 10.0.17763.5936
UsePsSession is True
UsePwshForBc24 is False
UseWinRmSession is allow
UseSslForWinRmSession is True
Docker Client Version is 19.03.5
Docker Server Version is 19.03.5
Removing Desktop shortcuts
Fetching all docker images
Fetching all docker volumes
Enabling SSL as otherwise all clients will see mixed HTTP / HTTPS request, which will cause problems e.g. on the mobile and modern windows clients
Pulling image mcr.microsoft.com/businesscentral:1.0.2.24
New-BcContainer Telemetry Correlation Id: d4a652e6-24d1-4395-88b1-8801a5359237
DockerDo : Error response from daemon: manifest for mcr.microsoft.com/businesscentral:1.0.2.24 not found: manifest unknown: manifest tagged by "1.0.2.24" is not found
ExitCode: 1
Commandline: docker pull  mcr.microsoft.com/businesscentral:1.0.2.24
In C:\Program Files\WindowsPowerShell\Modules\BcContainerHelper\6.0.18\ContainerHandling\New-NavContainer.ps1:799 Zeichen:13
+             DockerDo -command pull -imageName $bestImageName | Out-Nu ...
+             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,DockerDo

freddydk commented 2 months ago

My bad - the Windows version is missing from the tag. You get all tags by:

(get-bccontainerimagetags mcr.microsoft.com/businesscentral).tags
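Since the full tags combine the Windows version and the generic tag (for example 10.0.17763.5820-1.0.2.24), the list can be narrowed to one OS build and the full tag rebuilt from the image labels. A sketch; Select-TagsForOsBuild and Get-FullGenericTag are hypothetical helper names.

```powershell
# Keeps only the tags that start with the given OS build prefix.
function Select-TagsForOsBuild {
    param([string[]] $Tags, [string] $OsBuildPrefix)
    return @($Tags | Where-Object { $_ -like "$OsBuildPrefix*" })
}

# Rebuilds the full tag from the "osversion" and "tag" labels of an image.
function Get-FullGenericTag {
    param([string] $OsVersion, [string] $GenericTag)
    return "$OsVersion-$GenericTag"
}

# Usage (requires BcContainerHelper and network access):
# $allTags = (Get-BcContainerImageTags mcr.microsoft.com/businesscentral).tags
# Select-TagsForOsBuild -Tags $allTags -OsBuildPrefix "10.0.17763."
# $fullTag = Get-FullGenericTag -OsVersion "10.0.17763.5820" -GenericTag "1.0.2.24"
# New-BcContainer ... -useGenericImage "mcr.microsoft.com/businesscentral:$fullTag"
```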

DanielGoehler commented 2 months ago

Thanks. What changed between versions 1.0.2.24 (10.0.17763.5820-1.0.2.24) and 1.0.2.30 (10.0.17763.5936-1.0.2.30)?

In the PowerShell Output for 1.0.2.30, I see multiple instances of "Installing ASP.NET Core 1.1," which did not appear in 1.0.2.24. Was ASP.NET Core 1.1 removed starting with 1.0.2.30?

We initially tried Hyper-V isolation, but with our centralized self-service Docker server, this uses too many resources. If we allocate too little RAM, like 8 GB, to each Docker container, Business Central crashes when SQL Server or the Service Tier temporarily needs more than 8 GB. If we assign more than 8 GB, like 16 GB or 24 GB, we run out of memory on the host. We found that with process isolation, SQL Server, the service tier, and the web server can temporarily use as much memory as they need and then return to normal when finished.

Out of 272 Docker containers, 52 are currently running, requiring a total of 259 GB of RAM. If I were to assign 16 GB to each container using Hyper-V isolation, the host would need 832 GB of RAM, which it doesn't have and didn't need until last week. These containers get shut down every Sunday, so these 52 Docker containers were started or created in the last two days.
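The arithmetic behind these numbers, as a short sketch using only the figures quoted above:

```powershell
# Figures from this comment
$runningContainers    = 52
$memoryPerContainerGB = 16     # fixed allocation under Hyper-V isolation
$observedUsageGB      = 259    # current total with process isolation

# Fixed allocation for every running container: 52 * 16 = 832 GB
$hyperVRequirementGB = $runningContainers * $memoryPerContainerGB

# Average actual usage per container with process isolation (roughly 5 GB)
$averagePerContainerGB = [math]::Round($observedUsageGB / $runningContainers, 1)

Write-Host "Hyper-V isolation would need $hyperVRequirementGB GB"
Write-Host "Process isolation currently averages $averagePerContainerGB GB per container"
```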

We understand that there is little interest in continuing to support the old Business Central version, but we still have customers using these versions, and BCContainerHelper is an integral part of our development process using Docker and the ALOps pipeline. My colleague created issue #3585 because the pipeline failed, which led us to discover that BC15 - BC18 also didn't work anymore. I created a separate build agent with Hyper-V for these old versions, but this doesn't solve the issue that Hyper-V isolation is not the best option for a centralized self-service Docker server. Is it possible for us to create our own images with the configuration of image version 1.0.2.24 going forward?

By the way, when I run the installer, it indicates that it is installing ASP.NET Core 1.0 (1.0.7) instead of ASP.NET Core 1.1. Could this be the issue, that the wrong version is being installed?

freddydk commented 2 months ago

We understand that there is little interest in continuing to support the old Business Central version...

This has nothing to do with not supporting old versions - as we have discovered, everything works when running HyperV, meaning that you are not looking for a wrong version being installed or errors in the generic image - you are looking for incompatibilities between host and container.

The code to download ASP.NET Core is added by BcContainerHelper when it discovers that we are running BC 15 - BC18:

    if ($version.Major -ge 15 -and $version.Major -le 18 -and $genericTag -ge [System.Version]"1.0.2.15") {
        Write-Host "Patching container to install ASP.NET Core 1.1"
        Download-File -source "https://download.microsoft.com/download/6/F/B/6FB4F9D2-699B-4A40-A674-B7FF41E0E4D2/DotNetCore.1.0.7_1.1.4-WindowsHosting.exe" -destinationFile (Join-Path $myFolder "dotnetcore.exe")
        ...

and then this code is added to the startup of the container:

if (!(dotnet --list-runtimes | Where-Object { $_ -like "Microsoft.NetCore.App 1.1.*" })) { Write-Host "Installing ASP.NET Core 1.1"; start-process -Wait -FilePath "c:\run\my\dotnetcore.exe" -ArgumentList /quiet }

It has been like that since generic image version 1.0.2.15, where this version of ASP.NET Core was removed from the image. This version was removed because of its vulnerabilities - we have strict policies for removing things that could cause security problems down the road (which, to be honest, is my biggest concern when partners tell me that they have customers running old versions of BC that depend on old, insecure libraries like ASP.NET).

But, as you see - I actually did add the support in BcContainerHelper to support these old versions.
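The condition that decides whether the patch is applied can be tried out in isolation. A sketch that mirrors the if statement quoted above; the function name Test-NeedsAspNetCorePatch is hypothetical and not part of BcContainerHelper.

```powershell
# Mirrors the quoted condition: BC major version 15 to 18 and a generic
# image tag of 1.0.2.15 or later. [System.Version] compares each part
# numerically, so 1.0.2.9 sorts before 1.0.2.15.
function Test-NeedsAspNetCorePatch {
    param(
        [System.Version] $Version,
        [System.Version] $GenericTag
    )
    return ($Version.Major -ge 15 -and $Version.Major -le 18 -and $GenericTag -ge [System.Version]"1.0.2.15")
}

# Versions that appear in this issue
Test-NeedsAspNetCorePatch -Version "18.18.46920.0" -GenericTag "1.0.2.38"
Test-NeedsAspNetCorePatch -Version "16.19.35126.0" -GenericTag "1.0.2.24"
```

Both calls return True, so by this condition the patch applies to the 1.0.2.24 image as well as to 1.0.2.38.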

The differences between 1.0.2.24 (built here https://github.com/microsoft/nav-docker/actions/runs/9447261154/job/26018703384 from this source https://github.com/microsoft/nav-docker/tree/b74838e6666ea301ddc22557b16cf41c5ccf3499) and 1.0.2.30 (built here https://github.com/microsoft/nav-docker/actions/runs/9565198316 from this source https://github.com/microsoft/nav-docker/actions/runs/9565198316) are really only the Windows version. Comparing the versions in Git (like https://github.com/microsoft/nav-docker/compare/1.0.2.24...1.0.2.30?expand=1) reveals some build technicalities (moving URLs to a file and overriding them from GitHub variables) - both versions are using the same dotnet versions.

Even 1.0.2.38 is using these dotnet versions.

You can build generic images yourself - clone the nav-docker repo and run the generic/build.ps1 script (after setting various variables) and it generates a generic image. With the above findings, I think your problem is the Windows version and not the code in the containers.

You might be in a better place with Windows Server 2022 - but I cannot guarantee that.

Again, from my perspective - if HyperV solved the issue, then I know that I am installing the right components and that in an isolated world things work. From there on, whatever problems process isolation gives you are really problems between container and host. If you can use 10.0.17763.5820-1.0.2.24, then that is fine with me.

Please do understand though that running old unsupported versions of BC and generic images comes with a security risk - which we cannot be held responsible for.

Note that I will be on vacation as of today for the next 15 days - so I will not be very responsive to issues here.

DanielGoehler commented 2 months ago

Thank you. Have a nice vacation.

For now, the generic image from last month still works. You are right; security vulnerabilities in Windows Server and .NET are the responsibility of the product teams and the CEO, not yours.

I'm not trying to be negative, but the recent software quality from Microsoft isn't what it used to be. Software quality and security, or the lack thereof, is not an uncontrollable natural force like rain. The decline in quality and apparent carelessness seems intentional by the product teams.

I will try to create monthly versions of a custom generic image for BC18 and older updates to continue using process isolation. If this doesn't work, Hyper-V isolation is the second-best option. If I find anything useful, I will share it here.

freddydk commented 2 months ago

Just want to say three things:

You can have your opinion on whether product teams are intentionally careless, I can only say that from where I am sitting - that is not true.

Furthermore, I do want to say that any customer staying behind on old versions of Business Central, running with old vulnerable versions of the operating system, dotnet components and communication protocols, might experience quality on a daily basis without knowing that they might be at risk due to weak links in the chain. This is neither a Microsoft nor a Business Central problem; this is how it is to make software in 2024, and it is one of the many reasons why we are fighting to keep customers current at all times.