testcontainers / testcontainers-dotnet

A library to support tests with throwaway instances of Docker containers for all compatible .NET Standard versions.
https://dotnet.testcontainers.org
MIT License

[Bug]: MSSQL Container crashes on Ubuntu 24.04 (Colima) #1248

Closed jwedel closed 1 month ago

jwedel commented 1 month ago

Testcontainers version

3.9.0

Using the latest Testcontainers version?

Yes

Host OS

MacOS

Host arch

ARM

.NET version

8.0.302

Docker version

Client:
 Cloud integration: v1.0.35+desktop.10
 Version:           25.0.2
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        29cf629
 Built:             Thu Feb  1 00:18:45 2024
 OS/Arch:           darwin/arm64
 Context:           colima

Server: Docker Engine - Community
 Engine:
  Version:          27.1.1
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.12
  Git commit:       cc13f95
  Built:            Tue Jul 23 19:57:14 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.19
  GitCommit:        2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc:
  Version:          1.1.13
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Docker info

Client:
 Version:    25.0.2
 Context:    colima
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.1-desktop.4
    Path:     /Users/wej2be/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.24.3-desktop.1
    Path:     /Users/wej2be/.docker/cli-plugins/docker-compose
  debug: Get a shell into any image or container. (Docker Inc.)
    Version:  0.0.22
    Path:     /Users/wej2be/.docker/cli-plugins/docker-debug
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     /Users/wej2be/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.21
    Path:     /Users/wej2be/.docker/cli-plugins/docker-extension
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
    Version:  v1.0.4
    Path:     /Users/wej2be/.docker/cli-plugins/docker-feedback
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v1.0.0
    Path:     /Users/wej2be/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/wej2be/.docker/cli-plugins/docker-sbom
  scout: Docker Scout (Docker Inc.)
    Version:  v1.3.0
    Path:     /Users/wej2be/.docker/cli-plugins/docker-scout

Server:
 Containers: 7
  Running: 0
  Paused: 0
  Stopped: 7
 Images: 8
 Server Version: 27.1.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc version: v1.1.13-0-g58aa920
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.8.0-39-generic
 Operating System: Ubuntu 24.04 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 6
 Total Memory: 3.818GiB
 Name: colima
 ID: 1a4caaf2-9a2c-418c-9e3e-afcd48b31029
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http://192.168.5.2:3128
 HTTPS Proxy: http://192.168.5.2:3128
 No Proxy: *.bosch.com,localhost,127.0.0.1
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

What happened?

Because the Oracle XE test container is currently only available for x64 architectures, we needed to use Colima. The Oracle container works fine, but the MSSQL container does not start. When I run the image it uses directly, I see the following errors:

➜  colima docker run -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=yourStrong\!Password" -e "MSSQL_PID=Evaluation" -p 1433:1433  --name sqlpreview --hostname sqlpreview mcr.microsoft.com/mssql/server:2019-CU18-ubuntu-20.04
SQL Server 2019 will run as non-root by default.
This container is running as user mssql.
To learn more visit https://go.microsoft.com/fwlink/?linkid=2099216.
This program has encountered a fatal error and cannot continue running at Fri Aug 30 12:20:26 2024
The following diagnostic information is available:

         Reason: 0x00000001
         Signal: SIGABRT - Aborted (6)
          Stack:
                 IP               Function
                 ---------------- --------------------------------------
                 00005657df105bec <unknown>
                 00005657df105632 <unknown>
                 00005657df104c41 <unknown>
                 00007c52aa724090 killpg+0x40
                 00007c52aa72400b gsignal+0xcb
                 00007c52aa703859 abort+0x12b
                 00005657df08e2a2 <unknown>
                 00005657df11a154 <unknown>
                 00005657df14f248 <unknown>
                 00005657df14f02a <unknown>
                 00005657df09a11a <unknown>
                 00005657df099d6f <unknown>
        Process: 9 - sqlservr
         Thread: 84 (application thread 0x140)
    Instance Id: 5ac0f825-a6d2-4c1a-a411-7c98228dcb7d
       Crash Id: 05a4bb87-7b7f-4f24-be19-0b4a95194baf
    Build stamp: f708684a2cfcb51177273c54f975ed8c62029cbc89aa962bf7d6b956f01a0c27
   Distribution: Ubuntu 20.04.5 LTS
     Processors: 6
   Total Memory: 4100014080 bytes
      Timestamp: Fri Aug 30 12:20:26 2024
     Last errno: 2
Last errno text: No such file or directory
Capturing a dump of 9
Successfully captured dump: /var/opt/mssql/log/core.sqlservr.8_30_2024_12_20_27.9
Executing: /opt/mssql/bin/handle-crash.sh with parameters
     handle-crash.sh
     /opt/mssql/bin/sqlservr
     9
     /opt/mssql/bin
     /var/opt/mssql/log/

     5ac0f825-a6d2-4c1a-a411-7c98228dcb7d
     05a4bb87-7b7f-4f24-be19-0b4a95194baf

     /var/opt/mssql/log/core.sqlservr.8_30_2024_12_20_27.9

Ubuntu 20.04.5 LTS
Capturing core dump and information to /var/opt/mssql/log...

Now, when I use mcr.microsoft.com/mssql/server:latest, the container starts. I don't know which version "latest" actually is, as the most recent version, 2019-CU27-ubuntu-20.04, does NOT work for some reason.

However, this version is not compatible with the test container code.

To eventually get it running, I needed to change the wait strategy (notice the "-C" and the "mssql-tools18"):

  private sealed class WaitUntilMssql18Available : IWaitUntil
  {
    // command has moved from "mssql-tools" to "mssql-tools18" in more recent versions of the image 
    // -C accepts self signed cert
    private readonly string[] _command = { "/opt/mssql-tools18/bin/sqlcmd", "-C", "-Q", "SELECT 1;" }; 

    /// <inheritdoc />
    public async Task<bool> UntilAsync(IContainer container)
    {
      var execResult = await container.ExecAsync(_command)
        .ConfigureAwait(false);

      return 0L.Equals(execResult.ExitCode);
    }
  }

Same when executing commands:

public static class ContainerExtension {
  public static async Task<ExecResult> ExecScriptAsyncPatched(this DockerContainer container, string scriptContent, CancellationToken ct = default)
  {
    var scriptFilePath = string.Join("/", string.Empty, "tmp", Guid.NewGuid().ToString("D"), Path.GetRandomFileName());

    await container.CopyAsync(Encoding.Default.GetBytes(scriptContent), scriptFilePath, Unix.FileMode644, ct)
      .ConfigureAwait(false);

    return await container.ExecAsync(new[] { "/opt/mssql-tools18/bin/sqlcmd", "-C", "-b", "-r", "1", "-U", "sa", "-P", "yourStrong(!)Password", "-i", scriptFilePath }, ct)
      .ConfigureAwait(false);
  }
}

So, with this rather ugly hack and hardcoding the credentials, the test container starts and my tests work.
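
For comparison, the same readiness check can probably be expressed without a custom IWaitUntil class, by passing the command to the builder. This is a minimal sketch, not the library's own fix: it assumes the Testcontainers.MsSql package, and it assumes that WithWaitStrategy takes the place of the module's default check (not verified here). The class and method names are hypothetical; the sqlcmd path and the -C flag are the ones from the workaround above.

```csharp
using DotNet.Testcontainers.Builders;
using Testcontainers.MsSql;

// Hypothetical helper, for illustration only.
public static class MsSqlTools18Example
{
  // Newer images ship sqlcmd under "mssql-tools18"; "-C" trusts the self-signed certificate.
  public static readonly string[] ReadinessCommand =
    { "/opt/mssql-tools18/bin/sqlcmd", "-C", "-Q", "SELECT 1;" };

  // The caller chooses the image tag and is responsible for starting and disposing the container.
  public static MsSqlContainer CreateContainer(string image)
  {
    return new MsSqlBuilder()
      .WithImage(image)
      // Assumption: this check is used instead of the module's built-in one.
      .WithWaitStrategy(Wait.ForUnixContainer().UntilCommandIsCompleted(ReadinessCommand))
      .Build();
  }
}
```

A caller would pass an image that ships mssql-tools18, then call StartAsync() on the returned container and dispose it after the tests. Note that Build() needs a reachable Docker endpoint.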

So, would it be possible to update to the latest mssql container version and adjust the exec commands? I could provide a PR for the changes, if it helps.

Relevant log output

No response

Additional information

No response

jwedel commented 1 month ago

I think it's actually related to

https://github.com/microsoft/mssql-docker/issues/881

So this seems to be a problem when MSSQL is running on Ubuntu 24.04, which is apparently what Colima uses:

wej2be@colima:/Users/wej2be/.config/colima$ cat /etc/*-release
...
PRETTY_NAME="Ubuntu 24.04 LTS"
...

HofmeisterAn commented 1 month ago

How much memory does your Colima VM have? Increase it to at least 2GB, maybe even 4GB.

So, would it be possible to update to the latest mssql container version and adjust the exec commands? I could provide a PR for the changes, if it helps.

The regression from MS is already addressed in: https://github.com/testcontainers/testcontainers-dotnet/pull/1221.

jwedel commented 1 month ago

@HofmeisterAn I start it with colima start --arch x86_64 --cpu 6 --memory 4 --disk 40, so 4GB.

HofmeisterAn commented 1 month ago

@HofmeisterAn I start it with colima start --arch x86_64 --cpu 6 --memory 4 --disk 40, so 4GB.

👍 I noticed the Total Memory: 4100014080 bytes as well, but I wanted to double-check because I’ve seen similar issues in the past related to insufficient memory. However, those usually included information about an OOM IIRC. Sorry, I haven’t seen anything like this before — it’s probably an issue with Ubuntu 24.04, like you mentioned.

HofmeisterAn commented 1 month ago

Now, when I use mcr.microsoft.com/mssql/server:latest, the container starts. I don't know which version "latest" actually is, as the most recent version, 2019-CU27-ubuntu-20.04, does NOT work for some reason.

latest corresponds to: mcr.microsoft.com/mssql/server:2022-CU14-ubuntu-22.04.

Due to the fact that the Oracle XE test container is only available for x64 architectures ATM

Can't you enable the Virtualization Framework and (install) Rosetta 2 using Docker Desktop?

To eventually get it running, I needed to change the wait strategy (notice the "-C" and the "mssql-tools18"):

As mentioned, this issue has been resolved. There are two workarounds available: https://github.com/testcontainers/testcontainers-dotnet/issues/1220#issuecomment-2249508150 and one of them you are already kind of using.

➜  colima docker run -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=yourStrong\!Password" -e "MSSQL_PID=Evaluation" -p 1433:1433  --name sqlpreview --hostname sqlpreview mcr.microsoft.com/mssql/server:2019-CU18-ubuntu-20.04

Since the container already crashes when using the CLI, I don't think this is specifically a Testcontainers issue. Microsoft or Colima will likely need to address the issue to fix it for older versions. I don't think there's much more I can help with, so I would prefer to close the issue.

jwedel commented 1 month ago

latest corresponds to: mcr.microsoft.com/mssql/server:2022-CU14-ubuntu-22.04.

Oh, I actually tried to find that information, but I couldn’t find a list of hashes to compare. But are you sure about this? ‘latest’ had the mssql-tools regression you mentioned, which I thought appeared in later versions.

Can't you enable the Virtualization Framework and (install) Rosetta 2 using Docker Desktop?

Hmm, that is actually an interesting idea. I will try that. I think Docker Desktop even has a config flag for this. My colleague started this and I think he also found references to Colima in the Testcontainers docs.

Since the container already crashes when using the CLI, I don't think this is specifically a Testcontainers issue. Microsoft or Colima will likely need to address the issue to fix it for older versions. I don't think there's much more I can help with, so I would prefer to close the issue.

Yes and no. From what I read, the problem is on the MSSQL side. Colima just uses the latest Ubuntu LTS. MSSQL does some crazy stuff with memory management and always has issues when the kernel changes.

Coming back to Testcontainers: as I found a way to get it working and I don’t know how long it will take for MS to fix it, I thought my hacky changes could find their way into the next release.

As you mentioned, the mssql-tools thing is fixed, so it would only be picking the right CU version.

HofmeisterAn commented 1 month ago

Coming back to Testcontainers: as I found a way to get it working and I don’t know how long it will take for MS to fix it, I thought my hacky changes could find their way into the next release.

I don't think there are any tasks left. Everything should be addressed in the next release. I'll go ahead and close the issue. If I misunderstood anything, feel free to reopen it.

so it would only be picking the right CU version

As long as you're not depending on an old version, let's hope it gets fixed overall 🤞.

jwedel commented 1 month ago

Can't you enable the Virtualization Framework and (install) Rosetta 2 using Docker Desktop?

Just for reference, this does not work. There are a lot of posts explaining that, as of now, you need to use Colima to run the Oracle container on a Mac. I also just tried it with Docker Desktop and Rosetta enabled, but the container crashes.

I don't think there are any tasks left. Everything should be addressed in the next release.

Which version of the MSSQL image will be used for the next release?

HofmeisterAn commented 1 month ago

Which version of the MSSQL image will be used for the next release?

We do not bump module versions to avoid breaking tests for existing Testcontainers users. We recommend that users pin the version according to their needs. Just set and override the image/version using WithImage(string).
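
As an illustration of the pinning advice, here is a minimal sketch assuming the Testcontainers.MsSql package. The class and method names are hypothetical, and the image tag is deliberately left to the caller, since the thread does not settle which tag works on every host.

```csharp
using System;
using System.Threading.Tasks;
using Testcontainers.MsSql;

// Hypothetical helper, for illustration only.
public static class PinnedMsSqlExample
{
  // Starts SQL Server from an explicitly pinned image and prints the connection string.
  public static async Task RunAsync(string pinnedImage)
  {
    if (string.IsNullOrWhiteSpace(pinnedImage))
    {
      throw new ArgumentException("An explicit image tag is required.", nameof(pinnedImage));
    }

    // WithImage(string) overrides the module's default image.
    await using var container = new MsSqlBuilder()
      .WithImage(pinnedImage)
      .Build();

    await container.StartAsync().ConfigureAwait(false);

    // Hand this connection string to the code under test.
    Console.WriteLine(container.GetConnectionString());
  }
}
```

Passing the tag in from the test project keeps the pinned version under the project's control, so a change of default in the library does not silently change what the tests run against.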

svrooij commented 1 month ago

The advice to pin a version has been around forever. I suggest changing the pinned version in the library to a version that is known to work, and everybody who did not pin it might get an error.

Like mcr.microsoft.com/mssql/server:2019-CU28-ubuntu-20.04