Open caseybrown89 opened 4 months ago
The last version of Docker Desktop where I can get SingleStore to work is 4.27.2. The GUI lists it as Docker engine 25.0.3, but according to the release notes it may actually be 25.0.2. Docker Desktop 4.28.0 is bundled with engine 25.0.3 and fails to run SingleStore in our application, which also leads me to believe that Docker Desktop 4.27.2 actually ships engine 25.0.2.
According to some Docker GH issues ("PHP-FPM issue in Docker Desktop 4.27.2: WARNING: [pool www] child 85 exited on signal 11 (SIGSEGV)" #7182, and "Mac M1 - after upgrade to Docker Desktop 4.27.1 docker container with java fails with qemu: uncaught target signal 11 (Segmentation fault) - core dumped"), 25.0.3 "fixes some issues with Rosetta and QEMU". I would start there when looking for what may have changed.
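Since the GUI and the release notes disagree about which engine version a given Docker Desktop release carries, it may help to ask the daemon directly. The following is a minimal sketch, assuming the `docker` CLI is on the PATH and the daemon is running. The helper names `parse_version` and `engine_version` are illustrative, not part of any tool mentioned here.

```python
import subprocess


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a string such as '25.0.3' into (25, 0, 3).

    Anything after a '-' or '+' (for example a build suffix) is ignored.
    """
    core = text.strip().split("-")[0].split("+")[0]
    return tuple(int(part) for part in core.split("."))


def engine_version() -> tuple[int, ...]:
    """Ask the running Docker daemon for its server (engine) version."""
    result = subprocess.run(
        ["docker", "version", "--format", "{{.Server.Version}}"],
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI fails, e.g. the daemon is not running
    )
    return parse_version(result.stdout)
```

Calling `engine_version() >= (25, 0, 3)` on each Docker Desktop release would show whether 4.27.2 really sits on the older engine.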
Describe the bug
I recently upgraded my Mac from Monterey 12.5 to Sonoma 14.5, and Docker Desktop from 4.16.1 (engine 20.10.22, compose v2.15.1) to 4.30 (engine 26.1.1, compose v2.27.0-desktop.2).
Prior to upgrading the OS and Docker, the SingleStore database worked as expected, though it was a bit on the slow side. I was able to execute integration tests against the database, which included various activities like:
After upgrading, the local SingleStore database on Docker is no longer usable. The database fails in different ways across the three steps above depending on the Docker Desktop configuration:
File sharing implementation: osxfs
time="2024-05-28T15:40:23Z" level=error msg="Error 1777 (HY000): Partition xxx:0 has no master instance. This is likely because the node or nodes that hold a copy of the partition are down. Check for offline leaf nodes by running SHOW LEAVES and bring them back online to restore access to the partition"
File sharing implementation: gRPC FUSE
time="2024-05-28T16:02:20Z" level=error msg="Error 1777 (HY000): Partition xxx:0 has no master instance. This is likely because the node or nodes that hold a copy of the partition are down. Check for offline leaf nodes by running SHOW LEAVES and bring them back online to restore access to the partition"
File sharing implementation: VirtioFS
Use Rosetta for x86_64/amd64 emulation on Apple Silicon: off
286636318 2024-05-28 16:37:28.619 ERROR: Thread 99999 (ntid 342, conn id 29): ShardingAlterTableV6: Alter Table timed out sending PREPARE messages to all the leaves
286636432 2024-05-28 16:37:28.620 WARN: Thread 99999 (ntid 342, conn id 29): operator(): Alter table on `xxx`.`client` has failed, rolling back transaction. Error: 2286: Operation ALTER timed out while waiting for concurrent operation to finish. Use SHOW PROCESSLIST to investigate long running concurrent operation, or consider increasing the value of alter statement's timeout value or the default_distributed_ddl_timeout global variable
File sharing implementation: VirtioFS
Use Rosetta for x86_64/amd64 emulation on Apple Silicon: on
==> /var/lib/memsql/ce0473ab-fc9f-45ae-a5ea-0e1c6c236947/tracelogs/memsql.log <==
276996548 2024-05-28 20:31:03.363 ERROR: Thread 99999 (ntid 293, conn id 28): ShardingAlterTableV6: Alter Table timed out sending PREPARE messages to all the leaves
276997002 2024-05-28 20:31:03.364 WARN: Thread 99999 (ntid 293, conn id 28): operator(): Alter table on `xxx`.`client` has failed, rolling back transaction. Error: 2286: Operation ALTER timed out while waiting for concurrent operation to finish. Use SHOW PROCESSLIST to investigate long running concurrent operation, or consider increasing the value of alter statement's timeout value or the default_distributed_ddl_timeout global variable
File sharing implementation: osxfs
Use Rosetta for x86_64/amd64 emulation on Apple Silicon: on
==> /var/lib/memsql/b91e2313-f74d-4cdb-847d-1bff188babe5/tracelogs/memsql.log <==
286541859 2024-05-28 20:39:57.493 ERROR: Thread 99999 (ntid 332, conn id 28): ShardingAlterTableV6: Alter Table timed out sending PREPARE messages to all the leaves
286542303 2024-05-28 20:39:57.494 WARN: Thread 99999 (ntid 332, conn id 28): operator(): Alter table on `labsengpte`.`contact` has failed, rolling back transaction. Error: 2286: Operation ALTER timed out while waiting for concurrent operation to finish. Use SHOW PROCESSLIST to investigate long running concurrent operation, or consider increasing the value of alter statement's timeout value or the default_distributed_ddl_timeout global variable
File sharing implementation: VirtioFS
Use Rosetta for x86_64/amd64 emulation on Apple Silicon: on
Remove migration file causing alter failures (deadlock)
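The error messages above point at three things to inspect: `SHOW LEAVES`, `SHOW PROCESSLIST`, and the `default_distributed_ddl_timeout` global variable. A rough sketch of collecting all three in one pass might look like the following. It assumes a DB-API 2.0 connection from any MySQL-compatible driver; the connection details shown in the comment are hypothetical and not taken from this report.

```python
# Statements named in the SingleStore error messages quoted in this issue.
DIAGNOSTICS = [
    "SHOW LEAVES",
    "SHOW PROCESSLIST",
    "SHOW GLOBAL VARIABLES LIKE 'default_distributed_ddl_timeout'",
]


def run_diagnostics(conn):
    """Run each diagnostic on a DB-API 2.0 connection.

    Returns a dict mapping each statement to the rows it produced.
    """
    results = {}
    cur = conn.cursor()
    try:
        for stmt in DIAGNOSTICS:
            cur.execute(stmt)
            results[stmt] = cur.fetchall()
    finally:
        cur.close()
    return results


# Hypothetical usage (driver, host, port, and credentials are assumptions):
#   import pymysql
#   conn = pymysql.connect(host="127.0.0.1", port=3306, user="root", password="...")
#   for stmt, rows in run_diagnostics(conn).items():
#       print(stmt, rows)
```

Running this right after a failed migration, once per Docker Desktop configuration, would make it easier to compare leaf state and in-flight operations across the cases listed above.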
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expect the database to succeed in executing the migration files and the query performance to be reasonable. Prior to the macOS and Docker upgrade, the test suite worked locally, though its runtime was high (> 10 minutes end-to-end).
Desktop (please complete the following information):
OS: macOS Sonoma 14.5
Chip: Apple M1 Max
Docker version: 26.1.1
Image tag: ghcr.io/singlestore-labs/singlestoredb-dev:0.2.21
Additional context
The sql-migrate tool is being used for migration execution.