docker / for-mac

Bug reports for Docker Desktop for Mac
https://www.docker.com/products/docker#/mac

virtiofs breaks initial startup of Postgres container #6270

Open TylerADavis opened 2 years ago

TylerADavis commented 2 years ago

Expected behavior

Enabling VirtioFS and the Big Sur virtualization framework causes no change in container behavior other than speed.

Actual behavior

A program in the container fails to start when it depends on files mounted in a shared volume. PostgreSQL logs the following two errors:

LOG: could not open file "pg_wal/000000010000000000000001": No such file or directory
FATAL: could not open file "pg_wal/000000010000000000000001": No such file or directory

Disabling virtiofs and the experimental virtualization framework allows the container to start without issue.

Information

This may be a duplicate of #6219 or #6243, but I am not sure. I am trying to run a Postgres container using docker-compose up Postgres, with this docker-compose file. The service mounts PostgreSQL's data directory to the host's ./data/postgres directory so that data persists. When I run the container in a fresh directory without virtiofs and the new virtualization framework, it starts just fine. However, if I start it from scratch with virtiofs enabled, it fails with the messages above. I assume virtiofs affects how long it takes for newly created files to show up and become accessible. Please let me know if there is any additional information I can provide.
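
One way to test the assumption about delayed file visibility is to create files in the mounted directory and reopen them immediately. The sketch below is a hypothetical helper (the name probe_create_then_open is not from this thread or from Docker) meant to be run inside the container against the bind-mounted path. It only checks create-then-open behavior and does not reproduce what Postgres itself does during initialization.

```python
import os


def probe_create_then_open(directory, attempts=100):
    """Create a file, reopen it by path right away, and count failed reopens.

    Hypothetical diagnostic: a non-zero result would suggest that newly
    created files are not immediately visible in `directory`.
    """
    failures = 0
    for i in range(attempts):
        path = os.path.join(directory, f"probe_{i:04d}")
        # Create the file and force its contents to the underlying filesystem.
        with open(path, "wb") as f:
            f.write(b"x")
            f.flush()
            os.fsync(f.fileno())
        try:
            # Reopen by path, as a separate open() call would.
            with open(path, "rb") as f:
                f.read(1)
        except FileNotFoundError:
            failures += 1
        finally:
            # Clean up so the data directory is left empty.
            try:
                os.remove(path)
            except FileNotFoundError:
                pass
    return failures
```

Calling probe_create_then_open("/var/lib/postgresql/data") with virtiofs enabled and again with it disabled would show whether the two setups differ; the path here is only an example.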

Output of /Applications/Docker.app/Contents/MacOS/com.docker.diagnose check

❯ /Applications/Docker.app/Contents/MacOS/com.docker.diagnose check
Starting diagnostics

[PASS] DD0027: is there available disk space on the host?
[PASS] DD0028: is there available VM disk space?
[PASS] DD0031: does the Docker API work?
[PASS] DD0004: is the Docker engine running?
[PASS] DD0011: are the LinuxKit services running?
[PASS] DD0016: is the LinuxKit VM running?
[PASS] DD0001: is the application running?
[PASS] DD0018: does the host support virtualization?
[PASS] DD0017: can a VM be started?
[PASS] DD0015: are the binary symlinks installed?
[PASS] DD0003: is the Docker CLI working?
[PASS] DD0013: is the $PATH ok?
[PASS] DD0007: is the backend responding?
[PASS] DD0014: are the backend processes running?
[PASS] DD0008: is the native API responding?
[PASS] DD0009: is the vpnkit API responding?
[PASS] DD0010: is the Docker API proxy responding?
[PASS] DD0012: is the VM networking working?
[PASS] DD0032: do Docker networks overlap with host IPs?
[SKIP] DD0030: is the image access management authorized?
[PASS] DD0019: is the com.docker.vmnetd process responding?
[PASS] DD0033: does the host have Internet access?

Steps to reproduce the behavior

  1. Enable virtiofs
  2. mkdir ~/cs143
  3. cd ~/cs143
  4. mkdir ~/cs143_shared
  5. curl https://gist.githubusercontent.com/TylerADavis/d1fb104553740bd6b48ea2bdadd6a59d/raw/adcf9d667d5a0c9a51a243be609644d46007a546/docker-compose.yml > docker-compose.yml
  6. docker-compose up Postgres
  7. Observe error message
  8. Disable virtiofs
  9. Observe that docker-compose up postgres runs without issue

TylerADavis commented 2 years ago

Just realized that 4.7.0 was released yesterday. I have updated Docker Desktop and confirm that this issue persists.

TylerADavis commented 2 years ago

I've confirmed that enabling VirtioFS is what causes the error here. The Big Sur virtualization framework on its own is not enough to cause the failure.

Atomzwieback commented 2 years ago

Can confirm. But this staging release mentioned here seems to fix it: https://github.com/docker/roadmap/issues/7#issuecomment-1073730051

OgulcanCelik commented 2 years ago

I also have this problem, and the latest build does not seem to fix it.

OgulcanCelik commented 2 years ago

This can be fixed by setting the PGDATA env variable, as mentioned in: https://github.com/docker-library/postgres/issues/435#issuecomment-611613071

Also, I was mounting /var/lib/postgresql/pgdata but changed the mount to one directory above, /var/lib/postgresql.

version: '3'
services:
  postgres:
    image: postgres
    restart: on-failure
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
      - PGDATA=/var/lib/postgresql/data/pgdata
      - POSTGRES_DB=postgres
    volumes:
      - ./postgres_data:/var/lib/postgresql
    ports:
      - 5432:5432

WD-Mauro commented 2 years ago

Hi, I tried the method from @OgulcanCelik's comment above (setting PGDATA and mounting /var/lib/postgresql). But when I list "postgres_data", I find an empty "pgdata" folder. And if I stop, destroy, and recreate my container, it's empty.

Has anyone else seen this behavior?

Thanks

haxoza commented 2 years ago

I had the same issue.

@OgulcanCelik's suggestion did the trick. In my case I changed

    volumes:
      - ./db/.pgdata:/var/lib/postgresql/data

to

    volumes:
      - ./db/.pgdata:/var/lib/postgresql

and it started working.

robcast commented 2 years ago

I had the same problem with Docker Desktop for Mac 4.7.0 and the postgres:11.7 container:

It seems that VirtioFS has a bug with some filesystem operations in the initialization script of Postgres.

Miha-ha commented 2 years ago

The same on Docker Desktop for Mac 4.7.1 with postgres:14-alpine

phasath commented 2 years ago

I have the same issue with Docker Desktop for Mac 4.7.1 with postgres:11-alpine =S

tunecrew commented 2 years ago

Still having this issue with Docker Desktop for Mac 4.8.0, MacOS 12.3.1 (M1), postgres 13.6-bullseye image when initialising a new database.

The db section of my compose file includes:

environment:
  - PGDATA=/data/pgdata
volumes:
  - ./data:/data

Tried various permutations on the (sub)directories with same result.

barrenechea commented 2 years ago

Same on Docker Desktop for Mac 4.8.1 😞

tunecrew commented 2 years ago

MacOS 12.4 resolves this for me - although VirtioFS seems to be killing my inter-container communications afterwards, so it is still a no-go.

Miha-ha commented 2 years ago

MacOS 12.4 Docker Desktop 4.8.2 (79419)

Starting a Postgres with an empty data folder:

local-pg-1 | 2022-05-19T05:32:27.532614737Z The files belonging to this database system will be owned by user "postgres".
local-pg-1 | 2022-05-19T05:32:27.532685816Z This user must also own the server process.
local-pg-1 | 2022-05-19T05:32:27.532703697Z
local-pg-1 | 2022-05-19T05:32:27.532715675Z The database cluster will be initialized with locale "en_US.utf8".
local-pg-1 | 2022-05-19T05:32:27.532722471Z The default database encoding has accordingly been set to "UTF8".
local-pg-1 | 2022-05-19T05:32:27.532733838Z The default text search configuration will be set to "english".
local-pg-1 | 2022-05-19T05:32:27.532743068Z
local-pg-1 | 2022-05-19T05:32:27.532748297Z Data page checksums are disabled.
local-pg-1 | 2022-05-19T05:32:27.533043509Z
local-pg-1 | 2022-05-19T05:32:27.534201568Z fixing permissions on existing directory /var/lib/postgresql/data ... ok
local-pg-1 | 2022-05-19T05:32:27.554377671Z creating subdirectories ... ok
local-pg-1 | 2022-05-19T05:32:27.565615497Z selecting dynamic shared memory implementation ... posix
local-pg-1 | 2022-05-19T05:32:27.614942313Z selecting default max_connections ... 100
local-pg-1 | 2022-05-19T05:32:27.657016799Z selecting default shared_buffers ... 128MB
local-pg-1 | 2022-05-19T05:32:28.169747869Z selecting default time zone ... UTC
local-pg-1 | 2022-05-19T05:32:28.201547956Z creating configuration files ... ok
local-pg-1 | 2022-05-19T05:32:30.838963387Z running bootstrap script ... ok
local-pg-1 | 2022-05-19T05:32:31.970048368Z sh: locale: not found
local-pg-1 | 2022-05-19T05:32:31.977715960Z 2022-05-19 05:32:31.977 UTC [32] WARNING: no usable system locales were found
local-pg-1 | 2022-05-19T05:32:37.517561745Z performing post-bootstrap initialization ... ok
local-pg-1 | 2022-05-19T05:32:38.889777268Z initdb: warning: enabling "trust" authentication for local connections
local-pg-1 | 2022-05-19T05:32:38.889814185Z You can change this by editing pg_hba.conf or using the option -A, or
local-pg-1 | 2022-05-19T05:32:38.889820344Z --auth-local and --auth-host, the next time you run initdb.
local-pg-1 | 2022-05-19T05:32:38.889785307Z syncing data to disk ... ok
local-pg-1 | 2022-05-19T05:32:38.889832132Z
local-pg-1 | 2022-05-19T05:32:38.889835393Z
local-pg-1 | 2022-05-19T05:32:38.889838409Z Success. You can now start the database server using:
local-pg-1 | 2022-05-19T05:32:38.889841545Z
local-pg-1 | 2022-05-19T05:32:38.889844520Z pg_ctl -D /var/lib/postgresql/data -l logfile start
local-pg-1 | 2022-05-19T05:32:38.889848258Z
local-pg-1 | 2022-05-19T05:32:38.933962644Z waiting for server to start....2022-05-19 05:32:38.933 UTC [38] FATAL: data directory "/var/lib/postgresql/data" has wrong ownership
local-pg-1 | 2022-05-19T05:32:38.934041494Z 2022-05-19 05:32:38.933 UTC [38] HINT: The server must be started by the user that owns the data directory.
local-pg-1 | 2022-05-19T05:32:39.021674471Z pg_ctl: could not start server
local-pg-1 | 2022-05-19T05:32:39.021820412Z Examine the log output.
local-pg-1 | 2022-05-19T05:32:39.021774328Z stopped waiting

ajeetraina commented 2 years ago

If you're still facing this issue, I recommend upgrading macOS to the 12.4 release. I tried it with Docker Desktop for Mac 4.8.2 and VirtioFS enabled, and it worked for me.

TylerADavis commented 2 years ago

With macOS 12.4 and Docker Desktop 4.8.2 I am no longer able to reproduce this error using the steps in my original post.

Edit: See https://github.com/docker/for-mac/issues/6270#issuecomment-1138663903, this isn't fully resolved for me

tunecrew commented 2 years ago

Update - I have an M1 Mac Mini and an M1 MBP configured identically - MacOS 12.4, Docker 4.8.2.

With VirtioFS enabled, I experience the following:

  • Starting Postgres w/ an empty data dir works, and it initialises as expected
  • Restarting the container results in:
db-dev       | 2022-05-26 14:46:41.643 UTC [1] FATAL:  data directory "/data/dev" has wrong ownership
db-dev       | 2022-05-26 14:46:41.643 UTC [1] HINT:  The server must be started by the user that owns the data directory.
db-dev exited with code 1
  • Except (annoyingly) on the MBP but not the Mini, it will actually restart successfully maybe every second or third time

I have tried mapping the PGDATA env variable in my compose file one level below the volume mapping; this doesn't have any effect on the aforementioned behaviour.

TylerADavis commented 2 years ago

@tunecrew thanks for mentioning that, I'm seeing the same behavior actually. Starting with the initial empty data dir works, but subsequent launches work only intermittently.

tunecrew commented 2 years ago

Not fixed by Docker 4.9

huanghuanghuhu commented 2 years ago

MacOS 12.4, Docker 4.9.0. Still facing this issue when starting the container a second time.

timur-gilauri commented 2 years ago

MacOS 12.4, Docker 4.10.1. Still facing this issue when starting the container a second time.

ddzobov commented 2 years ago

Same thing with mongodb container

stijnjanmaat commented 2 years ago

Still same issue on MacOS 12.4, Docker 4.11.1

richterd commented 2 years ago

Hi folks, you might want to try switching from a bind mount to a Docker volume (see here for more information: Docker Volumes).

You can do the following to switch:

  1. Back up your database
  2. Change your docker compose as follows (this example only shows the changes, the volume name can be whatever you prefer, but has to be unique within your docker installation):
#OLD
services:
  db:
    volumes:
      - ./db:/var/lib/mysql
#NEW
volumes:
  my_db_data:
services:
  db:
    volumes:  
      - my_db_data:/var/lib/mysql
  3. Restore your database

wolflu05 commented 2 years ago

This is just a workaround, not an actual fix of the problem.

NumanIbnMazid commented 2 years ago

MacOS 12.5.1 (M1), Docker Desktop 4.11.1. Still facing the same issue.

tunecrew commented 2 years ago

MacOS 12.5.1 (M1), Docker Desktop 4.12.0. Not fixed.

tagplus5 commented 2 years ago

MacOS 12.5.1 (Intel), Docker Desktop 4.12.0. Not fixed.

sgbett commented 1 year ago

Kind of working, but only because I have restart: always in my docker-compose.yml

fails several times during startup, then eventually works:

(MacOS 12.5.1 (Intel), Docker Desktop 4.13.1)

postgres-db-1  | PostgreSQL Database directory appears to contain a database; Skipping initialization
postgres-db-1  | 
postgres-db-1  | 2022-11-07 11:33:21.071 UTC [1] FATAL:  data directory "/var/lib/postgresql/data/pgdata" has wrong ownership
postgres-db-1  | 2022-11-07 11:33:21.071 UTC [1] HINT:  The server must be started by the user that owns the data directory.
postgres-db-1 exited with code 1
postgres-db-1  | 
postgres-db-1  | PostgreSQL Database directory appears to contain a database; Skipping initialization
postgres-db-1  | 
postgres-db-1  | 2022-11-07 11:33:24.807 UTC [1] FATAL:  data directory "/var/lib/postgresql/data/pgdata" has wrong ownership
postgres-db-1  | 2022-11-07 11:33:24.807 UTC [1] HINT:  The server must be started by the user that owns the data directory.
postgres-db-1 exited with code 1
postgres-db-1  | 
postgres-db-1  | PostgreSQL Database directory appears to contain a database; Skipping initialization
postgres-db-1  | 
postgres-db-1  | 2022-11-07 11:33:29.409 UTC [1] FATAL:  data directory "/var/lib/postgresql/data/pgdata" has wrong ownership
postgres-db-1  | 2022-11-07 11:33:29.409 UTC [1] HINT:  The server must be started by the user that owns the data directory.
postgres-db-1 exited with code 1
postgres-db-1  | 
postgres-db-1  | PostgreSQL Database directory appears to contain a database; Skipping initialization
postgres-db-1  | 
postgres-db-1  | 2022-11-07 11:33:34.530 UTC [1] LOG:  starting PostgreSQL 12.5 (Debian 12.5-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
postgres-db-1  | 2022-11-07 11:33:34.531 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
postgres-db-1  | 2022-11-07 11:33:34.531 UTC [1] LOG:  listening on IPv6 address "::", port 5432
postgres-db-1  | 2022-11-07 11:33:34.534 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
postgres-db-1  | 2022-11-07 11:33:34.585 UTC [30] LOG:  database system was shut down at 2022-11-07 11:30:49 UTC
postgres-db-1  | 2022-11-07 11:33:34.606 UTC [1] LOG:  database system is ready to accept connections
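
To compare runs across Docker Desktop versions, the number of failed attempts before a successful start can be counted from the compose log. The following is a minimal sketch; summarize_startup_attempts is a hypothetical helper, and it assumes the two message texts shown in the log above ("has wrong ownership" and "database system is ready to accept connections").

```python
def summarize_startup_attempts(log_lines):
    """Return (failed_attempts, started) for an iterable of compose log lines.

    failed_attempts counts FATAL "wrong ownership" messages; started is True
    if the server eventually reported that it accepts connections.
    """
    failed_attempts = 0
    started = False
    for line in log_lines:
        if "FATAL:" in line and "has wrong ownership" in line:
            failed_attempts += 1
        elif "database system is ready to accept connections" in line:
            started = True
    return failed_attempts, started
```

Fed the log above line by line, this would return 3 failed attempts and started set to True.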

GeorgiKeranov commented 1 year ago

MacOS 13.0.1 (M1), Docker Desktop 4.13.1. Not fixed.

EDIT: In the new version of Docker Desktop - 4.14.1 I still have the problem:

2022-11-21 12:21:20 2022-11-21 10:21:20.812 UTC [1] FATAL:  data directory "/data/postgres" has wrong ownership
2022-11-21 12:21:20 2022-11-21 10:21:20.812 UTC [1] HINT:  The server must be started by the user that owns the data directory.

robcast commented 1 year ago

I still have the issue using Docker Desktop 4.14.1 on macOS 12.6.1 on M1 when running the container as the default user:

docker run --rm --volume /tmp/pgdata:/var/lib/postgresql/data -e "POSTGRES_PASSWORD=password" postgres:11-alpine

gives

The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/postgresql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default timezone ... UTC
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... sh: locale: not found
2022-11-19 16:35:33.636 UTC [31] WARNING:  no usable system locales were found
ok
syncing data to disk ... ok

Success. You can now start the database server using:

    pg_ctl -D /var/lib/postgresql/data -l logfile start

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
waiting for server to start....2022-11-19 16:35:34.863 UTC [37] FATAL:  data directory "/var/lib/postgresql/data" has wrong ownership
2022-11-19 16:35:34.863 UTC [37] HINT:  The server must be started by the user that owns the data directory.

Specifying --user postgres worked sometimes but not always when running the container with an empty data directory.

Running the container again with an existing data directory worked most of the time. YMMV :-/

dgorbash commented 1 year ago

Now it works for me on Docker 4.15, macOS 12.16.1 🎉

LOG:  starting PostgreSQL 14.3 (Debian 14.3-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
LOG:  listening on IPv4 address "0.0.0.0", port 5432
LOG:  listening on IPv6 address "::", port 5432
LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
LOG:  database system was shut down at 2022-12-06 13:36:35 UTC
LOG:  database system is ready to accept connections
PostgreSQL Database directory appears to contain a database; Skipping initialization

GeorgiKeranov commented 1 year ago

It has been fixed in the new Docker version - 4.15.0! Thank you :)

sgbett commented 1 year ago

so far so good - starts first time every time 4.15.0 for me.

alexis-vannot commented 1 year ago

The problem is still present on :

sgbett commented 1 year ago

I am on that version and it is still working for me so far.

Is it definitely not the "The server must be started by the user that owns the data directory." issue, i.e. data directory "/data/postgres" has wrong ownership?

sgbett commented 1 year ago

Just upgraded to 4.17.0 (99724)

Still working first time for me...

2023-02-28 23:01:21 
2023-02-28 23:01:21 PostgreSQL Database directory appears to contain a database; Skipping initialization
2023-02-28 23:01:21 
2023-02-28 23:01:22 2023-02-28 23:01:22.178 UTC [1] LOG:  starting PostgreSQL 12.5 (Debian 12.5-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
2023-02-28 23:01:22 2023-02-28 23:01:22.181 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2023-02-28 23:01:22 2023-02-28 23:01:22.182 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2023-02-28 23:01:22 2023-02-28 23:01:22.204 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-02-28 23:01:22 2023-02-28 23:01:22.502 UTC [27] LOG:  database system was shut down at 2023-02-28 22:33:17 UTC
2023-02-28 23:01:22 2023-02-28 23:01:22.618 UTC [1] LOG:  database system is ready to accept connections

laduke commented 1 year ago

I just updated to 4.17.0 and now it's not working. I don't remember what I was on before, but it was working.

I think 4.17 worked randomly once or twice. If I switch to gRPC FUSE, it works.

I'm on Intel BTW

realies commented 1 year ago

M1, v4.19.0, VirtioFS

    user: ${UID}:${GID}
    image: postgres:latest
    volumes:
      - ./pgsql:/var/lib/postgresql/data

pgsql | [1] FATAL:  data directory "/var/lib/postgresql/data" has wrong ownership
pgsql | [1] HINT:  The server must be started by the user that owns the data directory.

MauriceArikoglu commented 1 year ago

I have the same issue on v4.21.1 / VirtioFS. Switching back to gRPC FUSE fixed it for me. Not a solution, but maybe a workaround for some until it is fixed. Apple Silicon, 13.2

daweimau commented 1 year ago

M1 Apple chip

  • 4.11.0 + VFS: Affected consistently ❌
  • 4.11.0 + No VFS: Could not repro 👍
  • 4.23.0 + VFS: Could not repro 👍
  • 4.23.0 + No VFS: Could not repro 👍

Will edit if this proves incorrect over time in my case

NeverWalkAloner commented 1 year ago

Just faced this issue with 4.23.0 version on Mac M1. Switching back to gRPC FUSE fixed the problem.

ickeundso commented 1 year ago

Still a problem with 4.24.2 (124339) on Mac M1. Using gRPC FUSE on this version fixed the PostgreSQL startup issue.

netsensei commented 9 months ago

Facing this issue with Docker v4.26.1 @ Apple M2 / Sonoma 14.2.1 with VirtioFS enabled. Switching to gRPC FUSE works for me.

RenanFG commented 6 months ago

changing the mount point "/var/lib/postgresql/data" to "/var/lib/postgresql" fixed for me. Docker image Version 15.2

hwsungsoft commented 3 months ago

changing the mount point "/var/lib/postgresql/data" to "/var/lib/postgresql" fixed for me. Docker image Version 15.2

Thank you very much, it solved my problem.

robmoore-i commented 1 month ago

Edit: Our problem is a bit different. Created a new issue: https://github.com/docker/for-mac/issues/7415

My team is still having this problem. We're on MacOS 14.6.1 using Docker Desktop 4.34.0 and we're using the image postgres:16.4-bookworm.

We mount an empty directory onto /mnt/data and set PGDATA=/mnt/data/postgresql. We create PGDATA on container startup (using mkdir), before invoking the default entrypoint with exec /usr/local/bin/docker-entrypoint.sh postgres.

I've added the following logs to the container entrypoint at startup (using id and stat):

# echo "Running as user '$(whoami)' ($(id))"
Running as user 'postgres' (uid=999(postgres) gid=999(postgres) groups=999(postgres),101(ssl-cert))

# stat -c "name=(%n) file_type=(%F) owner_group=(%g/%G) owner_user=(%u/%U) permission_bits=(%A) mount_point=(%m)" "$PGDATA"
name=(/mnt/data/postgresql) file_type=(directory) owner_group=(999/postgres) owner_user=(999/postgres) permission_bits=(drwx------) mount_point=(/mnt/data)

So the server is being started by the user who owns the data directory.

And yet when the database starts, it immediately crashes:

FATAL:  data directory "/mnt/data/postgresql" has wrong ownership
HINT:  The server must be started by the user that owns the data directory

Changing the file sharing implementation in Docker Desktop from VirtioFS to gRPC FUSE fixes the problem but is a bad solution because it requires the team to know about the problem and know how to fix it. This seems like a bug in Docker. Let me know if I can help further, thanks.
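
Since a single stat call reports the expected owner while the server still rejects the directory, and earlier comments describe restarts that succeed only some of the time, it may help to sample the ownership repeatedly. The sketch below is a hypothetical helper (sample_ownership is not part of Postgres or Docker). It assumes that the server's check amounts to comparing the owner uid of the data directory with the effective uid of the process, which this thread does not confirm.

```python
import os
import time


def sample_ownership(path, samples=20, delay=0.05):
    """Stat `path` repeatedly and compare its owner uid with our effective uid.

    Returns (euid, results), where results is a list of (owner_uid, matches)
    tuples, one per sample. Assumption: a mismatch here corresponds to the
    "wrong ownership" condition reported by the server.
    """
    euid = os.geteuid()
    results = []
    for _ in range(samples):
        owner_uid = os.stat(path).st_uid
        results.append((owner_uid, owner_uid == euid))
        time.sleep(delay)
    return euid, results
```

Run as the postgres user inside the container against PGDATA, any sample with matches set to False would point to the file sharing layer reporting a different owner between calls.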