cachix / devenv

Fast, Declarative, Reproducible, and Composable Developer Environments
https://devenv.sh
Apache License 2.0

Postgres does not work with process-compose #810

Closed Alexnortung closed 6 months ago

Alexnortung commented 8 months ago

Describe the bug

When trying to start postgres with process-compose, it performs an invalid readiness check. As a result, postgres is always marked as not ready by process-compose and is shut down.

https://github.com/cachix/devenv/blob/ae9100ae735baf5b0c491eabf76c6a9ec0a79757/src/modules/services/postgres.nix#L303C24-L303C24

Would it be possible to change the readiness_probe?

To reproduce

Please provide a Short, Self Contained, Correct (Compilable), Example by creating a gist using devenv.nix, devenv.yaml, and optionally devenv.lock.

https://gist.github.com/Alexnortung/566e822547189e6d2a5b1a1547caa6b4

Version

devenv: 0.6.3

domenkozar commented 8 months ago

Relevant #453

thenonameguy commented 8 months ago

The problem is you are explicitly disabling UNIX socket-based connections, which are used by default (via $PGDATA) for the readiness probe: https://gist.github.com/Alexnortung/566e822547189e6d2a5b1a1547caa6b4#file-flake-nix-L43

Since you are deviating from the basic configuration, it's expected to have to patch some things.

bigluck commented 7 months ago

@thenonameguy I'm having a similar problem with process-compose + postgres, using devenv: v0.6.3

In my devenv.nix file I've configured the postgres server using:


  services.postgres = {
    enable = true;
    package = pkgs.postgresql_15;
    listen_addresses = "127.0.0.1";
    port = 35431;
    initialDatabases = [{ name = "my_db"; } ];
    initialScript = ''
      CREATE USER my_user WITH PASSWORD 'my_pass';

      GRANT ALL PRIVILEGES ON DATABASE my_db TO postgres;
      ALTER DATABASE my_db OWNER TO my_user;
    '';
    settings = {
      unix_socket_directories = "/Users/bigluck";
    };
  };

And this is my "dev" process:

  processes = {
    dev = {
      exec = "my-app-run";
      process-compose = {
        depends_on.postgres.condition = "process_ready";
      };
    };
  };

When I run devenv up, the postgres service stays in the "Non Ready" state for ~1m, and then it is killed:

Error: readiness check fail - exit status 2

However, if I restart the service and look in my home directory, I can see the socket file:

srwxrwxrwx@   1 bigluck  staff      0 Oct  9 14:48 .s.PGSQL.35431
-rw-------@   1 bigluck  staff    148 Oct  9 14:48 .s.PGSQL.35431.lock

and the pg_isready command works as expected:

$ pg_isready -h /Users/bigluck -d template1
/Users/bigluck:35431 - accepting connections

thenonameguy commented 7 months ago

Hey @bigluck,

Yours is a misconfiguration as well, albeit a different kind. Your comment assumes that by overwriting the unix_socket_directories setting, the readiness probe will automatically pick up the correct unix socket directory. This is not the case.

Let's walk backwards. Here is the readiness probe command: https://github.com/cachix/devenv/blob/bd859ef4b207c2071f5bd3cae2a74f4d3e69c2e2/src/modules/services/postgres.nix#L303

It uses PGDATA env var, let's see where that is defined: https://github.com/cachix/devenv/blob/bd859ef4b207c2071f5bd3cae2a74f4d3e69c2e2/src/modules/services/postgres.nix#L284

Aha! So while Postgres creates the listener sockets in your home folder, the readiness probe still tries to connect through the .devenv/state/postgres folder. This fails.

Now you have (at least) 2 options.

  1. Override the readiness probe command:

    { pkgs, lib, config, ... }:
    {
      # ...
      # unix_socket_directories lives under `settings`, as in the config you posted
      processes.postgres.process-compose.readiness_probe.exec.command =
        lib.mkForce "pg_isready -h ${config.services.postgres.settings.unix_socket_directories} -d template1";
      # ...
    }

    This would yield what you expected to happen. (I assume you run this in a devenv-activated shell, so pg_isready is on your PATH.)

  2. Change PGDATA:

    env.PGDATA = lib.mkForce config.services.postgres.settings.unix_socket_directories;

    This will put the whole Postgres install outside of your project folder, which might not be what you want. But hey, it's shorter :)

Hope this helps!

bigluck commented 7 months ago

Oh wow, thanks @thenonameguy for your quick reply. The reason I'm overwriting the unix_socket_directories config is that I can't get the Postgres health check working out of the box.

Indeed if I remove:

    settings = {
      unix_socket_directories = "/Users/bigluck";
    };

my configuration becomes very close to the default one, but process-compose still kills my process after ~1m.

I'm on a Mac, and the project folder is: /Users/bigluck/dev/my_company/my_repository_name/my_project_name/service

When I start the server (or restart the postgres process using process-compose), I can see the socket in the devenv folder:

$ ls -la /Users/bigluck/dev/my_company/my_repository_name/my_project_name/service/.devenv/state/postgres
total 72
drwx------@ 28 bigluck  staff   896 Oct  9 15:59 .
drwxr-xr-x@  5 bigluck  staff   160 Oct  9 13:56 ..
srwxrwxrwx   1 bigluck  staff     0 Oct  9 15:59 .s.PGSQL.35431
-rw-------   1 bigluck  staff   135 Oct  9 15:59 .s.PGSQL.35431.lock

but pg_isready in this case fails:

$ pg_isready -h /Users/bigluck/dev/my_company/my_repository_name/my_project_name/service/.devenv/state/postgres -d template1
/Users/bigluck/dev/my_company/my_repository_name/my_project_name/service/.devenv/state/postgres:35431 - no response

I would like to stay as close as possible to the default configuration, but for some reason I'm unable to close the loop.

thenonameguy commented 7 months ago

Hmm, strange. I'm also on macOS (M1). Bit of self-promotion here, but could you try cloning this repo? https://github.com/schemamap/schemamap

Then run devenv up. This works flawlessly for me.

bigluck commented 7 months ago

I was copying from your repo indeed :) Your repo works, now I need to understand what's different...

thenonameguy commented 7 months ago

Maybe try testing through the TCP/IP stack?

processes.postgres.process-compose.readiness_probe.exec.command = 
  lib.mkForce "pg_isready -h 127.0.0.1 -d template1";

I'm assuming your dependency will use that anyway, since you defined listen_addresses.

thenonameguy commented 7 months ago

Also try starting from scratch, maybe your init script has some issues:

pkill process-compose
git clean -xf $PGDATA
devenv up

bigluck commented 7 months ago

Even after copying your entire postgres config, it still fails in my project. That's funny!

The only major difference I can see is that when I run devenv up in my project, I see:

starting PostgreSQL 15.3 on aarch64-apple-darwin22.4.0, compiled by clang version 11.1.0, 64-bit
# where is pg_isready?
$ which pg_isready
/nix/store/hsrgl6vvylj62ndzqzd6hfzcz3ia799y-devenv-profile/bin/pg_isready

but in your project I get:

starting PostgreSQL 15.3 on aarch64-apple-darwin22.6.0, compiled by clang version 11.1.0, 64-bit
# where is pg_isready?
$ which pg_isready
/nix/store/02mw0wmsldijyshvvcw6k18ss5y70nbi-devenv-profile/bin/pg_isready

bigluck commented 7 months ago

The root of all the problems seems to be related to this funny error that I get during the DB initialization: Unix-domain socket path ".." is too long (maximum 103 bytes)
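
To make the limit concrete, here is a small Nix expression that builds the socket path Postgres would use and compares its length with the 103-byte maximum from the error message. It is only a sketch: the directory is the (apparently anonymized) .devenv/state/postgres path posted earlier in this thread, not the elided path from the error, so the real length may differ.

```nix
# Evaluate with: nix-instantiate --eval --strict socket-length.nix
# (the file name is arbitrary)
let
  # Assumption: PGDATA as posted earlier in this thread; the real path may differ.
  pgdata = "/Users/bigluck/dev/my_company/my_repository_name/my_project_name/service/.devenv/state/postgres";
  port = 35431;

  # Postgres names its Unix socket .s.PGSQL.<port> inside the socket directory.
  socketPath = "${pgdata}/.s.PGSQL.${toString port}";

  # Limit quoted in the error message above.
  limit = 103;
in
{
  inherit socketPath;
  length = builtins.stringLength socketPath;
  fits = builtins.stringLength socketPath <= limit;
}
```

If fits evaluates to false, the project directory is too deeply nested for the default socket location.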

bigluck commented 7 months ago

BTW, it seems that processes.postgres.process-compose.readiness_probe.exec.command does not overwrite the default probe.

curl -X 'GET' \
  'http://127.0.0.1:9999/process/info/postgres' \
  -H 'accept: application/json'

{
  "Name": "postgres",
  "Disabled": false,
  "IsDaemon": false,
  "Command": "exec /nix/store/jlzf5jk4dcgp3s6wa2xn7v1kagylgpjd-postgres",
  "LogLocation": "",
  "Environment": null,
  "RestartPolicy": {
    "Restart": "on_failure",
    "BackoffSeconds": 0,
    "MaxRestarts": 0,
    "ExitOnEnd": false
  },
  "DependsOn": null,
  "LivenessProbe": null,
  "ReadinessProbe": {
    "Exec": {
      "Command": "/nix/store/im38gmzx14174lzhwxv1mkxj7k4la9zb-postgresql-15.4/bin/pg_isready -h /Users/bigluck/dev/my_company/my_repository_name/my_project_name/service/.devenv/state/postgres -d template1",
      "WorkingDir": ""
    },
    "HttpGet": null,
    "InitialDelay": 2,
    "PeriodSeconds": 10,
    "TimeoutSeconds": 4,
    "SuccessThreshold": 1,
    "FailureThreshold": 5
  },
  "ShutDownParams": {
    "ShutDownCommand": "",
    "ShutDownTimeout": 0,
    "Signal": 2,
    "ParentOnly": false
  },
  "DisableAnsiColors": false,
  "WorkingDir": "",
  "Namespace": "default",
  "Replicas": 1,
  "Extensions": null,
  "ReplicaNum": 0,
  "ReplicaName": "postgres"
}

:(

bigluck commented 7 months ago

Found:

  process-managers.process-compose.settings.processes = {
    postgres.readiness_probe.exec.command = lib.mkForce "pg_isready -h 127.0.0.1 -p 35431 -d template1";
  };

bigluck commented 7 months ago

But it only partially fixed the problem. In general, there is a bug in the Postgres implementation in devenv when the socket is stored in a folder with a very long path.

In my case the init script fails with:

Unix-domain socket path ".." is too long (maximum 103 bytes)

But then the service starts, and the overwritten readiness_probe generates a valid health status, but the app fails because the user is not authorized to access the service (due to the previous error).
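
One way to combine the pieces discussed so far is sketched below: keep the socket in a short directory and point the readiness probe at that same directory, using the option path that actually took effect above. This is an untested sketch, not a confirmed fix. socketDir is a hypothetical choice, and the thread does not confirm whether the initial script honors unix_socket_directories.

```nix
{ pkgs, lib, config, ... }:
let
  # Hypothetical short directory, chosen so the socket path stays well under the limit.
  socketDir = "/tmp";
in
{
  services.postgres = {
    enable = true;
    package = pkgs.postgresql_15;
    listen_addresses = "127.0.0.1";
    port = 35431;
    # Create the Unix socket outside the deeply nested project folder.
    settings.unix_socket_directories = socketDir;
  };

  # Override the probe through process-managers.process-compose.settings,
  # the option path reported as working earlier in this thread.
  process-managers.process-compose.settings.processes = {
    postgres.readiness_probe.exec.command =
      lib.mkForce "pg_isready -h ${socketDir} -p 35431 -d template1";
  };
}
```

The probe and the server now agree on the socket location, which is the mismatch described earlier in the thread.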

thenonameguy commented 7 months ago

Yup, no two ways about it. Try my PGDATA option 2 from above. Should fix it, without needing to override the readiness probe.

bigluck commented 7 months ago

It should not work due to:

https://github.com/cachix/devenv/blob/bd859ef4b207c2071f5bd3cae2a74f4d3e69c2e2/src/modules/services/postgres.nix#L97-L102

thenonameguy commented 6 months ago

@domenkozar I think this ticket can be closed. It contained 2 issues:

  1. In case of explicitly deviating from the defaults of devenv by disabling Postgres unix socket support, you cannot connect to Postgres via the default readiness probe (which uses unix sockets)
  2. In case of having a long CWD for running Postgres, unix socket support does not work

Both of these are exceptional cases which are IMO one way or another "user error". Error handling could be better, but the statement that either of these two cases constitutes 'Postgres not working with process-compose' is incorrect. Any process manager that actually bothers implementing readiness probes would run into the same issue: Postgres does not work in these scenarios.