ory / kratos

The most scalable and customizable identity server on the market. Replace your Homegrown, Auth0, Okta, Firebase with better UX and DX. Has all the tablestakes: Passkeys, Social Sign In, Multi-Factor Auth, SMS, SAML, TOTP, and more. Written in Go, cloud native, headless, API-first. Available as a service on Ory Network and for self-hosters.
https://www.ory.sh/?utm_source=github&utm_medium=banner&utm_campaign=kratos
Apache License 2.0
11.27k stars 965 forks source link

Error on startup related to migrations and courier #3997

Closed asifshiraz closed 3 months ago

asifshiraz commented 3 months ago

Preflight checklist

Ory Network Project

No response

Describe the bug

When running in docker, the messages show errors related to schema migration and courier tables

Reproducing the bug

  1. Run docker compose up with the contents as in "Relevant configuration" section

Didn't think its related to the kratos.yml config file, but I can share if needed after editing the identifiers out.

Relevant log output

root@vmi1036362:~/code/identity/docker# docker compose up
[+] Running 6/6
 ✔ Network docker_intranet            Created                                                                      0.7s
 ✔ Volume "docker_kratos-postgres"    Created                                                                      0.0s
 ✔ Container identity-postgres        Created                                                                      0.6s
 ✔ Container identity-kratos-migrate  Created                                                                      0.2s
 ✔ Container identity-kratos          Created                                                                      0.1s
Attaching to identity-kratos, identity-kratos-mailer, identity-kratos-migrate, identity-postgres
identity-postgres        | The files belonging to this database system will be owned by user "postgres".
identity-postgres        | This user must also own the server process.
identity-postgres        |
identity-postgres        | The database cluster will be initialized with locale "en_US.utf8".
identity-postgres        | The default database encoding has accordingly been set to "UTF8".
identity-postgres        | The default text search configuration will be set to "english".
identity-postgres        |
identity-postgres        | Data page checksums are disabled.
identity-postgres        |
identity-postgres        | fixing permissions on existing directory /var/lib/postgresql/data ... ok
identity-postgres        | creating subdirectories ... ok
identity-postgres        | selecting default max_connections ... 100
identity-postgres        | selecting default shared_buffers ... 128MB
identity-postgres        | selecting default timezone ... Etc/UTC
identity-postgres        | selecting dynamic shared memory implementation ... posix
identity-postgres        | creating configuration files ... ok
identity-postgres        | running bootstrap script ... ok
identity-postgres        | performing post-bootstrap initialization ... ok
identity-postgres        | syncing data to disk ... ok
identity-postgres        |
identity-postgres        | Success. You can now start the database server using:
identity-postgres        |
identity-postgres        |     pg_ctl -D /var/lib/postgresql/data -l logfile start
identity-postgres        |
identity-postgres        |
identity-postgres        | WARNING: enabling "trust" authentication for local connections
identity-postgres        | You can change this by editing pg_hba.conf or using the option -A, or
identity-postgres        | --auth-local and --auth-host, the next time you run initdb.
identity-postgres        | waiting for server to start....2024-07-14 18:18:55.893 UTC [46] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
identity-postgres        | 2024-07-14 18:18:55.971 UTC [47] LOG:  database system was shut down at 2024-07-14 18:18:55 UTC
identity-postgres        | 2024-07-14 18:18:56.039 UTC [46] LOG:  database system is ready to accept connections
identity-postgres        |  done
identity-postgres        | server started
identity-kratos-migrate  | time=2024-07-14T18:18:56Z level=warning msg=Unable to ping database, retrying. audience=application error=map[message:failed to connect to `host=postgresd user=kratos database=kratos`: dial error (dial tcp 192.168.208.2:5432: connect: connection refused)] service_name=Ory Kratos service_version=v1.2.0
identity-kratos          | time=2024-07-14T18:18:57Z level=debug msg=Adding config files. func=github.com/ory/x/configx.(*Provider).createProviders file=/go/pkg/mod/github.com/ory/x@v0.0.623/configx/provider.go:144 audience=application files=[/etc/config/kratos/kratos.yml] service_name=Ory Kratos service_version=v1.2.0
identity-kratos          | {"audience":"application","file":"/go/pkg/mod/github.com/ory/x@v0.0.623/logrusx/helper.go:125","func":"github.com/ory/x/logrusx.(*Logger).Logf","level":"info","msg":"No tracer configured - skipping tracing setup","service_name":"Ory Kratos","service_version":"v1.2.0","time":"2024-07-14T18:18:57.206272573Z"}
identity-kratos          | {"audience":"application","connMaxLifetime":0,"file":"/project/driver/registry_default.go:653","func":"github.com/ory/kratos/driver.(*RegistryDefault).Init.func1","idlePool":4,"level":"debug","msg":"Connecting to SQL Database","pool":20,"service_name":"Ory Kratos","service_version":"v1.2.0","time":"2024-07-14T18:18:57.233094495Z"}
identity-postgres        | CREATE DATABASE
identity-postgres        |
identity-postgres        |
identity-postgres        | /usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/*
identity-postgres        |
identity-postgres        | 2024-07-14 18:18:57.283 UTC [46] LOG:  received fast shutdown request
identity-postgres        | waiting for server to shut down....2024-07-14 18:18:57.310 UTC [46] LOG:  aborting any active transactions
identity-postgres        | 2024-07-14 18:18:57.334 UTC [46] LOG:  background worker "logical replication launcher" (PID 53) exited with exit code 1
identity-postgres        | 2024-07-14 18:18:57.334 UTC [48] LOG:  shutting down
identity-postgres        | 2024-07-14 18:18:57.390 UTC [46] LOG:  database system is shut down
identity-postgres        |  done
identity-postgres        | server stopped
identity-postgres        |
identity-postgres        | PostgreSQL init process complete; ready for start up.
identity-postgres        |
identity-postgres        | 2024-07-14 18:18:57.563 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
identity-postgres        | 2024-07-14 18:18:57.563 UTC [1] LOG:  listening on IPv6 address "::", port 5432
identity-postgres        | 2024-07-14 18:18:57.575 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
identity-postgres        | 2024-07-14 18:18:57.667 UTC [64] LOG:  database system was shut down at 2024-07-14 18:18:57 UTC
identity-postgres        | 2024-07-14 18:18:57.734 UTC [1] LOG:  database system is ready to accept connections
identity-postgres        | 2024-07-14 18:18:58.567 UTC [71] ERROR:  relation "schema_migration" does not exist at character 15
identity-postgres        | 2024-07-14 18:18:58.567 UTC [71] STATEMENT:  SELECT * FROM schema_migration
identity-postgres        | 2024-07-14 18:18:58.568 UTC [71] ERROR:  relation "schema_migration" does not exist at character 21
identity-postgres        | 2024-07-14 18:18:58.568 UTC [71] STATEMENT:  select version from schema_migration
identity-postgres        | 2024-07-14 18:18:58.589 UTC [72] ERROR:  relation "networks" does not exist at character 67
identity-postgres        | 2024-07-14 18:18:58.589 UTC [72] STATEMENT:  SELECT networks.created_at, networks.id, networks.updated_at FROM networks AS networks ORDER BY created_at ASC LIMIT 1
identity-kratos          | {"audience":"application","error":{"debug":"","message":"Unable to locate the table","reason":"","stack_trace":"\ngithub.com/ory/x/sqlcon.handlePostgres\n\t/go/pkg/mod/github.com/ory/x@v0.0.623/sqlcon/error.go:59\ngithub.com/ory/x/sqlcon.HandleError\n\t/go/pkg/mod/github.com/ory/x@v0.0.623/sqlcon/error.go:78\ngithub.com/ory/x/networkx.(*Manager).Determine\n\t/go/pkg/mod/github.com/ory/x@v0.0.623/networkx/manager.go:46\ngithub.com/ory/kratos/persistence/sql.(*Persister).DetermineNetwork\n\t/project/persistence/sql/persister.go:133\ngithub.com/ory/kratos/driver.(*RegistryDefault).Init.func1\n\t/project/driver/registry_default.go:696\ngithub.com/cenkalti/backoff.RetryNotify\n\t/go/pkg/mod/github.com/cenkalti/backoff@v2.2.1+incompatible/retry.go:37\ngithub.com/cenkalti/backoff.Retry\n\t/go/pkg/mod/github.com/cenkalti/backoff@v2.2.1+incompatible/retry.go:24\ngithub.com/ory/kratos/driver.(*RegistryDefault).Init\n\t/project/driver/registry_default.go:645\ngithub.com/ory/kratos/driver.New\n\t/project/driver/factory.go:24\ngithub.com/ory/kratos/cmd/serve.NewServeCmd.func1\n\t/project/cmd/serve/root.go:26\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992\ngithub.com/ory/kratos/cmd.Execute\n\t/project/cmd/root.go:70\nmain.main\n\t/project/main.go:14\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:271\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695","status":"Internal Server Error","status_code":500},"file":"/go/pkg/mod/github.com/ory/x@v0.0.623/logrusx/helper.go:125","func":"github.com/ory/x/logrusx.(*Logger).Logf","level":"warning","msg":"Unable to determine network, retrying.","service_name":"Ory Kratos","service_version":"v1.2.0","time":"2024-07-14T18:18:58.597023651Z"}
identity-kratos          | {"audience":"application","connMaxLifetime":0,"file":"/project/driver/registry_default.go:653","func":"github.com/ory/kratos/driver.(*RegistryDefault).Init.func1","idlePool":4,"level":"debug","msg":"Connecting to SQL Database","pool":20,"service_name":"Ory Kratos","service_version":"v1.2.0","time":"2024-07-14T18:18:58.97200175Z"}
identity-kratos          | {"audience":"application","file":"/go/pkg/mod/github.com/ory/x@v0.0.623/metricsx/middleware.go:176","func":"github.com/ory/x/metricsx.New","level":"info","msg":"Software quality assurance features are enabled. Learn more at: https://www.ory.sh/docs/ecosystem/sqa","service_name":"Ory Kratos","service_version":"v1.2.0","time":"2024-07-14T18:19:00.80622737Z"}
identity-kratos          | {"audience":"application","file":"/go/pkg/mod/github.com/ory/x@v0.0.623/logrusx/helper.go:125","func":"github.com/ory/x/logrusx.(*Logger).Logf","level":"info","msg":"TLS has not been configured for public, skipping","service_name":"Ory Kratos","service_version":"v1.2.0","time":"2024-07-14T18:19:00.845642044Z"}
identity-kratos          | {"audience":"application","file":"/go/pkg/mod/github.com/ory/x@v0.0.623/logrusx/helper.go:125","func":"github.com/ory/x/logrusx.(*Logger).Logf","level":"info","msg":"TLS has not been configured for admin, skipping","service_name":"Ory Kratos","service_version":"v1.2.0","time":"2024-07-14T18:19:00.875498589Z"}
identity-kratos          | {"audience":"application","file":"/project/cmd/courier/watch.go:94","func":"github.com/ory/kratos/cmd/courier.Watch","level":"info","msg":"Courier worker started.","service_name":"Ory Kratos","service_version":"v1.2.0","time":"2024-07-14T18:19:00.875758107Z"}
identity-kratos          | {"audience":"application","file":"/go/pkg/mod/github.com/ory/x@v0.0.623/logrusx/helper.go:125","func":"github.com/ory/x/logrusx.(*Logger).Logf","level":"info","msg":"Starting the public httpd on: 0.0.0.0:4433","service_name":"Ory Kratos","service_version":"v1.2.0","time":"2024-07-14T18:19:00.876501429Z"}
identity-kratos          | {"audience":"application","file":"/go/pkg/mod/github.com/ory/x@v0.0.623/logrusx/helper.go:125","func":"github.com/ory/x/logrusx.(*Logger).Logf","level":"info","msg":"Starting the admin httpd on: 0.0.0.0:4434","service_name":"Ory Kratos","service_version":"v1.2.0","time":"2024-07-14T18:19:00.881598164Z"}
identity-postgres        | 2024-07-14 18:19:00.888 UTC [73] ERROR:  column courier_messages.channel does not exist at character 31
identity-postgres        | 2024-07-14 18:19:00.888 UTC [73] STATEMENT:  SELECT courier_messages.body, courier_messages.channel, courier_messages.created_at, courier_messages.id, courier_messages.nid, courier_messages.recipient, courier_messages.send_count, courier_messages.status, courier_messages.subject, courier_messages.template_data, courier_messages.template_type, courier_messages.type, courier_messages.updated_at FROM courier_messages AS courier_messages WHERE nid = $1 AND status = $2 ORDER BY created_at ASC LIMIT 10
identity-postgres        | 2024-07-14 18:19:01.657 UTC [73] ERROR:  column courier_messages.channel does not exist at character 31
identity-postgres        | 2024-07-14 18:19:01.657 UTC [73] STATEMENT:  SELECT courier_messages.body, courier_messages.channel, courier_messages.created_at, courier_messages.id, courier_messages.nid, courier_messages.recipient, courier_messages.send_count, courier_messages.status, courier_messages.subject, courier_messages.template_data, courier_messages.template_type, courier_messages.type, courier_messages.updated_at FROM courier_messages AS courier_messages WHERE nid = $1 AND status = $2 ORDER BY created_at ASC LIMIT 10
identity-postgres        | 2024-07-14 18:19:02.755 UTC [73] ERROR:  column courier_messages.channel does not exist at character 31
identity-postgres        | 2024-07-14 18:19:02.755 UTC [73] STATEMENT:  SELECT courier_messages.body, courier_messages.channel, courier_messages.created_at, courier_messages.id, courier_messages.nid, courier_messages.recipient, courier_messages.send_count, courier_messages.status, courier_messages.subject, courier_messages.template_data, courier_messages.template_type, courier_messages.type, courier_messages.updated_at FROM courier_messages AS courier_messages WHERE nid = $1 AND status = $2 ORDER BY created_at ASC LIMIT 10
identity-postgres        | 2024-07-14 18:19:04.222 UTC [73] ERROR:  column courier_messages.channel does not exist at character 31
identity-postgres        | 2024-07-14 18:19:04.222 UTC [73] STATEMENT:  SELECT courier_messages.body, courier_messages.channel, courier_messages.created_at, courier_messages.id, courier_messages.nid, courier_messages.recipient, courier_messages.send_count, courier_messages.status, courier_messages.subject, courier_messages.template_data, courier_messages.template_type, courier_messages.type, courier_messages.updated_at FROM courier_messages AS courier_messages WHERE nid = $1 AND status = $2 ORDER BY created_at ASC LIMIT 10
identity-postgres        | 2024-07-14 18:19:05.238 UTC [73] ERROR:  column courier_messages.channel does not exist at character 31
identity-postgres        | 2024-07-14 18:19:05.238 UTC [73] STATEMENT:  SELECT courier_messages.body, courier_messages.channel, courier_messages.created_at, courier_messages.id, courier_messages.nid, courier_messages.recipient, courier_messages.send_count, courier_messages.status, courier_messages.subject, courier_messages.template_data, courier_messages.template_type, courier_messages.type, courier_messages.updated_at FROM courier_messages AS courier_messages WHERE nid = $1 AND status = $2 ORDER BY created_at ASC LIMIT 10
identity-kratos-migrate  | Successfully applied SQL migrations!
identity-kratos-migrate exited with code 0

Relevant configuration

version: '3.7'
services:
  kratos-migrate:
    image: oryd/kratos:v1.2.0
    container_name: identity-kratos-migrate
    env_file:
      - .env
    depends_on:
      - postgresd 
    volumes:
      - type: volume
        source: kratos-postgres
        target: /var/lib/postgres
        read_only: false
      - type: bind
        source: /var/kratos/config
        target: /etc/config/kratos
    command: -c /etc/config/kratos/kratos.yml migrate sql -e --yes
    restart: on-failure
    networks:
      - intranet
  kratos:
    depends_on:
      - kratos-migrate
    image: oryd/kratos:v1.2.0
    container_name: identity-kratos
    ports:
      - '4433:4433' # public
      - '4434:4434' # admin
    restart: unless-stopped
    env_file:
      - .env
    environment:
      LOG_LEVEL: trace
    command: serve -c /etc/config/kratos/kratos.yml --watch-courier
    volumes:
      - type: volume
        source: kratos-postgres
        target: /var/lib/postgres
        read_only: false
      - type: bind
        source: ./var/kratos/config
        target: /etc/config/kratos
    networks:
      - intranet
  postgresd:
    image: postgres:11.8
    container_name: identity-postgres
    ports:
      - "5432:5432"
    env_file:
      - .env
    environment:
      POSTGRES_USER: kratos
      POSTGRES_DB: kratos
    networks:
      - intranet

networks:
  intranet:

volumes:
  kratos-postgres:

Version

1.2.0

On which operating system are you observing this issue?

Linux

In which environment are you deploying?

Docker Compose

Additional Context

No response

mtillberg commented 3 months ago

You have a few issues here. First, the kratos-postgres volume should be mounted on the postgresd container, not the kratos containers. If you based this on the quickstart.yml, the sqlite volumes are shared between the kratos containers because sqlite is based on local files.

Second, you should look at: https://docs.docker.com/compose/startup-order/ depends_on by default only deals with execution order. In this case, the logs show that kratos is starting before kratos-migration finishes. A better solution is probably to have:

kratos:
  depends_on:
    kratos_migrate:
      condition: service_completed_successfully

so that kratos waits for the migration to complete before trying to run.

asifshiraz commented 3 months ago

Thanks. That fixed it