hyperledger / identus-cloud-agent

Identus Cloud Agent
https://docs.atalaprism.io/
Apache License 2.0
72 stars 18 forks source link

issuer-db-1 is "healthy" but logs show FATAL on all DB, then issuer agent fails to start (no DB) #1173

Open incubiq opened 1 month ago

incubiq commented 1 month ago

Is this a regression?

No

Description

No idea if this specific error is a regression, but the cmd line to run agents that used to work does not work anymore, and the DB has Fatal errors despite showing up "healthy".

I run this command to load all agents: DOCKERHOST=host.docker.internal ADMIN_TOKEN=my-admin-token API_KEY_ENABLED=true API_KEY_AUTO_PROVISIONING=false API_KEY_AUTHENTICATE_AS_DEFAULT_USER=false DEFAULT_WALLET_ENABLED=false PORT=8100 PRISM_AGENT_VERSION=1.28.0 PRISM_NODE_VERSION=2.2.1 VAULT_DEV_ROOT_TOKEN_ID=root PG_PORT=5432 docker compose -p "issuer" -f ./infrastructure/shared/docker-compose.yml up --wait

Note that I am using an older (safer?) version 1.28.0 which used to work well.

when I run the command, I got this log for the issuer-db-1 (see last 3 lines) although the image is stated to be "healthy":

2024-06-11 08:16:21.593 UTC [1] LOG: starting PostgreSQL 13.15 (Debian 13.15-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit 2024-06-11 08:16:21.594 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432 2024-06-11 08:16:21.594 UTC [1] LOG: listening on IPv6 address "::", port 5432 2024-06-11 08:16:21.620 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2024-06-11 08:16:21.632 UTC [27] LOG: database system was interrupted; last known up at 2024-06-11 08:14:18 UTC 2024-06-11 08:16:23.839 UTC [27] LOG: database system was not properly shut down; automatic recovery in progress 2024-06-11 08:16:23.843 UTC [27] LOG: invalid record length at 0/15D2DF0: wanted 24, got 0 2024-06-11 08:16:23.843 UTC [27] LOG: redo is not required 2024-06-11 08:16:23.861 UTC [1] LOG: database system is ready to accept connections 2024-06-11 08:16:31.622 UTC [40] FATAL: database "agent" does not exist 2024-06-11 08:16:35.666 UTC [41] FATAL: database "pollux" does not exist 2024-06-11 08:16:35.784 UTC [42] FATAL: database "node_db" does not exist

Then the "issuer-prism-agent-1" is failing because "the DB does not exist": Failed to initialize pool: FATAL: database "pollux" does not exist

so I have 2 problems here: 1/ it used to work, what changed? 2/ it should not say it is healthy if it failed with FATAL in the logs

Please provide the exception or error you saw

see above + those logs for issuer-prism-agent-1:

2024-06-11_08:54:16.426 INFO  i.i.a.a.s.MainApp@L169:[zio-default-blocking-1] {} - zio-fiber-4 running pre-migration steps.
2024-06-11_08:54:17.015 INFO  c.z.h.HikariDataSource@L80:[ZScheduler-Worker-2] {} - DBPool - Starting...
2024-06-11_08:54:18.097 ERROR c.z.h.p.HikariPool@L594:[ZScheduler-Worker-2] {} - DBPool - Exception during pool initialization.org.postgresql.util.PSQLException: FATAL: database "pollux" does not exist
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2674)
        at org.postgresql.core.v3.QueryExecutorImpl.readStartupMessages(QueryExecutorImpl.java:2786)
        at org.postgresql.core.v3.QueryExecutorImpl.<init>(QueryExecutorImpl.java:172)
        at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:271)
        at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:51)
        at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:225)
        at org.postgresql.Driver.makeConnection(Driver.java:466)
        at org.postgresql.Driver.connect(Driver.java:265)
        at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:138)
        at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:364)
        at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:206)
        at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:476)
        at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:561)
        at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:115)
        at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:81)
        at doobie.hikari.HikariTransactor$.fromHikariConfig$$anonfun$1(HikariTransactor.scala:79)
        at zio.interop.ZioAsync.delay$$anonfun$1(ZioAsync.scala:32)
        at zio.ZIOCompanionVersionSpecific.attempt$$anonfun$1(ZIOCompanionVersionSpecific.scala:103)
        at zio.ZIO$.suspendSucceed$$anonfun$1(ZIO.scala:4638)
        at zio.UnsafeVersionSpecific.implicitFunctionIsFunction$$anonfun$1(UnsafeVersionSpecific.scala:27)
        at zio.Unsafe$.unsafe(Unsafe.scala:37)
        at zio.ZIOCompanionVersionSpecific.succeed$$anonfun$1(ZIOCompanionVersionSpecific.scala:185)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:904)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1029)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1029)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1029)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1029)
        at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
        at zio.internal.FiberRuntime.evaluateEffect(FiberRuntime.scala:381)
        at zio.internal.FiberRuntime.evaluateMessageWhileSuspended(FiberRuntime.scala:504)
        at zio.internal.FiberRuntime.drainQueueOnCurrentThread(FiberRuntime.scala:220)
        at zio.internal.FiberRuntime.run(FiberRuntime.scala:139)
        at zio.internal.ZScheduler$$anon$4.run(ZScheduler.scala:478)

Please provide the environment you discovered this bug in

- using prism agent 1.28.0 (which used to work before)
 - moved to a new machine (ubuntu 22.04.4)  but ran the same prep config I did before on the other machine that worked
 - cannot get past this error to run the agents

Anything else?

No response

incubiq commented 1 month ago

I have now tested also with latest agent (v1.33.0) and I have a similar situation (FATAL in the DB, but "healthy" DB image)

PostgreSQL Database directory appears to contain a database; Skipping initialization

2024-06-11 09:47:13.901 UTC [1] LOG: starting PostgreSQL 13.15 (Debian 13.15-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit 2024-06-11 09:47:13.901 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432 2024-06-11 09:47:13.901 UTC [1] LOG: listening on IPv6 address "::", port 5432 2024-06-11 09:47:13.906 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2024-06-11 09:47:13.912 UTC [27] LOG: database system was shut down at 2024-06-11 09:42:31 UTC 2024-06-11 09:47:13.917 UTC [1] LOG: database system is ready to accept connections 2024-06-11 09:47:23.926 UTC [40] FATAL: database "agent" does not exist 2024-06-11 09:47:27.673 UTC [41] FATAL: database "pollux" does not exist 2024-06-11 09:47:27.951 UTC [42] FATAL: database "node_db" does not exist 2024-06-11 09:47:33.976 UTC [49] FATAL: database "agent" does not exist

incubiq commented 1 month ago

Sorry - it is NOT closed, it was just a click issue...

mineme0110 commented 1 month ago

@incubiq If you are on discord I would like to see if you can share your screen, I am not able to produce the error

mineme0110 commented 1 month ago

I have tried the above command DOCKERHOST=host.docker.internal ADMIN_TOKEN=my-admin-token API_KEY_ENABLED=true API_KEY_AUTO_PROVISIONING=false API_KEY_AUTHENTICATE_AS_DEFAULT_USER=false DEFAULT_WALLET_ENABLED=false PORT=8100 PRISM_AGENT_VERSION=1.28.0 PRISM_NODE_VERSION=2.2.1 VAULT_DEV_ROOT_TOKEN_ID=root PG_PORT=5432 docker compose -p "issuer" -f ./infrastructure/shared/docker-compose.yml up --wait

I was able to run so not sure what is the issue

2024-06-11 17:29:25 2024-06-11 17:29:25 ██████╗ ██████╗ ██╗███████╗███╗ ███╗ 2024-06-11 17:29:25 ██╔══██╗██╔══██╗██║██╔════╝████╗ ████║ 2024-06-11 17:29:25 ██████╔╝██████╔╝██║███████╗██╔████╔██║ 2024-06-11 17:29:25 ██╔═══╝ ██╔══██╗██║╚════██║██║╚██╔╝██║ 2024-06-11 17:29:25 ██║ ██║ ██║██║███████║██║ ╚═╝ ██║ 2024-06-11 17:29:25 ╚═╝ ╚═╝ ╚═╝╚═╝╚══════╝╚═╝ ╚═╝ 2024-06-11 17:29:25 2024-06-11 17:29:25 █████╗ ██████╗ ███████╗███╗ ██╗████████╗ 2024-06-11 17:29:25 ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝ 2024-06-11 17:29:25 ███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║ 2024-06-11 17:29:25 ██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║ 2024-06-11 17:29:25 ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║ 2024-06-11 17:29:25 ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝ 2024-06-11 17:29:25 2024-06-11 17:29:25 version: 2.0.19 2024-06-11 17:29:25 2024-06-11 17:29:25 2024-06-11 17:29:25 2024-06-11_16:29:25.012 INFO i.i.a.a.s.MainApp@L169:[zio-default-blocking-1] {} - zio-fiber-4 running pre-migration steps.
2024-06-11 17:29:25 2024-06-11_16:29:25.325 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-13] {} - DBPool - Starting... 2024-06-11 17:29:25 2024-06-11_16:29:25.386 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-13] {} - DBPool - Start completed. 2024-06-11 17:29:25 2024-06-11_16:29:25.532 INFO c.z.h.HikariDataSource@L350:[ZScheduler-Worker-2] {} - DBPool - Shutdown initiated... 2024-06-11 17:29:25 2024-06-11_16:29:25.536 INFO c.z.h.HikariDataSource@L352:[ZScheduler-Worker-2] {} - DBPool - Shutdown completed. 2024-06-11 17:29:25 2024-06-11_16:29:25.598 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-15] {} - DBPool - Starting... 2024-06-11 17:29:25 2024-06-11_16:29:25.603 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-15] {} - DBPool - Start completed. 2024-06-11 17:29:25 2024-06-11_16:29:25.607 INFO c.z.h.HikariDataSource@L350:[ZScheduler-Worker-3] {} - DBPool - Shutdown initiated... 2024-06-11 17:29:25 2024-06-11_16:29:25.608 INFO c.z.h.HikariDataSource@L352:[ZScheduler-Worker-3] {} - DBPool - Shutdown completed. 2024-06-11 17:29:25 2024-06-11_16:29:25.633 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-5] {} - DBPool - Starting... 2024-06-11 17:29:25 2024-06-11_16:29:25.636 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-5] {} - DBPool - Start completed. 2024-06-11 17:29:25 2024-06-11_16:29:25.640 INFO c.z.h.HikariDataSource@L350:[ZScheduler-Worker-1] {} - DBPool - Shutdown initiated... 2024-06-11 17:29:25 2024-06-11_16:29:25.640 INFO c.z.h.HikariDataSource@L352:[ZScheduler-Worker-1] {} - DBPool - Shutdown completed. 2024-06-11 17:29:25 2024-06-11_16:29:25.643 INFO i.i.a.p.s.r.Migrations@L169:[ZScheduler-Worker-13] {} - zio-fiber-4 Applying database migrations
2024-06-11 17:29:25 2024-06-11_16:29:25.664 INFO o.f.c.i.l.VersionPrinter@L37:[ZScheduler-Worker-13] {} - Flyway Community Edition 9.8.3 by Redgate 2024-06-11 17:29:25 2024-06-11_16:29:25.664 INFO o.f.c.i.l.VersionPrinter@L37:[ZScheduler-Worker-13] {} - See what's new here: https://flywaydb.org/documentation/learnmore/releaseNotes#9.8.3 2024-06-11 17:29:25 2024-06-11_16:29:25.664 INFO o.f.c.i.l.VersionPrinter@L37:[ZScheduler-Worker-13] {} - 2024-06-11 17:29:25 2024-06-11_16:29:25.675 INFO o.f.c.i.d.b.BaseDatabaseType@L37:[ZScheduler-Worker-13] {} - Database: jdbc:postgresql://db:5432/pollux (PostgreSQL 13.14) 2024-06-11 17:29:25 2024-06-11_16:29:25.697 INFO o.f.c.i.c.DbValidate@L37:[ZScheduler-Worker-13] {} - Successfully validated 13 migrations (execution time 00:00.012s) 2024-06-11 17:29:25 2024-06-11_16:29:25.702 INFO o.f.c.i.s.JdbcTableSchemaHistory@L37:[ZScheduler-Worker-13] {} - Creating Schema History table "public"."flyway_schema_history" ... 2024-06-11 17:29:25 2024-06-11_16:29:25.723 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Current version of schema "public": << Empty Schema >> 2024-06-11 17:29:25 2024-06-11_16:29:25.729 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "1 - init tables" 2024-06-11 17:29:25 2024-06-11_16:29:25.747 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "3 - schema registry tables" 2024-06-11 17:29:25 2024-06-11_16:29:25.764 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "5 - create protocol state index" 2024-06-11 17:29:25 2024-06-11_16:29:25.769 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "6 - verification policy tables" 2024-06-11 17:29:25 2024-06-11_16:29:25.785 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "8 - credential schema tables" 2024-06-11 17:29:25 2024-06-11_16:29:25.798 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "9 - issue credential drop subjectId not null" 2024-06-11 17:29:25 2024-06-11_16:29:25.804 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "10 - presentation records add thid unique constraint" 2024-06-11 17:29:25 2024-06-11_16:29:25.808 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "11 - issue credential extend schema id" 2024-06-11 17:29:25 2024-06-11_16:29:25.812 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "12 - verification policies reset nonce" 2024-06-11 17:29:25 2024-06-11_16:29:25.816 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "13 - credential schema tables" 2024-06-11 17:29:25 2024-06-11_16:29:25.830 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "14 - add rls policy" 2024-06-11 17:29:25 2024-06-11_16:29:25.843 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "15 - add anoncreds columns" 2024-06-11 17:29:25 2024-06-11_16:29:25.848 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "16 - revocation status lists table and columns" 2024-06-11 17:29:25 2024-06-11_16:29:25.855 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Successfully applied 13 migrations to schema "public", now at version v16 (execution time 00:00.135s) 2024-06-11 17:29:25 2024-06-11_16:29:25.859 INFO i.i.a.c.s.r.Migrations@L169:[ZScheduler-Worker-13] {} - zio-fiber-4 Applying database migrations
2024-06-11 17:29:25 2024-06-11_16:29:25.865 INFO o.f.c.i.d.b.BaseDatabaseType@L37:[ZScheduler-Worker-13] {} - Database: jdbc:postgresql://db:5432/connect (PostgreSQL 13.14) 2024-06-11 17:29:25 2024-06-11_16:29:25.870 INFO o.f.c.i.c.DbValidate@L37:[ZScheduler-Worker-13] {} - Successfully validated 7 migrations (execution time 00:00.002s) 2024-06-11 17:29:25 2024-06-11_16:29:25.872 INFO o.f.c.i.s.JdbcTableSchemaHistory@L37:[ZScheduler-Worker-13] {} - Creating Schema History table "public"."flyway_schema_history" ... 2024-06-11 17:29:25 2024-06-11_16:29:25.885 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Current version of schema "public": << Empty Schema >> 2024-06-11 17:29:25 2024-06-11_16:29:25.886 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "1 - init tables" 2024-06-11 17:29:25 2024-06-11_16:29:25.891 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "2 - add thid unique constraint" 2024-06-11 17:29:25 2024-06-11_16:29:25.895 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "3 - add meta retries and meta last failure" 2024-06-11 17:29:25 2024-06-11_16:29:25.898 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "4 - create protocol state index" 2024-06-11 17:29:25 2024-06-11_16:29:25.901 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "5 - add meta next retry" 2024-06-11 17:29:25 2024-06-11_16:29:25.903 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "6 - add rls policy" 2024-06-11 17:29:25 2024-06-11_16:29:25.910 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "7 - add goal and goal code" 2024-06-11 17:29:25 2024-06-11_16:29:25.912 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Successfully applied 7 migrations to schema "public", now at version v7 (execution time 00:00.028s) 2024-06-11 17:29:25 2024-06-11_16:29:25.915 INFO i.i.a.a.s.s.Migrations@L169:[ZScheduler-Worker-13] {} - zio-fiber-4 Applying database migrations
2024-06-11 17:29:25 2024-06-11_16:29:25.920 INFO o.f.c.i.d.b.BaseDatabaseType@L37:[ZScheduler-Worker-13] {} - Database: jdbc:postgresql://db:5432/agent (PostgreSQL 13.14) 2024-06-11 17:29:25 2024-06-11_16:29:25.925 INFO o.f.c.i.c.DbValidate@L37:[ZScheduler-Worker-13] {} - Successfully validated 13 migrations (execution time 00:00.003s) 2024-06-11 17:29:25 2024-06-11_16:29:25.927 INFO o.f.c.i.s.JdbcTableSchemaHistory@L37:[ZScheduler-Worker-13] {} - Creating Schema History table "public"."flyway_schema_history" ... 2024-06-11 17:29:25 2024-06-11_16:29:25.934 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Current version of schema "public": << Empty Schema >> 2024-06-11 17:29:25 2024-06-11_16:29:25.935 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "1 - init tables" 2024-06-11 17:29:25 2024-06-11_16:29:25.939 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "2 - did nonsecret storage" 2024-06-11 17:29:25 2024-06-11_16:29:25.944 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "3 - did update state" 2024-06-11 17:29:25 2024-06-11_16:29:25.951 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "4 - did hd key" 2024-06-11 17:29:25 2024-06-11_16:29:25.957 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "5 - remove did rand key" 2024-06-11 17:29:25 2024-06-11_16:29:25.961 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "6 - add schema id to secret table" 2024-06-11 17:29:25 2024-06-11_16:29:25.964 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "7 - mutitenant wallet" 2024-06-11 17:29:25 2024-06-11_16:29:25.974 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "8 - init entity repository tables" 2024-06-11 17:29:25 2024-06-11_16:29:25.978 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "9 - init authentication repository tables" 2024-06-11 17:29:25 2024-06-11_16:29:25.983 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "10 - create peer did table" 2024-06-11 17:29:25 2024-06-11_16:29:25.989 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "11 - wallet seed digest" 2024-06-11 17:29:25 2024-06-11_16:29:25.994 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "12 - generic secret" 2024-06-11 17:29:26 2024-06-11_16:29:25.999 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Migrating schema "public" to version "13 - apikey authentication improvements" 2024-06-11 17:29:26 2024-06-11_16:29:26.002 INFO o.f.c.i.c.DbMigrate@L37:[ZScheduler-Worker-13] {} - Successfully applied 13 migrations to schema "public", now at version v13 (execution time 00:00.069s) 2024-06-11 17:29:26 2024-06-11_16:29:26.004 INFO i.i.a.a.s.MainApp@L169:[ZScheduler-Worker-13] {} - zio-fiber-4 Running post-migration RLS checks for DB application users
2024-06-11 17:29:26 2024-06-11_16:29:26.049 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-5] {} - DBPool - Starting... 2024-06-11 17:29:26 2024-06-11_16:29:26.052 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-5] {} - DBPool - Start completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.074 INFO c.z.h.HikariDataSource@L350:[ZScheduler-Worker-11] {} - DBPool - Shutdown initiated... 2024-06-11 17:29:26 2024-06-11_16:29:26.075 INFO c.z.h.HikariDataSource@L352:[ZScheduler-Worker-11] {} - DBPool - Shutdown completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.097 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-10] {} - DBPool - Starting... 2024-06-11 17:29:26 2024-06-11_16:29:26.100 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-10] {} - DBPool - Start completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.105 INFO c.z.h.HikariDataSource@L350:[ZScheduler-Worker-8] {} - DBPool - Shutdown initiated... 2024-06-11 17:29:26 2024-06-11_16:29:26.106 INFO c.z.h.HikariDataSource@L352:[ZScheduler-Worker-8] {} - DBPool - Shutdown completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.124 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-3] {} - DBPool - Starting... 2024-06-11 17:29:26 2024-06-11_16:29:26.127 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-3] {} - DBPool - Start completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.133 INFO c.z.h.HikariDataSource@L350:[ZScheduler-Worker-10] {} - DBPool - Shutdown initiated... 2024-06-11 17:29:26 2024-06-11_16:29:26.133 INFO c.z.h.HikariDataSource@L352:[ZScheduler-Worker-10] {} - DBPool - Shutdown completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.216 INFO i.i.a.a.s.RepoModule@L169:[ZScheduler-Worker-3] {} - zio-fiber-200 Using 'postgres' as a secret storage backend
2024-06-11 17:29:26 2024-06-11_16:29:26.238 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-7] {} - DBPool - Starting... 2024-06-11 17:29:26 2024-06-11_16:29:26.241 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-6] {} - DBPool - Starting... 2024-06-11 17:29:26 2024-06-11_16:29:26.243 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-8] {} - DBPool - Starting... 2024-06-11 17:29:26 2024-06-11_16:29:26.246 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-6] {} - DBPool - Start completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.246 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-7] {} - DBPool - Start completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.248 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-8] {} - DBPool - Start completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.310 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-5] {} - DBPool - Starting... 2024-06-11 17:29:26 2024-06-11_16:29:26.314 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-11] {} - DBPool - Starting... 2024-06-11 17:29:26 2024-06-11_16:29:26.317 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-11] {} - DBPool - Start completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.319 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-5] {} - DBPool - Start completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.351 INFO c.z.h.HikariDataSource@L80:[ZScheduler-Worker-9] {} - DBPool - Starting... 2024-06-11 17:29:26 2024-06-11_16:29:26.355 INFO c.z.h.HikariDataSource@L82:[ZScheduler-Worker-9] {} - DBPool - Start completed. 2024-06-11 17:29:26 2024-06-11_16:29:26.534 INFO i.i.a.a.s.AgentInitialization@L169:[ZScheduler-Worker-5] {} - zio-fiber-4 Initializing default wallet.
2024-06-11 17:29:26 2024-06-11_16:29:26.549 INFO i.i.a.a.s.AgentInitialization@L169:[zio-default-blocking-1] {} - zio-fiber-4 Default wallet not enabled.
2024-06-11 17:29:26 2024-06-11_16:29:26.559 INFO i.i.a.a.s.DidCommHttpServer@L169:[ZScheduler-Worker-9] {} - zio-fiber-753 Server Started on port 8090
2024-06-11 17:29:26 2024-06-11_16:29:26.877 INFO i.i.a.a.n.WebhookPublisher@L169:[ZScheduler-Worker-9] {} - zio-fiber-820 Polling 1 event(s)
2024-06-11 17:29:26 2024-06-11_16:29:26.877 INFO i.i.a.a.n.WebhookPublisher@L169:[ZScheduler-Worker-3] {} - zio-fiber-822 Polling 1 event(s)
2024-06-11 17:29:26 2024-06-11_16:29:26.877 INFO i.i.a.a.n.WebhookPublisher@L169:[ZScheduler-Worker-2] {} - zio-fiber-821 Polling 1 event(s)
2024-06-11 17:29:26 () 2024-06-11 17:29:26 2024-06-11_16:29:26.877 INFO i.i.a.a.n.WebhookPublisher@L169:[ZScheduler-Worker-0] {} - zio-fiber-823 Polling 1 event(s)
2024-06-11 17:29:27 2024-06-11_16:29:27.652 INFO o.h.b.c.n.NIO1SocketServerGroup@L276:[ZScheduler-Worker-5] {} - Service bound to address /[0:0:0:0:0:0:0:0]:8085 2024-06-11 17:29:27 2024-06-1116:29:27.654 INFO o.h.b.s.BlazeServerBuilder@L422:[ZScheduler-Worker-9] {} - 2024-06-11 17:29:27 2024-06-11 17:29:27 | || || | | | | 2024-06-11 17:29:27 | ' \ | | ' _ (-< 2024-06-11 17:29:27 |||\|__| ._/ ||// 2024-06-11 17:29:27 |_| 2024-06-11 17:29:27 2024-06-11_16:29:27.684 INFO o.h.b.s.BlazeServerBuilder@L425:[ZScheduler-Worker-9] {} - http4s v0.23.23 on blaze v0.23.12 started at http://[::]:8085/ 2024-06-11 17:29:54 2024-06-11_16:29:54.196 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-8] {} - zio-fiber-891 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:30:24 2024-06-11_16:30:24.301 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-5] {} - zio-fiber-913 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:30:54 2024-06-11_16:30:54.347 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-2] {} - zio-fiber-938 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:31:24 2024-06-11_16:31:24.416 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-2] {} - zio-fiber-960 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:31:54 2024-06-11_16:31:54.463 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-1] {} - zio-fiber-985 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:32:24 2024-06-11_16:32:24.532 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-11] {} - zio-fiber-1013 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:32:54 2024-06-11_16:32:54.598 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-6] {} - zio-fiber-1038 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:33:24 2024-06-11_16:33:24.654 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-0] {} - zio-fiber-1060 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:33:54 2024-06-11_16:33:54.695 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-12] {} - zio-fiber-1088 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:34:24 2024-06-11_16:34:24.725 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-2] {} - zio-fiber-1121 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:34:54 2024-06-11_16:34:54.779 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-9] {} - zio-fiber-1146 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:35:24 2024-06-11_16:35:24.841 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-14] {} - zio-fiber-1177 Trace SystemServerEndpoints.healthEndpoint
2024-06-11 17:35:54 2024-06-11_16:35:54.907 INFO i.i.a.s.c.SystemServerEndpoints@L169:[ZScheduler-Worker-11] {} - zio-fiber-1202 Trace SystemServerEndpoints.healthEndpoint

incubiq commented 1 month ago

Hi - thanks for your support. I believe it is a config issue, but i cannot understand what. my discord: incubiq

incubiq commented 1 month ago

Hello @mineme0110 - any news? I tried to connect with you on Discord.

incubiq commented 4 weeks ago

Hi - I give you an update here. I have removed all docker images, volumes, etc.. .from my machine, and restarted the same command, although this time on the latest (agent node 1.33.0). And this time... no error creating the DB. There is no explanation why the first time there was an error (which then persists each time I run the docker command - like the DB may have been created but corrupt?), and a second attempt creates all correctly.

It means that:

incubiq commented 4 weeks ago

Hi - The more I go deeper, the more I have different problems, so it is likely that this ticket should then split into several ones. For now, I will keep the ticket for the "problem 2" which is that DB can be with state "healthy" when in fact it has several FATAL errors. However, I think the problem 1 happens with PRISM (as opposed to Identus) docker nodes, and is more detailed here:

Problem 1 I tried to re-install past working environments (the PRISM ones, like v1.28 or v1.31), but they all fail on the DB creation. The error I get in the DB logs is this 'docker-entrypoint-initdb.d/init-script.sh: cannot execute: required file not found' ; not really clear which file is not found (could be a clearer message to understand what problem there is). This is happening with both v1.28 and 1.31. see example of log here below:

waiting for server to start....2024-06-14 09:03:04.976 UTC [48] LOG: starting PostgreSQL 13.15 (Debian 13.15-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit 2024-06-14 09:03:04.979 UTC [48] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2024-06-14 09:03:04.985 UTC [49] LOG: database system was shut down at 2024-06-14 09:03:04 UTC 2024-06-14 09:03:04.989 UTC [48] LOG: database system is ready to accept connections done server started

/usr/local/bin/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/init-script.sh /usr/local/bin/docker-entrypoint.sh: line 174: /docker-entrypoint-initdb.d/init-script.sh: cannot execute: required file not found

yshyn-iohk commented 3 weeks ago

Hi, @incubiq. There is one thing that I can help with.

The file init-script.sh should be removed as the corresponding logic was moved to the agent, and the script itself is not idempotent. I will make a corresponding PR for this.

incubiq commented 3 weeks ago

Hi @yshyn-iohk - thanks. At the moment, I have the version 1.33.0 working properly (not tested all APIs yet, but so far so good).

I think there are 2 problems left, although with a lower priority:

1/ I would think the docker instance should not be "healthy" if there is a FATAL in the program running in the image

2/ I have still not managed to make the older versions (v1.28.0, or v1.31.0) work, but since the recent one works, it is good enough for me.

I let you review if you wish to close the ticket or keep open for at least one of those problems. Thank you