jitsucom / jitsu

Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.
https://jitsu.com
MIT License
4.11k stars, 292 forks

Console healthcheck fails when attached to Postgres versions >14 on RDS #1130

Closed johnpitchko closed 2 weeks ago

johnpitchko commented 1 month ago

Summary

https://jitsuhq.slack.com/archives/C018G6W0URG/p1726784323537209

Running Jitsu on AWS ECS attached to Postgres on RDS. When using a version >14, the console healthcheck fails on this line.

Confirmed that both PG instances use the same username and password and both are accessible via public internet.

Note that v14 of Postgres on RDS does not experience this issue. I also could not replicate using postgres:16 on my local Docker.

System configuration and versions

Using latest version of all Jitsu images.

Artifacts (logs, etc)

console log using PG 16 on RDS:

2024-09-20 06:40:51 FORCE_UPDATE_DB is set, updating database schema...
2024-09-20 06:40:52 Prisma schema loaded from schema.prisma
2024-09-20 06:40:52 Datasource "db": PostgreSQL database "postgres", schema "newjitsu" at "jitsu-staging-pg16.xxxxxx.us-east-1.rds.amazonaws.com:5432"
2024-09-20 06:40:56 
2024-09-20 06:40:56 The database is already in sync with the Prisma schema.
2024-09-20 06:40:56 
2024-09-20 06:40:56 Starting the app
2024-09-20 06:40:56 Waiting for localhost:3000 to be up...
2024-09-20 06:40:57 Running healthcheck...
2024-09-20 06:40:59 ❌ ❌ ❌ HEALTHCHECK FAILED - 503 from http://localhost:3000/api/healthcheck. Response:
2024-09-20 06:40:59 {"status":"error","prisma":{"status":"ok","ms":1585},"postgres":{"status":"error"}}
2024-09-20 06:40:59 Killing console with pid 58
2024-09-20 06:40:59 Killed

console log using PG 14 on RDS:

2024-09-20 06:38:48 FORCE_UPDATE_DB is set, updating database schema...
2024-09-20 06:38:50 Prisma schema loaded from schema.prisma
2024-09-20 06:38:50 Datasource "db": PostgreSQL database "postgres", schema "newjitsu" at "jitsu-staging-pg14.xxxxxx.us-east-1.rds.amazonaws.com:5432"
2024-09-20 06:38:53 
2024-09-20 06:38:53 The database is already in sync with the Prisma schema.
2024-09-20 06:38:54 
2024-09-20 06:38:54 Starting the app
2024-09-20 06:38:54 Waiting for localhost:3000 to be up...
2024-09-20 06:38:55 Running healthcheck...
2024-09-20 06:38:56 ⚡️⚡️⚡️ HEALTHCHECK PASSED - 200 from http://localhost:3000/api/healthcheck. Details:
2024-09-20 06:38:56 {"status":"ok","prisma":{"status":"ok","ms":1150},"postgres":{"status":"ok","ms":363}}
2024-09-20 06:38:56 Initializing console...
2024-09-20 06:38:57 {"success":true}

absorbb commented 1 month ago

@johnpitchko could you please try the jitsucom/console:2.8.2 image? It has logging fixed, so the underlying error should become visible.

johnpitchko commented 1 month ago

Thanks for the extra debugging. Using that version of the image, here is the detailed error:

2024-09-21 00:20:34 FORCE_UPDATE_DB is set, updating database schema...
2024-09-21 00:20:35 Prisma schema loaded from schema.prisma
2024-09-21 00:20:35 Datasource "db": PostgreSQL database "postgres", schema "newjitsu" at "xxxxxx.us-east-1.rds.amazonaws.com:5432"
2024-09-21 00:20:39 
2024-09-21 00:20:39 The database is already in sync with the Prisma schema.
2024-09-21 00:20:39 
2024-09-21 00:20:39 Starting the app
2024-09-21 00:20:39 Waiting for localhost:3000 to be up...
2024-09-21 00:20:39   ▲ Next.js 14.2.5
2024-09-21 00:20:39   - Local:        http://[::1]:3000
2024-09-21 00:20:39   - Network:      http://[::]:3000
2024-09-21 00:20:39 
2024-09-21 00:20:39  ✓ Starting...
2024-09-21 00:20:40  ✓ Ready in 433ms
2024-09-21 00:20:40 Running healthcheck...
2024-09-21 00:20:40 2024-09-21 06:20:40.343Z INFO  [db]: Initializing prisma 
2024-09-21 00:20:40 2024-09-21 06:20:40.352Z INFO  [singleton]: ️⚡️⚡️⚡️ prisma connected in 9ms! 
2024-09-21 00:20:40 2024-09-21 06:20:40.354Z INFO  [singleton]: ️⚡️⚡️⚡️ pg connected in 1ms! 
2024-09-21 00:20:40 prisma:info Starting a postgresql pool with 17 connections.
2024-09-21 00:20:41 2024-09-21 06:20:41.655Z ERROR [healthcheck]: Service postgres failed to initialize error: no pg_hba.conf entry for host "70.77.xxx.xxx", user "postgres", database "postgres", no encryption

I then appended sslmode=no-verify to the DATABASE_URL (postgresql://postgres:${POSTGRES_PASSWORD}@${POSTGRES_HOSTNAME}:5432/postgres?schema=newjitsu&sslmode=no-verify) and was able to connect successfully:

2024-09-21 00:23:35 FORCE_UPDATE_DB is set, updating database schema...
2024-09-21 00:23:36 Prisma schema loaded from schema.prisma
2024-09-21 00:23:36 Datasource "db": PostgreSQL database "postgres", schema "newjitsu" at "xxxxxxx.us-east-1.rds.amazonaws.com:5432"
2024-09-21 00:23:40 
2024-09-21 00:23:40 The database is already in sync with the Prisma schema.
2024-09-21 00:23:40 
2024-09-21 00:23:40 Starting the app
2024-09-21 00:23:40 Waiting for localhost:3000 to be up...
2024-09-21 00:23:40   ▲ Next.js 14.2.5
2024-09-21 00:23:40   - Local:        http://[::1]:3000
2024-09-21 00:23:40   - Network:      http://[::]:3000
2024-09-21 00:23:40 
2024-09-21 00:23:40  ✓ Starting...
2024-09-21 00:23:41  ✓ Ready in 439ms
2024-09-21 00:23:41 Running healthcheck...
2024-09-21 00:23:41 2024-09-21 06:23:41.466Z INFO  [db]: Initializing prisma 
2024-09-21 00:23:41 2024-09-21 06:23:41.475Z INFO  [singleton]: ️⚡️⚡️⚡️ prisma connected in 9ms! 
2024-09-21 00:23:41 2024-09-21 06:23:41.477Z INFO  [singleton]: ️⚡️⚡️⚡️ pg connected in 2ms! 
2024-09-21 00:23:41 prisma:info Starting a postgresql pool with 17 connections.
2024-09-21 00:23:43 2024-09-21 06:23:43.188Z INFO  [db]: Connecting new client postgresql://postgres:xxxxxxx@xxxxx.us-east-1.rds.amazonaws.com:5432/postgres?schema=newjitsu&sslmode=no-verify. Pool stat: idle=0, waiting=0, total=2. Default schema: newjitsu 
2024-09-21 00:23:43 2024-09-21 06:23:43.602Z INFO  [db]: Connecting new client postgresql://postgres:xxxxxx@xxxxxx.us-east-1.rds.amazonaws.com:5432/postgres?schema=newjitsu&sslmode=no-verify. Pool stat: idle=1, waiting=0, total=2. Default schema: newjitsu 
2024-09-21 00:23:43 ⚡️⚡️⚡️ HEALTHCHECK PASSED - 200 from http://localhost:3000/api/healthcheck. Details:
2024-09-21 00:23:43 {"status":"ok","prisma":{"status":"ok","ms":1625},"postgres":{"status":"ok","ms":550}}
2024-09-21 00:23:43 Initializing console...
2024-09-21 00:23:43 2024-09-21 06:23:43.823Z INFO  [auth]: Using autogenerated JWT key e1c5ad0fe65af646fdbbc30d869993aa3aab6c7824c78be89d59abcd87681f9f
2024-09-21 00:23:43 2024-09-21 06:23:43.832Z INFO  [singleton]: ️⚡️⚡️⚡️ firebase-service connected in 0ms! 
2024-09-21 00:23:43 2024-09-21 06:23:43.854Z INFO  [events-log-init]: Init events log 
2024-09-21 00:23:43 2024-09-21 06:23:43.871Z INFO  [events-log-init]: Database newjitsu_metrics created or already exists 
2024-09-21 00:23:43 2024-09-21 06:23:43.882Z INFO  [events-log-init]: Table newjitsu_metrics.events_log created or already exists 
2024-09-21 00:23:43 {"success":true}
2024-09-21 00:23:43 Starting cron...

I'm no RDS expert, so there may be a misconfiguration in my RDS instance, but when I created it I pretty much stuck with the standard options, aside from specifying the version and attaching a security group to allow access from my IP address.
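For anyone who builds DATABASE_URL in a deploy script, the workaround above can be sketched as a small helper that sets sslmode without disturbing the other query parameters (such as schema=newjitsu). This is only an illustration: the function name `with_sslmode` and the example host and password are hypothetical, and `no-verify` is assumed to be accepted by the driver because it worked in the logs above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_sslmode(database_url: str, sslmode: str = "no-verify") -> str:
    """Return database_url with sslmode set, keeping other query parameters."""
    parts = urlsplit(database_url)
    # Keep existing parameters (e.g. schema=newjitsu) and drop any earlier sslmode
    # so the parameter appears exactly once in the result.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "sslmode"
    ]
    params.append(("sslmode", sslmode))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical URL in the same shape as the one used in this issue.
url = "postgresql://postgres:secret@db.example.com:5432/postgres?schema=newjitsu"
print(with_sslmode(url))
```

Note that a mode which skips certificate verification still leaves the connection open to impersonation of the server, so a verifying mode with the RDS CA bundle may be preferable where the driver supports it.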

absorbb commented 1 month ago

Since PostgreSQL 15, RDS sets the rds.force_ssl parameter to 1 by default, so unencrypted connections are rejected. That is why sslmode=no-verify helps here.
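This ties back to the error logged earlier in the thread: the trailing "no encryption" in the pg_hba.conf message is how the server reports that the rejected connection attempt was not using SSL. A minimal sketch of recognising that case in a log line follows. The function name `rejected_without_ssl` is hypothetical, and the pattern is based only on the single message shown in this issue.

```python
import re
from typing import Optional

# Matches the rejection message seen in the healthcheck log of this issue.
PG_HBA_ERROR = re.compile(
    r'no pg_hba\.conf entry for host "(?P<host>[^"]+)", '
    r'user "(?P<user>[^"]+)", database "(?P<database>[^"]+)", '
    r"(?P<encryption>.+)"
)


def rejected_without_ssl(message: str) -> Optional[bool]:
    """Return True if the message is a pg_hba.conf rejection of an unencrypted
    connection, False if it is a rejection of another kind, and None if the
    message is not a pg_hba.conf rejection at all."""
    match = PG_HBA_ERROR.search(message)
    if match is None:
        return None
    return match.group("encryption").strip() == "no encryption"


log_line = (
    "Service postgres failed to initialize error: no pg_hba.conf entry for host "
    '"70.77.xxx.xxx", user "postgres", database "postgres", no encryption'
)
print(rejected_without_ssl(log_line))
```

When this returns True against an RDS instance on PostgreSQL 15 or later, adding an sslmode parameter to the connection URL is the first thing to try.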

johnpitchko commented 1 month ago

Yes, that makes sense. I can open a PR to update the documentation, and maybe add a comment in the template explaining this, so future RDS users do not run into trouble. Thoughts?

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 2 weeks ago

This issue was closed because it has been inactive for 14 days since being marked as stale.