supabase / cli

Supabase CLI. Manage postgres migrations, run Supabase locally, deploy edge functions. Postgres backups. Generating types from your database schema.
https://supabase.com/docs/reference/cli/about
MIT License
1.02k stars 199 forks source link

bug: connection reset by peer when `supabase diff` #1324

Closed cutecutecat closed 1 year ago

cutecutecat commented 1 year ago

Describe the bug This bug is hard to reproduce.

Running supabase diff will start a postgres instance as shadow database, and Dial to it then.

However, postgres image will shutdown first at run, and then start, as https://stackoverflow.com/a/37437238 says.

At some machines, supabase cli's Dial will be sent before databse service available, and lead to a connection reset.

To Reproduce This bug could only be reproduced on a small number of devices!

  1. Run supabase diff ...
  2. Crash with connection reset

Expected behavior Command runs gracefully, and output diff schema.

Screenshots If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

Additional context

Log of supabase cli:

Supabase CLI 1.78.0
Connecting to local database...
Creating shadow database...
2023/07/21 15:43:31 Sent Header: Host [docker]
2023/07/21 15:43:31 Sent Header: User-Agent [Docker-Client/unknown-version (linux)]
2023/07/21 15:43:31 Send Done
2023/07/21 15:43:31 Recv First Byte
2023/07/21 15:43:31 Sent Header: Host [docker]
2023/07/21 15:43:31 Sent Header: User-Agent [Docker-Client/unknown-version (linux)]
2023/07/21 15:43:31 Sent Header: Content-Length [312]
2023/07/21 15:43:31 Sent Header: Content-Type [application/json]
2023/07/21 15:43:31 Send Done
2023/07/21 15:43:31 Recv First Byte
2023/07/21 15:43:31 Sent Header: Host [docker]
2023/07/21 15:43:31 Sent Header: User-Agent [Docker-Client/unknown-version (linux)]
2023/07/21 15:43:31 Sent Header: Content-Length [11327]
2023/07/21 15:43:31 Sent Header: Content-Type [application/json]
2023/07/21 15:43:31 Send Done
2023/07/21 15:43:31 Recv First Byte
2023/07/21 15:43:31 Sent Header: Host [docker]
2023/07/21 15:43:31 Sent Header: User-Agent [Docker-Client/unknown-version (linux)]
2023/07/21 15:43:31 Sent Header: Content-Length [0]
2023/07/21 15:43:31 Sent Header: Content-Type [text/plain]
2023/07/21 15:43:31 Send Done
2023/07/21 15:43:33 Recv First Byte
2023/07/21 15:43:33 Connect Start: tcp 127.0.0.1:54320
2023/07/21 15:43:33 Connect Done: tcp 127.0.0.1:54320
2023/07/21 15:43:33 PG Send: {"Type":"StartupMessage","ProtocolVersion":196608,"Parameters":{"database":"postgres","user":"postgres"}}
panic: read tcp 127.0.0.1:41200->127.0.0.1:54320: read: connection reset by peer

goroutine 24 [running]:
github.com/supabase/cli/internal/debug.(*Proxy).DialFunc.func1()
        github.com/supabase/cli/internal/debug/postgres.go:63 +0x445
created by github.com/supabase/cli/internal/debug.(*Proxy).DialFunc
        github.com/supabase/cli/internal/debug/postgres.go:46 +0x152

Log of shadow postgres container:

/usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/migrations

 2023-07-21 07:43:43.471 UTC [57] LOG:  received fast shutdown request
waiting for server to shut down.... 2023-07-21 07:43:43.496 UTC [57] LOG:  aborting any active transactions
 2023-07-21 07:43:43.496 UTC [77] LOG:  pg_cron scheduler shutting down
 2023-07-21 07:43:43.497 UTC [75] FATAL:  terminating background worker "TimescaleDB Background Worker Launcher" due to administrator command
 2023-07-21 07:43:43.500 UTC [57] LOG:  background worker "TimescaleDB Background Worker Launcher" (PID 75) exited with exit code 1
 2023-07-21 07:43:43.500 UTC [57] LOG:  background worker "logical replication launcher" (PID 78) exited with exit code 1
 2023-07-21 07:43:43.501 UTC [70] LOG:  shutting down
 2023-07-21 07:43:43.608 UTC [70] LOG:  checkpoint starting: shutdown immediate
 2023-07-21 07:43:44.385 UTC [70] LOG:  checkpoint complete: wrote 356 buffers (2.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.054 s, sync=0.556 s, total=0.884 s; sync files=123, longest=0.313 s, average=0.005 s; distance=2945 kB, estimate=2945 kB
. 2023-07-21 07:43:44.505 UTC [57] LOG:  database system is shut down
 done
server stopped

PostgreSQL init process complete; ready for start up.

 2023-07-21 07:43:44.614 UTC [14] LOG:  pgaudit extension initialized
 2023-07-21 07:43:44.621 UTC [14] LOG:  pgsodium primary server secret key loaded
 2023-07-21 07:43:44.747 UTC [14] LOG:  starting PostgreSQL 15.1 (Ubuntu 15.1-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
 2023-07-21 07:43:44.748 UTC [14] LOG:  listening on IPv4 address "0.0.0.0", port 5432
 2023-07-21 07:43:44.748 UTC [14] LOG:  listening on IPv6 address "::", port 5432
 2023-07-21 07:43:44.814 UTC [14] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
 2023-07-21 07:43:44.899 UTC [184] LOG:  database system was shut down at 2023-07-21 07:43:44 UTC
 2023-07-21 07:43:44.942 UTC [14] LOG:  database system is ready to accept connections
 2023-07-21 07:43:44.943 UTC [187] LOG:  TimescaleDB background worker launcher connected to shared catalogs
 2023-07-21 07:43:44.958 UTC [189] LOG:  pg_cron scheduler started

We can see service is available at 07:43:44.942 and crash happens at 15:43:33. P.S. The difference of hour display is caused by UTC+8, and could be neglect.

How to fix Maybe some retry-on-failed or service probe technique will help.

cutecutecat commented 1 year ago

Possibly related to #665

unaimillan commented 1 year ago

I am having the same issue. I'm running npx supabase start and then npx supabase db diff inside the Github Codespace (4-core, 16GB RAM, 32GB)

Console output:

@unaimillan ➜ /workspaces/ProStart (dev) $ npx supabase db diff 
Connecting to local database...
Creating shadow database...
failed to connect to `host=localhost user=postgres database=postgres`: failed to receive message (read tcp [::1]:60690->[::1]:54320: read: connection reset by peer)
Try rerunning the command with --debug to troubleshoot the error.

The following attempts to run the diff result in following:

@unaimillan ➜ /workspaces/ProStart (dev) $ npx supabase db diff --debug
Supabase CLI 1.83.7
Connecting to local database...
2023/08/08 15:20:50 Connect Start: tcp 127.0.0.1:54322
2023/08/08 15:20:50 Connect Done: tcp 127.0.0.1:54322
other logs

``` 2023/08/08 15:20:50 PG Send: {"Type":"StartupMessage","ProtocolVersion":196608,"Parameters":{"database":"postgres","user":"postgres"}} 2023/08/08 15:20:50 PG Recv: {"Type":"AuthenticationSASL","AuthMechanisms":["SCRAM-SHA-256"]} 2023/08/08 15:20:50 PG Send: {"Type":"SASLInitialResponse","AuthMechanism":"SCRAM-SHA-256","Data":"n,,n=,r=cqLkD34PlEqLY9G5Gz/jwEZa"} 2023/08/08 15:20:50 PG Recv: {"Type":"AuthenticationSASLContinue","Data":"r=cqLkD34PlEqLY9G5Gz/jwEZaR2eEPYnp3ZPe5uNwwBGHnFHf,s=xcSazfYEPnP9sIlODnoh9w==,i=4096"} 2023/08/08 15:20:50 PG Send: {"Type":"SASLResponse","Data":"c=biws,r=cqLkD34PlEqLY9G5Gz/jwEZaR2eEPYnp3ZPe5uNwwBGHnFHf,p=wh5Gh0VNcAGJCkqxC49YKFFeRYyI6QRwDZ/tMeVQe6s="} 2023/08/08 15:20:50 PG Recv: {"Type":"AuthenticationSASLFinal","Data":"v=UP4XzPaA8bds5A5FF35KAfE/Qc8YpYq3/o5GjWZVavY="} 2023/08/08 15:20:50 PG Recv: {"Type":"AuthenticationOK"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"application_name","Value":""} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"client_encoding","Value":"UTF8"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"DateStyle","Value":"ISO, MDY"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"default_transaction_read_only","Value":"off"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"in_hot_standby","Value":"off"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"integer_datetimes","Value":"on"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"IntervalStyle","Value":"postgres"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"is_superuser","Value":"off"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"server_encoding","Value":"UTF8"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"server_version","Value":"15.1 (Ubuntu 15.1-1.pgdg20.04+1)"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"session_authorization","Value":"postgres"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"standard_conforming_strings","Value":"on"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterStatus","Name":"TimeZone","Value":"UTC"} 2023/08/08 15:20:50 PG Recv: {"Type":"BackendKeyData","ProcessID":11316,"SecretKey":3299768426} 2023/08/08 15:20:50 PG Recv: {"Type":"ReadyForQuery","TxStatus":"I"} 2023/08/08 15:20:50 PG Send: {"Type":"Parse","Name":"lrupsc_1_0","Query":"SELECT schema_name FROM information_schema.schemata WHERE NOT schema_name LIKE ANY($1) ORDER BY schema_name","ParameterOIDs":null} 2023/08/08 15:20:50 PG Send: {"Type":"Describe","ObjectType":"S","Name":"lrupsc_1_0"} 2023/08/08 15:20:50 PG Send: {"Type":"Sync"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParseComplete"} 2023/08/08 15:20:50 PG Recv: {"Type":"ParameterDescription","ParameterOIDs":[1009]} 2023/08/08 15:20:50 PG Recv: {"Type":"RowDescription","Fields":[{"Name":"schema_name","TableOID":13330,"TableAttributeNumber":2,"DataTypeOID":19,"DataTypeSize":64,"TypeModifier":-1,"Format":0}]} 2023/08/08 15:20:50 PG Recv: {"Type":"ReadyForQuery","TxStatus":"I"} 2023/08/08 15:20:50 PG Send: {"Type":"Bind","DestinationPortal":"","PreparedStatement":"lrupsc_1_0","ParameterFormatCodes":[1],"Parameters":[{"binary":"00000001000000000000001900000018000000010000000461757468000000097067626f756e636572000000087265616c74696d650000000a5c5f7265616c74696d650000000773746f726167650000000b5c5f616e616c79746963730000001373757061626173655c5f66756e6374696f6e730000001473757061626173655c5f6d6967726174696f6e7300000013696e666f726d6174696f6e5c5f736368656d610000000570675c5f250000000463726f6e000000076772617068716c0000000f6772617068716c5c5f7075626c6963000000036e6574000000087067736f6469756d0000000f7067736f6469756d5c5f6d61736b73000000057067746c650000000672657061636b0000000574696765720000000b74696765725c5f646174610000000e74696d657363616c6564625c5f25000000105c5f74696d657363616c6564625c5f2500000008746f706f6c6f6779000000057661756c74"}],"ResultFormatCodes":[1]} 2023/08/08 15:20:50 PG Send: {"Type":"Describe","ObjectType":"P","Name":""} 2023/08/08 15:20:50 PG Send: {"Type":"Execute","Portal":"","MaxRows":0} 2023/08/08 15:20:50 PG Send: {"Type":"Sync"} 2023/08/08 15:20:50 PG Recv: {"Type":"BindComplete"} 2023/08/08 15:20:50 PG Recv: {"Type":"RowDescription","Fields":[{"Name":"schema_name","TableOID":13330,"TableAttributeNumber":2,"DataTypeOID":19,"DataTypeSize":64,"TypeModifier":-1,"Format":1}]} 2023/08/08 15:20:50 PG Recv: {"Type":"DataRow","Values":[{"text":"extensions"}]} 2023/08/08 15:20:50 PG Recv: {"Type":"DataRow","Values":[{"text":"public"}]} 2023/08/08 15:20:50 PG Recv: {"Type":"CommandComplete","CommandTag":"SELECT 2"} 2023/08/08 15:20:50 PG Recv: {"Type":"ReadyForQuery","TxStatus":"I"} Creating shadow database... 2023/08/08 15:20:50 Sent Header: Host [api.moby.localhost] 2023/08/08 15:20:50 Sent Header: User-Agent [Docker-Client/unknown-version (linux)] 2023/08/08 15:20:50 Sent Header: X-Meta-Source-Client [github/codespaces] 2023/08/08 15:20:50 Send Done 2023/08/08 15:20:50 PG Send: {"Type":"Terminate"} 2023/08/08 15:20:50 Recv First Byte 2023/08/08 15:20:50 Sent Header: Host [api.moby.localhost] 2023/08/08 15:20:50 Sent Header: User-Agent [Docker-Client/unknown-version (linux)] 2023/08/08 15:20:50 Sent Header: Content-Length [306] 2023/08/08 15:20:50 Sent Header: Content-Type [application/json] 2023/08/08 15:20:50 Sent Header: X-Meta-Source-Client [github/codespaces] 2023/08/08 15:20:50 Send Done 2023/08/08 15:20:50 Recv First Byte 2023/08/08 15:20:50 Sent Header: Host [api.moby.localhost] 2023/08/08 15:20:50 Sent Header: User-Agent [Docker-Client/unknown-version (linux)] 2023/08/08 15:20:50 Sent Header: Content-Length [11322] 2023/08/08 15:20:50 Sent Header: Content-Type [application/json] 2023/08/08 15:20:50 Sent Header: X-Meta-Source-Client [github/codespaces] 2023/08/08 15:20:50 Send Done 2023/08/08 15:20:51 Recv First Byte ```

2023/08/08 15:20:51 Sent Header: Host [api.moby.localhost]
2023/08/08 15:20:51 Sent Header: User-Agent [Docker-Client/unknown-version (linux)]
2023/08/08 15:20:51 Sent Header: Content-Length [0]
2023/08/08 15:20:51 Sent Header: Content-Type [text/plain]
2023/08/08 15:20:51 Sent Header: X-Meta-Source-Client [github/codespaces]
2023/08/08 15:20:51 Send Done
2023/08/08 15:20:51 Recv First Byte
Error response from daemon: driver failed programming external connectivity on endpoint pensive_lewin (f98839c84ac52a2f15c6481cc5a6bb29f8d64d73cfdb0b4ef0bd2efb6674c114): Bind for 0.0.0.0:54320 failed: port is already allocated
unaimillan commented 1 year ago

@cutecutecat Have you managed to solve the issue?

unaimillan commented 1 year ago

Also small question or proposal (probably to @sweatybridge if related): Is it possible and makes any sense to name the Shadow Database container in the same fashion as all other containers like supabase_shadowdb_<folder>?

unaimillan commented 1 year ago

As a temporary workaround for my gh codespace, I have done the following:

  1. Installed the migra tool into the VM with pip install psycopg2-binary and pip install migra
  2. Recreated the shadow database with docker run --rm -it --name supa_shadow -e POSTGRES_PASSWORD=postgres -p 54320:5432 supabase/postgres:15.1.0.103 (in the separate terminal)
  3. Applied all migrations to the shadow database with cat supabase/migrations/* | docker exec -i supa_shadow psql -U postgres
  4. Run the migra tool to get the migration plan with migra --schema public postgresql://postgres:postgres@localhost:54320/postgres postgresql://postgres:postgres@localhost:54322/postgres
xieydd commented 1 year ago

@unaimillan Awesome, it works.

sweatybridge commented 1 year ago

I can reproduce this error in GitHub Codespaces. I will fix it in the next release by waiting for shadow database health check.

Is it possible and makes any sense to name the Shadow Database container in the same fashion as all other containers like supabaseshadowdb?

It is possible to use a predictable name but it means users will lose the ability to run multiple db diff commands at the same time to check against different remote in CI. I would avoid breaking this use case if possible.