mikeizbicki / cmc-csci143

big data course materials
40 stars 77 forks source link

Can't bring down containers [did not receive an exit event] #543

Open ains-arch opened 1 month ago

ains-arch commented 1 month ago

Hi I'm having trouble bringing down the containers before closing the terminal window. I think I brought them down fine the last time I finished working, but now I'm getting errors. I think all I did this session was add indexes to the normalized database through the postgres cli.

docker-compose down
/home/aarchib/.local/lib/python3.6/site-packages/paramiko/transport.py:32: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography. The next release of cryptography will remove support for Python 3.6.
  from cryptography.hazmat.backends import default_backend
Stopping twitter_postgres_indexes_pg_denormalized_1     ... error
Stopping twitter_postgres_indexes_pg_normalized_batch_1 ... error

ERROR: for twitter_postgres_indexes_pg_denormalized_1  cannot stop container: 41ec3e52e352aa2381f54c02dac447be1b817d774b4cd946d6bec6104a15838a: tried to kill container, but did not receive an exit event

ERROR: for twitter_postgres_indexes_pg_normalized_batch_1  cannot stop container: 19c30bd13f377deda4dc2cc4483382a61709e5030e5a30712adba80d02d034a7: tried to kill container, but did not receive an exit event
Removing network twitter_postgres_indexes_default
ERROR: error while removing network: network twitter_postgres_indexes_default id 042398a87e1abb4bd9ed62268d33fffb66a3e2e1a4888d4041cd1d0933cd7c8d has active endpoints

After I ran that, I checked the containers ->

docker ps -a
CONTAINER ID   IMAGE                                          COMMAND                  CREATED       STATUS                       PORTS     NAMES
41ec3e52e352   twitter_postgres_indexes_pg_denormalized       "docker-entrypoint.s…"   3 hours ago   Exited (137) 2 minutes ago             twitter_postgres_indexes_pg_denormalized_1
19c30bd13f37   twitter_postgres_indexes_pg_normalized_batch   "docker-entrypoint.s…"   3 hours ago   Exited (137) 2 minutes ago             twitter_postgres_indexes_pg_normalized_batch_1

Can I close my laptop?

ains-arch commented 1 month ago

Here's some more information

denormalized

docker logs 41ec3e52e352

PostgreSQL Database directory appears to contain a database; Skipping initialization

2024-04-29 22:11:21.085 UTC [1] LOG:  starting PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2) on x86_64-pc-                                                                                                           linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2024-04-29 22:11:21.086 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-04-29 22:11:21.086 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2024-04-29 22:11:23.909 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-04-29 22:15:06.188 UTC [27] LOG:  database system was interrupted; last known up at 2024-04-29 20:2                                                                                                           2:53 UTC
2024-04-29 22:19:23.048 UTC [27] LOG:  database system was not properly shut down; automatic recovery in                                                                                                            progress
2024-04-29 22:19:30.066 UTC [27] LOG:  redo starts at 10/E210B2B0
2024-04-29 22:19:30.066 UTC [27] LOG:  invalid record length at 10/E210B398: wanted 24, got 0
2024-04-29 22:19:30.066 UTC [27] LOG:  redo done at 10/E210B360
2024-04-29 22:19:32.615 UTC [1] LOG:  received fast shutdown request

PostgreSQL Database directory appears to contain a database; Skipping initialization

2024-04-29 22:25:07.250 UTC [1] LOG:  starting PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2) on x86_64-pc-                                                                                                           linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2024-04-29 22:25:07.250 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-04-29 22:25:07.250 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2024-04-29 22:25:07.252 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-04-29 22:25:07.834 UTC [27] LOG:  database system was interrupted while in recovery at 2024-04-29 2                                                                                                           2:19:23 UTC
2024-04-29 22:25:07.834 UTC [27] HINT:  This probably means that some data is corrupted and you will hav                                                                                                           e to use the last backup for recovery.
2024-04-29 22:25:07.958 UTC [27] LOG:  database system was not properly shut down; automatic recovery in                                                                                                            progress
2024-04-29 22:25:07.960 UTC [27] LOG:  redo starts at 10/E210B2B0
2024-04-29 22:25:07.960 UTC [27] LOG:  invalid record length at 10/E210B398: wanted 24, got 0
2024-04-29 22:25:07.960 UTC [27] LOG:  redo done at 10/E210B360
2024-04-29 22:25:08.010 UTC [1] LOG:  database system is ready to accept connections
2024-04-30 00:11:48.855 UTC [31] LOG:  using stale statistics instead of current ones because stats coll                                                                                                           ector is not responding
2024-04-30 00:15:18.055 UTC [31] LOG:  using stale statistics instead of current ones because stats coll                                                                                                           ector is not responding
2024-04-30 00:57:52.462 UTC [1] LOG:  received fast shutdown request

normalized

docker logs 19c30bd13f37

PostgreSQL Database directory appears to contain a database; Skipping initialization

2024-04-29 22:11:21.080 UTC [1] LOG:  starting PostgreSQL 16.2 (Debian 16.2-1.pgdg110+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2024-04-29 22:11:21.081 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-04-29 22:11:21.081 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2024-04-29 22:11:23.909 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-04-29 22:15:06.190 UTC [75] LOG:  database system was interrupted; last known up at 2024-04-29 21:00:28 UTC
2024-04-29 22:15:06.191 UTC [76] FATAL:  the database system is starting up
2024-04-29 22:15:06.191 UTC [76] LOG:  could not send data to client: Broken pipe
2024-04-29 22:15:06.191 UTC [77] FATAL:  the database system is starting up
2024-04-29 22:15:06.230 UTC [81] FATAL:  the database system is starting up
2024-04-29 22:15:06.292 UTC [85] FATAL:  the database system is starting up
2024-04-29 22:15:06.328 UTC [89] FATAL:  the database system is starting up
2024-04-29 22:15:06.364 UTC [93] FATAL:  the database system is starting up
2024-04-29 22:17:06.825 UTC [75] LOG:  syncing data directory (fsync), elapsed time: 120.63 s, current path: ./base/17502/2840_fsm
2024-04-29 22:17:30.832 UTC [75] LOG:  syncing data directory (fsync), elapsed time: 144.63 s, current path: ./base/1/2840_fsm
2024-04-29 22:17:51.447 UTC [75] LOG:  syncing data directory (fsync), elapsed time: 165.25 s, current path: ./base/5/2840_fsm
2024-04-29 22:18:15.495 UTC [75] LOG:  syncing data directory (fsync), elapsed time: 189.30 s, current path: ./base
2024-04-29 22:18:27.827 UTC [75] LOG:  syncing data directory (fsync), elapsed time: 201.63 s, current path: ./pg_replslot
2024-04-29 22:19:22.131 UTC [75] LOG:  syncing data directory (fsync), elapsed time: 255.93 s, current path: ./pg_notify
2024-04-29 22:19:22.769 UTC [75] LOG:  database system was not properly shut down; automatic recovery in progress
2024-04-29 22:19:30.194 UTC [75] LOG:  redo starts at 6/C8A5FE28
2024-04-29 22:19:32.616 UTC [1] LOG:  received fast shutdown request
2024-04-29 22:19:40.213 UTC [75] LOG:  redo in progress, elapsed time: 10.01 s, current LSN: 6/CC0B3BE8
2024-04-29 22:19:48.025 UTC [75] LOG:  invalid record length at 6/CE4F8FE8: expected at least 24, got 0
2024-04-29 22:19:48.025 UTC [75] LOG:  redo done at 6/CE4F8FB0 system usage: CPU: user: 0.06 s, system: 0.11 s, elapsed: 17.83 s
2024-04-29 22:21:06.662 UTC [75] LOG:  resetting unlogged relations (init), elapsed time: 78.63 s, current path: base/1

PostgreSQL Database directory appears to contain a database; Skipping initialization

2024-04-29 22:25:10.127 UTC [1] LOG:  starting PostgreSQL 16.2 (Debian 16.2-1.pgdg110+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2024-04-29 22:25:10.127 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-04-29 22:25:10.127 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2024-04-29 22:25:10.128 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-04-29 22:25:10.240 UTC [29] LOG:  database system was interrupted while in recovery at 2024-04-29 22:19:23 UTC
2024-04-29 22:25:10.240 UTC [29] HINT:  This probably means that some data is corrupted and you will have to use the last backup for recovery.
2024-04-29 22:25:10.258 UTC [29] LOG:  database system was not properly shut down; automatic recovery in progress
2024-04-29 22:25:10.260 UTC [29] LOG:  redo starts at 6/C8A5FE28
2024-04-29 22:25:10.411 UTC [29] LOG:  invalid record length at 6/CE4F8FE8: expected at least 24, got 0
2024-04-29 22:25:10.411 UTC [29] LOG:  redo done at 6/CE4F8FB0 system usage: CPU: user: 0.06 s, system: 0.08 s, elapsed: 0.15 s
2024-04-29 22:25:11.011 UTC [27] LOG:  checkpoint starting: end-of-recovery immediate wait
2024-04-29 22:25:44.271 UTC [36] FATAL:  the database system is not yet accepting connections
2024-04-29 22:25:44.271 UTC [36] DETAIL:  Consistent recovery state has not been yet reached.
2024-04-29 22:25:59.210 UTC [27] LOG:  checkpoint complete: wrote 12755 buffers (77.9%); 0 WAL file(s) added, 6 removed, 0 recycled; write=4.183 s, sync=9.569 s, total=48.200 s; sync files=4, longest=9.569 s, average=2.393 s; distance=92772 kB, estimate=92772 kB; lsn=6/CE4F8FE8, redo lsn=6/CE4F8FE8
2024-04-29 22:26:01.943 UTC [1] LOG:  database system is ready to accept connections
2024-04-29 22:35:11.293 UTC [27] LOG:  checkpoint starting: time
2024-04-29 22:35:25.057 UTC [27] LOG:  checkpoint complete: wrote 14 buffers (0.1%); 0 WAL file(s) added, 7 removed, 0 recycled; write=1.883 s, sync=0.310 s, total=13.764 s; sync files=15, longest=0.308 s, average=0.021 s; distance=112841 kB, estimate=112841 kB; lsn=6/D532B558, redo lsn=6/D532B520
2024-04-29 22:45:11.197 UTC [27] LOG:  checkpoint starting: time
2024-04-29 22:45:32.459 UTC [27] LOG:  checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.341 s, sync=12.350 s, total=21.263 s; sync files=2, longest=12.349 s, average=6.175 s; distance=8 kB, estimate=101558 kB; lsn=6/D532D748, redo lsn=6/D532D710
2024-04-29 22:55:11.951 UTC [27] LOG:  checkpoint starting: time
2024-04-29 23:00:25.696 UTC [27] LOG:  checkpoint complete: wrote 3701 buffers (22.6%); 0 WAL file(s) added, 0 removed, 0 recycled; write=215.141 s, sync=43.701 s, total=313.746 s; sync files=16, longest=43.701 s, average=2.732 s; distance=65 kB, estimate=91408 kB; lsn=6/D533DEE0, redo lsn=6/D533DE78
2024-04-29 23:17:47.784 UTC [27] LOG:  checkpoint starting: wal
2024-04-29 23:19:31.190 UTC [27] LOG:  checkpoint complete: wrote 5045 buffers (30.8%); 0 WAL file(s) added, 5 removed, 28 recycled; write=20.985 s, sync=35.836 s, total=103.406 s; sync files=3, longest=18.616 s, average=11.946 s; distance=541880 kB, estimate=541880 kB; lsn=7/15457280, redo lsn=6/F646C058
2024-04-29 23:19:31.191 UTC [27] LOG:  checkpoint starting: wal
2024-04-29 23:20:17.445 UTC [27] LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 23 removed, 33 recycled; write=25.053 s, sync=7.559 s, total=46.255 s; sync files=2, longest=5.505 s, average=3.779 s; distance=920068 kB, estimate=920068 kB; lsn=7/4C774438, redo lsn=7/2E6ED388
2024-04-29 23:44:13.337 UTC [27] LOG:  checkpoint starting: wal
2024-04-29 23:48:26.704 UTC [27] LOG:  checkpoint complete: wrote 7857 buffers (48.0%); 0 WAL file(s) added, 3 removed, 30 recycled; write=155.594 s, sync=24.867 s, total=253.367 s; sync files=17, longest=24.867 s, average=1.463 s; distance=537672 kB, estimate=881829 kB; lsn=7/6E403A38, redo lsn=7/4F3FF660
2024-04-29 23:48:26.704 UTC [27] LOG:  checkpoint starting: wal
2024-04-29 23:52:48.991 UTC [27] LOG:  checkpoint complete: wrote 14 buffers (0.1%); 0 WAL file(s) added, 0 removed, 34 recycled; write=2.364 s, sync=2.898 s, total=262.287 s; sync files=14, longest=1.911 s, average=0.207 s; distance=559642 kB, estimate=849610 kB; lsn=7/7EA9A518, redo lsn=7/71686228
2024-04-29 23:53:27.090 UTC [27] LOG:  checkpoint starting: time
2024-04-29 23:56:43.066 UTC [27] LOG:  checkpoint complete: wrote 1 buffers (0.0%); 0 WAL file(s) added, 0 removed, 13 recycled; write=39.166 s, sync=0.001 s, total=195.977 s; sync files=4, longest=0.001 s, average=0.001 s; distance=217174 kB, estimate=786366 kB; lsn=7/7EA9BAA0, redo lsn=7/7EA9BA68
2024-04-30 00:03:27.216 UTC [27] LOG:  checkpoint starting: time
2024-04-30 00:04:57.122 UTC [27] LOG:  checkpoint complete: wrote 14 buffers (0.1%); 0 WAL file(s) added, 0 removed, 4 recycled; write=1.103 s, sync=38.048 s, total=89.906 s; sync files=15, longest=16.171 s, average=2.537 s; distance=58492 kB, estimate=713579 kB; lsn=7/994665D0, redo lsn=7/823BACC0
2024-04-30 00:05:07.815 UTC [27] LOG:  checkpoint starting: wal
2024-04-30 00:05:07.828 UTC [27] LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 33 recycled; write=0.002 s, sync=0.001 s, total=0.014 s; sync files=0, longest=0.000 s, average=0.000 s; distance=537479 kB, estimate=695969 kB; lsn=7/A30C4D38, redo lsn=7/A309CC30
2024-04-30 00:10:07.928 UTC [27] LOG:  checkpoint starting: time
2024-04-30 00:16:09.940 UTC [27] LOG:  checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 17 recycled; write=83.961 s, sync=1.113 s, total=362.013 s; sync files=2, longest=1.113 s, average=0.557 s; distance=286336 kB, estimate=655006 kB; lsn=7/B483E3F0, redo lsn=7/B483CE58
2024-04-30 00:57:52.464 UTC [1] LOG:  received fast shutdown request
2024-04-30 00:59:13.597 UTC [38] WARNING:  autovacuum worker took too long to start; canceled

The tests were still passing the last time I ran them, even though bringing the containers up was suspect [took longer than it seems like it should have for the ports to become active], so I feel like.. the database isn't corrupted...? But I really don't know how to check that.

mikeizbicki commented 1 month ago

It shouldn't be necessary to ever stop the containers for the indexes assignment if you're not trying to delete the database. Also, the database should never corrupt if you're only running commands that interface with postgres directly due to postgres's ACID guarantees. (The only time I've ever seen a student's database get corrupted is when they've done the rm $PGDATA command and the command fails for some reason. This command is outside of postgres, and so postgres's ACID guarantees do not apply.)

But to answer your question, you can bring down all running containers using the commands

$ docker stop $(docker ps -qa)
$ docker rm $(docker ps -qa)

These commands can usually bring down contains even when docker-compose down is failing for some reason. If those commands also aren't working, then docker kill will be guaranteed to bring the container down.

ains-arch commented 1 month ago

So, if I need to close my terminal windows because I need to pause working but not delete my data, I shouldn't run docker-compose down first? I've heard some people are doing that and I have been and it doesn't seem like it's broken anything yet but I don't know. I guess I'm unclear on what the procedure should be when starting and stopping work in multiple sessions.