PostgresApp / PostgresApp

The easiest way to get started with PostgreSQL on the Mac
https://postgresapp.com

postgres 13.x keeps crashing #616

Closed jgehw closed 3 years ago

jgehw commented 3 years ago

Approximately once a day, postgres crashes on my machine (macOS Big Sur 11.2.1). I was hoping that the recent update to 13.2 would fix this, but it didn't, so it's time to file a bug report.

The most useful information I could find was this line in postgresql.log:

2021-02-21 18:07:37.065 CET [15808] PANIC:  could not open file "pg_wal/00000001000002220000004D": Interrupted system call

which was followed by the inevitable consequences (I omit tons of repeated messages):

2021-02-21 18:07:37.066 CET [15804] LOG:  terminating any other active server processes
2021-02-21 18:07:37.066 CET [15897] WARNING:  terminating connection because of crash of another server process
2021-02-21 18:07:37.066 CET [15897] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2021-02-21 18:07:37.066 CET [15897] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2021-02-21 18:07:44.133 CET [16371] FATAL:  the database system is in recovery mode
2021-02-21 18:07:45.066 CET [15804] LOG:  all server processes terminated; reinitializing
2021-02-21 18:07:45.080 CET [16372] LOG:  database system was interrupted; last known up at 2021-02-21 18:06:00 CET
2021-02-21 18:07:49.336 CET [16372] LOG:  database system was not properly shut down; automatic recovery in progress
2021-02-21 18:07:49.338 CET [16372] LOG:  redo starts at 222/B09A768
2021-02-21 18:07:54.331 CET [16372] LOG:  redo done at 222/4D003530
2021-02-21 18:08:28.879 CET [15804] LOG:  database system is ready to accept connections
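As a rough way to quantify the impact of one crash, the timestamps in the excerpt above can be subtracted. The following is a minimal sketch, not part of the original report; the helper names `parse_timestamp` and `outage_seconds` are made up for illustration, and the parser assumes the timestamp layout seen in this particular log.

```python
from datetime import datetime

# Two lines copied from the log excerpt above.
LOG_LINES = [
    '2021-02-21 18:07:37.065 CET [15808] PANIC:  could not open file '
    '"pg_wal/00000001000002220000004D": Interrupted system call',
    '2021-02-21 18:08:28.879 CET [15804] LOG:  database system is ready '
    'to accept connections',
]


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.mmm' part of a log line."""
    # Assumption: the timestamp occupies the first 23 characters,
    # as it does in the excerpt above.
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")


def outage_seconds(lines):
    """Seconds from the first PANIC line to the next 'ready' line."""
    panic_at = None
    for line in lines:
        if panic_at is None and " PANIC: " in line:
            panic_at = parse_timestamp(line)
        elif panic_at is not None and "ready to accept connections" in line:
            return (parse_timestamp(line) - panic_at).total_seconds()
    return None  # no complete PANIC-to-ready pair found


print(outage_seconds(LOG_LINES))
```

For the two lines shown, this reports an outage of about 51.8 seconds between the PANIC and the server accepting connections again.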

At first I hoped that one of the usual suspects was causing the issue (e.g. antivirus software or a Time Machine backup locking a file). To rule these out, I added exclusions for the postgres database folder. Unfortunately, postgres kept crashing, so the root cause seems to be something else.

I'll attach the report found in the macOS crash reports list (I can't make sense of it, but maybe you can).

Questions:

Process:               postgres [15808]
Path:                  /Applications/Postgres.app/Contents/Versions/13/bin/postgres
Identifier:            postgres
Version:               0
Code Type:             X86-64 (Native)
Parent Process:        postgres [15804]
User ID:               501

Date/Time:             2021-02-21 18:07:37.067 +0100
OS Version:            macOS 11.2.1 (20D74)
Report Version:        12
Anonymous UUID:        <<censored>>

Sleep/Wake UUID:       <<censored>>

Time Awake Since Boot: 770000 seconds
Time Since Wake:       610000 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Application Specific Information:
crashed on child side of fork pre-exec

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib          0x00007fff20304462 __pthread_kill + 10
1   libsystem_pthread.dylib         0x00007fff20332610 pthread_kill + 263
2   libsystem_c.dylib               0x00007fff20285720 abort + 120
3   postgres                        0x000000010b826e18 errfinish + 1096
4   postgres                        0x000000010b235a88 XLogFileInit + 424
5   postgres                        0x000000010b2345ce XLogWrite + 526
6   postgres                        0x000000010b234f98 XLogBackgroundFlush + 792
7   postgres                        0x000000010b5a361e WalWriterMain + 734
8   postgres                        0x000000010b259576 AuxiliaryProcessMain + 1638
9   postgres                        0x000000010b59b5cc StartChildProcess + 268
10  postgres                        0x000000010b59ab57 reaper + 807
11  libsystem_platform.dylib        0x00007fff20376d7d _sigtramp + 29
12  ???                             000000000000000000 0 + 0
13  postgres                        0x000000010b599b09 PostmasterMain + 6025
14  postgres                        0x000000010b48f5b9 main + 761
15  libdyld.dylib                   0x00007fff2034d621 start + 1

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000000  rbx: 0x000000011735fe00  rcx: 0x00007ffee4aa8138  rdx: 0x0000000000000000
  rdi: 0x0000000000000303  rsi: 0x0000000000000006  rbp: 0x00007ffee4aa8160  rsp: 0x00007ffee4aa8138
   r8: 0x00000000000130a8   r9: 0x00007fff8896cda8  r10: 0x000000011735fe00  r11: 0x0000000000000246
  r12: 0x0000000000000303  r13: 0x0000000000000000  r14: 0x0000000000000006  r15: 0x0000000000000016
  rip: 0x00007fff20304462  rfl: 0x0000000000000246  cr2: 0x000000010b9a7ab0

Logical CPU:     0
Error Code:      0x02000148
Trap Number:     133

Thread 0 instruction stream not available.

Thread 0 last branch register state not available.

Binary Images:
       0x10b154000 -        0x10b9f0fff +postgres (0) <E28FBBAC-582F-383F-A95D-2AA601D051EC> /Applications/Postgres.app/Contents/Versions/13/bin/postgres
       0x10bb57000 -        0x10bce9fff +libxml2.2.dylib (0) <0A8F7D64-A95B-329E-A23B-8BDB434618DC> /Applications/Postgres.app/Contents/Versions/13/lib/libxml2.2.dylib
       0x10bd28000 -        0x10bdabfff +libssl.1.1.dylib (0) <F8F8CDCF-E9C2-3FF8-82BD-F40EABDB5C8F> /Applications/Postgres.app/Contents/Versions/13/lib/libssl.1.1.dylib
       0x10bde4000 -        0x10c00dfff +libcrypto.1.1.dylib (0) <1C689B23-802D-3A73-A9A7-B743C4EB8750> /Applications/Postgres.app/Contents/Versions/13/lib/libcrypto.1.1.dylib
       0x10c0b5000 -        0x10c293fff +libicui18n.67.dylib (0) <5A0EC7CA-D749-3FB1-AD98-ED4EA2124EAA> /Applications/Postgres.app/Contents/Versions/13/lib/libicui18n.67.dylib
       0x10c3a3000 -        0x10c4fafff +libicuuc.67.dylib (0) <3CFB4C05-3982-379E-B4DE-6A7FD82110F8> /Applications/Postgres.app/Contents/Versions/13/lib/libicuuc.67.dylib
       0x10c577000 -        0x10e08bfff +libicudata.67.dylib (0) <BB3DFBBF-B04D-349F-9AD7-633166C92E31> /Applications/Postgres.app/Contents/Versions/13/lib/libicudata.67.dylib
       0x117288000 -        0x117323fff  dyld (832.7.3) <0D4EA85F-7E30-338B-9215-314A5A5539B6> /usr/lib/dyld
    0x7fff20067000 -     0x7fff20068fff  libsystem_blocks.dylib (78) <E644CAA0-65B7-36E4-8041-520F3301F3DB> /usr/lib/system/libsystem_blocks.dylib
    0x7fff20069000 -     0x7fff2009efff  libxpc.dylib (2038.80.3) <70F26262-01AA-3CEC-9FAD-2701D24096F0> /usr/lib/system/libxpc.dylib
    0x7fff2009f000 -     0x7fff200b6fff  libsystem_trace.dylib (1277.80.2) <87FEF600-48D9-31C9-B8FC-D5249B2AE95D> /usr/lib/system/libsystem_trace.dylib
    0x7fff200b7000 -     0x7fff20156fff  libcorecrypto.dylib (1000.80.5) <1EB11CFB-ABD7-36DD-97C7-C112A6601416> /usr/lib/system/libcorecrypto.dylib
    0x7fff20157000 -     0x7fff20183fff  libsystem_malloc.dylib (317.40.8) <A498D1EF-E43D-310C-84E8-9C0AADA0C475> /usr/lib/system/libsystem_malloc.dylib
    0x7fff20184000 -     0x7fff201c8fff  libdispatch.dylib (1271.40.12) <AD988EEA-1A2F-3404-9A6E-390FC2504223> /usr/lib/system/libdispatch.dylib
    0x7fff201c9000 -     0x7fff20201fff  libobjc.A.dylib (818.2) <EB6B543F-D42C-3FB2-A2EC-43407C5F80D3> /usr/lib/libobjc.A.dylib
    0x7fff20202000 -     0x7fff20204fff  libsystem_featureflags.dylib (28.60.1) <9CECB43A-094E-3CA9-B730-24DEA1A6DE05> /usr/lib/system/libsystem_featureflags.dylib
    0x7fff20205000 -     0x7fff2028dfff  libsystem_c.dylib (1439.40.11) <4AF71812-4099-3E96-B271-1F259491A2B2> /usr/lib/system/libsystem_c.dylib
    0x7fff2028e000 -     0x7fff202e3fff  libc++.1.dylib (904.4) <B217D905-4F9C-3DE0-8844-88FAA3C2C851> /usr/lib/libc++.1.dylib
    0x7fff202e4000 -     0x7fff202fcfff  libc++abi.dylib (904.4) <3C9FE530-3CD2-3A64-8A36-70816AEBDF0D> /usr/lib/libc++abi.dylib
    0x7fff202fd000 -     0x7fff2032bfff  libsystem_kernel.dylib (7195.81.3) <AB413518-ECDE-3F04-A61C-278D3CF43076> /usr/lib/system/libsystem_kernel.dylib
    0x7fff2032c000 -     0x7fff20337fff  libsystem_pthread.dylib (454.80.2) <B989DF6C-ADFE-3AF9-9C91-07D2521F9E47> /usr/lib/system/libsystem_pthread.dylib
    0x7fff20338000 -     0x7fff20372fff  libdyld.dylib (832.7.3) <4641E48F-75B5-3CC7-8263-47BF79F15394> /usr/lib/system/libdyld.dylib
    0x7fff20373000 -     0x7fff2037cfff  libsystem_platform.dylib (254.80.2) <1C3E1A0A-92A8-3CDE-B622-8940B43A5DF2> /usr/lib/system/libsystem_platform.dylib
    0x7fff2037d000 -     0x7fff203a8fff  libsystem_info.dylib (542.40.3) <0C96CFE8-71F5-3335-8423-581BC3DE5846> /usr/lib/system/libsystem_info.dylib
    0x7fff22797000 -     0x7fff227a0fff  libsystem_darwin.dylib (1439.40.11) <E016D8F7-C87F-36F8-B8A0-6A61B8E4BACB> /usr/lib/system/libsystem_darwin.dylib
    0x7fff22bb2000 -     0x7fff22bbdfff  libsystem_notify.dylib (279.40.4) <B2BF20C7-448A-3FBD-A2F5-AB7618D173F6> /usr/lib/system/libsystem_notify.dylib
    0x7fff24b0e000 -     0x7fff24b1cfff  libsystem_networkextension.dylib (1295.80.3) <5213D866-7D0E-3FD9-8E1A-03C0E39CEC44> /usr/lib/system/libsystem_networkextension.dylib
    0x7fff24b7a000 -     0x7fff24b90fff  libsystem_asl.dylib (385) <5B48071E-85EB-33B0-AE9B-127AEB398AEC> /usr/lib/system/libsystem_asl.dylib
    0x7fff26279000 -     0x7fff26280fff  libsystem_symptoms.dylib (1431.40.36) <BC85B46C-02EE-31FF-861D-F0DE01E8F6CF> /usr/lib/system/libsystem_symptoms.dylib
    0x7fff282c9000 -     0x7fff282d9fff  libsystem_containermanager.dylib (318.80.2) <6F08275F-B912-3D8E-9D74-4845158AE4F3> /usr/lib/system/libsystem_containermanager.dylib
    0x7fff28fd6000 -     0x7fff28fd9fff  libsystem_configuration.dylib (1109.60.2) <4917D824-4DE8-32CC-9ED2-1FBF371FEB9F> /usr/lib/system/libsystem_configuration.dylib
    0x7fff28fda000 -     0x7fff28fdefff  libsystem_sandbox.dylib (1441.60.4) <5F7F3DD1-4B38-310C-AA8F-19FF1B0F5276> /usr/lib/system/libsystem_sandbox.dylib
    0x7fff29ce2000 -     0x7fff29ce4fff  libquarantine.dylib (119.40.2) <40D35D75-524B-3DA6-8159-E7E0FA66F5BC> /usr/lib/system/libquarantine.dylib
    0x7fff2a264000 -     0x7fff2a268fff  libsystem_coreservices.dylib (127) <529A0663-A936-309C-9318-1B04B7F70658> /usr/lib/system/libsystem_coreservices.dylib
    0x7fff2a46c000 -     0x7fff2a47efff  libz.1.dylib (76) <6E2BD7A3-DC55-3183-BBF7-3AC367BC1834> /usr/lib/libz.1.dylib
    0x7fff2a47f000 -     0x7fff2a4c6fff  libsystem_m.dylib (3186.40.2) <DD26CC5C-AFF6-305F-A567-14909DD57163> /usr/lib/system/libsystem_m.dylib
    0x7fff2a4c7000 -     0x7fff2a4c7fff  libcharset.1.dylib (59) <D14F9D24-693A-37F0-8F92-D260248EB282> /usr/lib/libcharset.1.dylib
    0x7fff2a4c8000 -     0x7fff2a4cdfff  libmacho.dylib (973.4) <C2584BC4-497B-3170-ADDF-21B8E10B4DFD> /usr/lib/system/libmacho.dylib
    0x7fff2a4ea000 -     0x7fff2a4f5fff  libcommonCrypto.dylib (60178.40.2) <822A29CE-BF54-35AD-BB15-8FAECB800C7D> /usr/lib/system/libcommonCrypto.dylib
    0x7fff2a4f6000 -     0x7fff2a500fff  libunwind.dylib (200.10) <1D0A4B4A-4370-3548-8DC1-42A7B4BD45D3> /usr/lib/system/libunwind.dylib
    0x7fff2a501000 -     0x7fff2a508fff  liboah.dylib (203.30) <44C477D9-013F-3A6D-A9FE-68A89214E6A5> /usr/lib/liboah.dylib
    0x7fff2a509000 -     0x7fff2a513fff  libcopyfile.dylib (173.40.2) <39DBE613-135B-3AFE-A6AF-7898A37F70C2> /usr/lib/system/libcopyfile.dylib
    0x7fff2a514000 -     0x7fff2a51bfff  libcompiler_rt.dylib (102.2) <62EE1D14-5ED7-3CEC-81C0-9C93833641F1> /usr/lib/system/libcompiler_rt.dylib
    0x7fff2a51c000 -     0x7fff2a51efff  libsystem_collections.dylib (1439.40.11) <C53D5E0C-0C4F-3B35-A24B-E0D7101A3F95> /usr/lib/system/libsystem_collections.dylib
    0x7fff2a51f000 -     0x7fff2a521fff  libsystem_secinit.dylib (87.60.1) <E05E35BC-1BAB-365B-8BEE-D774189EFD3B> /usr/lib/system/libsystem_secinit.dylib
    0x7fff2a522000 -     0x7fff2a524fff  libremovefile.dylib (49.40.3) <5CC12A16-82CB-32F0-9040-72CFC88A48DF> /usr/lib/system/libremovefile.dylib
    0x7fff2a525000 -     0x7fff2a525fff  libkeymgr.dylib (31) <803F6AED-99D5-3CCF-B0FA-361BCF14C8C0> /usr/lib/system/libkeymgr.dylib
    0x7fff2a526000 -     0x7fff2a52dfff  libsystem_dnssd.dylib (1310.80.1) <E0A0CAB3-6779-3C83-AC67-046CFE69F9C2> /usr/lib/system/libsystem_dnssd.dylib
    0x7fff2a52e000 -     0x7fff2a533fff  libcache.dylib (83) <1A98B064-8FED-39CF-BB2E-5BDA1EF5B65A> /usr/lib/system/libcache.dylib
    0x7fff2a534000 -     0x7fff2a535fff  libSystem.B.dylib (1292.60.1) <83503CE0-32B1-36DB-A4F0-3CC6B7BCF50A> /usr/lib/libSystem.B.dylib
    0x7fff2a573000 -     0x7fff2a663fff  libiconv.2.dylib (59) <AD10ECF4-E137-3152-9612-7EC548D919E8> /usr/lib/libiconv.2.dylib
    0x7fff2d96d000 -     0x7fff2d96dfff  liblaunch.dylib (2038.80.3) <C7C51322-8491-3B78-9CFA-2B4753662BCF> /usr/lib/system/liblaunch.dylib
    0x7fff2fe21000 -     0x7fff2fe21fff  libsystem_product_info_filter.dylib (8.40.1) <20310EE6-2C3F-361A-9ECA-4223AFC03B65> /usr/lib/system/libsystem_product_info_filter.dylib

External Modification Summary:
  Calls made by other processes targeting this process:
    task_for_pid: 4
    thread_create: 0
    thread_set_state: 0
  Calls made by this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
  Calls made by all processes on this machine:
    task_for_pid: 3009958
    thread_create: 0
    thread_set_state: 0

VM Region Summary:
ReadOnly portion of Libraries: Total=541.8M resident=0K(0%) swapped_out_or_unallocated=541.8M(100%)
Writable regions: Total=452.7M written=0K(0%) resident=0K(0%) swapped_out=0K(0%) unallocated=452.7M(100%)

                                VIRTUAL   REGION 
REGION TYPE                        SIZE    COUNT (non-coalesced) 
===========                     =======  ======= 
Kernel Alloc Once                    8K        1 
MALLOC                            62.1M       19 
MALLOC guard page                   24K        4 
MALLOC_MEDIUM (reserved)         240.0M        2         reserved VM address space (unallocated)
STACK GUARD                       56.0M        1 
Stack                             8192K        1 
VM_ALLOCATE                      142.2M        4 
__DATA                            1177K       57 
__DATA_CONST                       249K       36 
__DATA_DIRTY                        58K       21 
__LINKEDIT                       492.7M       10 
__OBJC_RO                         60.6M        1 
__OBJC_RW                         2449K        2 
__TEXT                            49.1M       53 
shared memory                       24K        3 
===========                     =======  ======= 
TOTAL                              1.1G      215 
TOTAL, minus reserved VM space   874.5M      215 

tbussmann commented 3 years ago

Thanks for your report. Unfortunately, the issue is already known. It seems to be an upstream issue involving a change Apple made in Big Sur, PostgreSQL, and potentially EndpointSecurity extensions. See #610 for further information and links.

tbussmann commented 3 years ago

@jgehw We've attempted to patch the issue in PostgreSQL. Could you check whether the following build of Postgres.app fixes the issue for you:

https://github.com/PostgresApp/PostgresApp/releases/tag/v2.5beta1

It would also be great if you could check the server log and see if you see any messages like this:

open file "..." failed: ....; retry
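For readers unfamiliar with the failure mode: "Interrupted system call" is the text for the EINTR error code, and a common mitigation is to retry the call instead of treating it as fatal. The sketch below only illustrates that general idea in Python. It is not the actual patch (PostgreSQL is written in C, and the patch is not shown in this thread); `open_with_retry`, `make_flaky_opener`, the retry limit, and the exact message wording are all hypothetical. Because Python 3.5 and later already retry interrupted system calls internally (PEP 475), the interruption is simulated with a fake opener.

```python
import errno
import os


def open_with_retry(path, flags, opener=os.open, max_retries=5, log=print):
    """Call opener(path, flags), retrying when it fails with EINTR.

    Hypothetical helper: name, retry limit and message wording are
    illustrative assumptions, not taken from the real patch.
    """
    for attempt in range(max_retries + 1):
        try:
            return opener(path, flags)
        except InterruptedError as exc:  # OSError subclass for EINTR
            if attempt == max_retries:
                raise  # give up and let the caller see the error
            log(f'open file "{path}" failed: '
                f'{os.strerror(exc.errno)}; retry')


def make_flaky_opener(failures):
    """Build an opener that fails `failures` times with EINTR, then works."""
    state = {"left": failures}

    def flaky(path, flags):
        if state["left"] > 0:
            state["left"] -= 1
            raise InterruptedError(errno.EINTR, os.strerror(errno.EINTR))
        return os.open(path, flags)

    return flaky


# Demo: two simulated interruptions, then a successful open.
fd = open_with_retry(os.devnull, os.O_RDONLY, opener=make_flaky_opener(2))
os.close(fd)
```

In the demo, each simulated interruption produces one log line ending in "; retry" before the open finally succeeds. A retry limit is used here so that a persistent failure still surfaces as an error rather than looping forever.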

jgehw commented 3 years ago

> @jgehw We've attempted to patch the issue in PostgreSQL. Could you check whether the following build of Postgres.app fixes the issue for you:

Wow, many thanks for patching this while upstream is hesitating! I've been testing the beta build for one day now and haven't encountered the crash since.

> It would also be great if you could check the server log and see if you see any messages like this:
>
> open file "..." failed: ....; retry

Surprisingly, I also didn't find any messages like this, and I'm wondering why. Maybe the issue has also been addressed elsewhere (in the meantime I upgraded macOS to 11.3 and, IIRC, my antivirus also auto-updated)? Or is the logging just not working as expected?
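
One way to double-check that no such messages were logged is to scan the server log with a pattern modelled on the quoted line. This is a minimal sketch under assumptions: the regular expression mirrors the quoted message shape, the exact wording in the patched build may differ, `find_retry_messages` is a hypothetical helper name, and the second sample line is invented purely to exercise the pattern.

```python
import re

# Pattern modelled on the message quoted above; the exact wording
# emitted by the patched build is an assumption.
RETRY_RE = re.compile(r'open file ".*" failed: .*; retry')


def find_retry_messages(lines):
    """Return every line that looks like an open-retry message."""
    return [line.rstrip("\n") for line in lines if RETRY_RE.search(line)]


SAMPLE = [
    # Real line from the original report (must not match).
    '2021-02-21 18:07:37.065 CET [15808] PANIC:  could not open file '
    '"pg_wal/00000001000002220000004D": Interrupted system call',
    # Invented line, for illustration only (should match).
    'LOG:  open file "example" failed: Interrupted system call; retry',
]

for hit in find_retry_messages(SAMPLE):
    print(hit)
```

To scan a real log, pass an open file object instead of `SAMPLE`, for example `find_retry_messages(open(log_path))`, where `log_path` points at your postgresql.log.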