freswa / dovecot-xaps-daemon

MIT License
51 stars 11 forks source link

fatal error: concurrent map read and map write #26

Closed Fogelholk closed 1 year ago

Fogelholk commented 1 year ago

Once every couple of days I get the following error, which results in xapsd no longer responding. The stacktrace is completely spammed in the logs, since there are quite a few requests every second from dovecot nodes, until the service is restarted. Didn't know how much of the stacktrace is useful for you, but here's a bit of it at least.

I'm running on FreeBSD 13.1 with the tag f6d5733 checked out.

2023-04-27T11:15:32.398494+02:00 s972 xapsd[46491] time="2023-04-27T11:15:32+02:00" level=info msg="Apple returned 410 for notification to XXX / YYY"
2023-04-27T11:15:32.409280+02:00 s972 xapsd[46491] time="2023-04-27T11:15:32+02:00" level=info msg="Deleting YYY"
2023-04-27T11:15:32.409287+02:00 s972 xapsd[46491] fatal error: concurrent map read and map write
2023-04-27T11:15:32.412482+02:00 s972 xapsd[46491]
2023-04-27T11:15:32.412492+02:00 s972 xapsd[46491] goroutine 17661400 [running]:
2023-04-27T11:15:32.412497+02:00 s972 xapsd[46491] github.com/freswa/dovecot-xaps-daemon/internal/database.(*Database).UserExists(...)
2023-04-27T11:15:32.412502+02:00 s972 xapsd[46491]  github.com/freswa/dovecot-xaps-daemon/internal/database/database.go:286
2023-04-27T11:15:32.412516+02:00 s972 xapsd[46491] github.com/freswa/dovecot-xaps-daemon/internal.(*httpHandler).handleNotify(0xc006fb7de0, {0x8f8fd0, 0xc022a84a80}, 0xc022a90700, {0xc006fd8500?, 0x0?, 0xc00cc42e8e?})
2023-04-27T11:15:32.412523+02:00 s972 xapsd[46491]  github.com/freswa/dovecot-xaps-daemon/internal/socket.go:168 +0x77f
2023-04-27T11:15:32.412542+02:00 s972 xapsd[46491] github.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc006fa7f20, {0x8f8fd0, 0xc022a84a80}, 0xc022a90700)
2023-04-27T11:15:32.412547+02:00 s972 xapsd[46491]  github.com/julienschmidt/httprouter/router.go:387 +0x81c
2023-04-27T11:15:32.412559+02:00 s972 xapsd[46491] net/http.serverHandler.ServeHTTP({0x8f7d50?}, {0x8f8fd0, 0xc022a84a80}, 0xc022a90700)
2023-04-27T11:15:32.412564+02:00 s972 xapsd[46491]  net/http/server.go:2936 +0x316
2023-04-27T11:15:32.412567+02:00 s972 xapsd[46491] net/http.(*conn).serve(0xc01208b560, {0x8f91f8, 0xc006fdcd20})
2023-04-27T11:15:32.412570+02:00 s972 xapsd[46491]  net/http/server.go:1995 +0x612
2023-04-27T11:15:32.412575+02:00 s972 xapsd[46491] created by net/http.(*Server).Serve
2023-04-27T11:15:32.412583+02:00 s972 xapsd[46491]  net/http/server.go:3089 +0x5ed
2023-04-27T11:15:32.412589+02:00 s972 xapsd[46491]
2023-04-27T11:15:32.412592+02:00 s972 xapsd[46491] goroutine 1 [IO wait]:
2023-04-27T11:15:32.412595+02:00 s972 xapsd[46491] internal/poll.runtime_pollWait(0x82d253a98, 0x72)
2023-04-27T11:15:32.412598+02:00 s972 xapsd[46491]  runtime/netpoll.go:306 +0x89
2023-04-27T11:15:32.412603+02:00 s972 xapsd[46491] internal/poll.(*pollDesc).wait(0xc0000da800?, 0x4b906c?, 0x0)
2023-04-27T11:15:32.412610+02:00 s972 xapsd[46491]  internal/poll/fd_poll_runtime.go:84 +0x32
2023-04-27T11:15:32.412614+02:00 s972 xapsd[46491] internal/poll.(*pollDesc).waitRead(...)
2023-04-27T11:15:32.412618+02:00 s972 xapsd[46491]  internal/poll/fd_poll_runtime.go:89
2023-04-27T11:15:32.412622+02:00 s972 xapsd[46491] internal/poll.(*FD).Accept(0xc0000da800)
2023-04-27T11:15:32.412625+02:00 s972 xapsd[46491]  internal/poll/fd_unix.go:614 +0x2bd
2023-04-27T11:15:32.412632+02:00 s972 xapsd[46491] net.(*netFD).accept(0xc0000da800)
2023-04-27T11:15:32.412642+02:00 s972 xapsd[46491]  net/fd_unix.go:172 +0x35
2023-04-27T11:15:32.412645+02:00 s972 xapsd[46491] net.(*TCPListener).accept(0xc006fccb10)
2023-04-27T11:15:32.412648+02:00 s972 xapsd[46491]  net/tcpsock_posix.go:148 +0x25
2023-04-27T11:15:32.412652+02:00 s972 xapsd[46491] net.(*TCPListener).Accept(0xc006fccb10)
2023-04-27T11:15:32.412655+02:00 s972 xapsd[46491]  net/tcpsock.go:297 +0x3d
2023-04-27T11:15:32.412665+02:00 s972 xapsd[46491] net/http.(*Server).Serve(0xc006fec000, {0x8f8df0, 0xc006fccb10})
2023-04-27T11:15:32.412669+02:00 s972 xapsd[46491]  net/http/server.go:3059 +0x385
2023-04-27T11:15:32.412672+02:00 s972 xapsd[46491] net/http.(*Server).ListenAndServe(0xc006fec000)
2023-04-27T11:15:32.412676+02:00 s972 xapsd[46491]  net/http/server.go:2988 +0x7d
2023-04-27T11:15:32.412685+02:00 s972 xapsd[46491] net/http.ListenAndServe(...)
2023-04-27T11:15:32.412688+02:00 s972 xapsd[46491]  net/http/server.go:3242
2023-04-27T11:15:32.412692+02:00 s972 xapsd[46491] github.com/freswa/dovecot-xaps-daemon/internal.NewHttpSocket(0xc000000180, 0xc001d8e060, 0xc0000a63c0)
2023-04-27T11:15:32.412696+02:00 s972 xapsd[46491]  github.com/freswa/dovecot-xaps-daemon/internal/socket.go:52 +0x285
2023-04-27T11:15:32.412700+02:00 s972 xapsd[46491] main.main()
2023-04-27T11:15:32.412704+02:00 s972 xapsd[46491]  github.com/freswa/dovecot-xaps-daemon/cmd/xapsd/xapsd.go:68 +0x2c9
2023-04-27T11:15:32.412708+02:00 s972 xapsd[46491]
2023-04-27T11:15:32.412712+02:00 s972 xapsd[46491] goroutine 50 [chan receive, 136 minutes]:
2023-04-27T11:15:32.412715+02:00 s972 xapsd[46491] github.com/freswa/dovecot-xaps-daemon/internal/database.NewDatabase.func1()
2023-04-27T11:15:32.412718+02:00 s972 xapsd[46491]  github.com/freswa/dovecot-xaps-daemon/internal/database/database.go:105 +0x65
2023-04-27T11:15:32.412725+02:00 s972 xapsd[46491] created by github.com/freswa/dovecot-xaps-daemon/internal/database.NewDatabase
2023-04-27T11:15:32.412728+02:00 s972 xapsd[46491]  github.com/freswa/dovecot-xaps-daemon/internal/database/database.go:104 +0x24d
2023-04-27T11:15:32.412731+02:00 s972 xapsd[46491]
2023-04-27T11:15:32.412734+02:00 s972 xapsd[46491] goroutine 51 [select]:
2023-04-27T11:15:32.412737+02:00 s972 xapsd[46491] golang.org/x/net/http2.(*ClientConn).RoundTrip(0xc003838180, 0xc022a4f500)
2023-04-27T11:15:32.412740+02:00 s972 xapsd[46491]  golang.org/x/net/http2/transport.go:1269 +0x491
2023-04-27T11:15:32.412743+02:00 s972 xapsd[46491] golang.org/x/net/http2.(*Transport).RoundTripOpt(0xc0037a8750, 0xc022a4f500, {0xe0?})
2023-04-27T11:15:32.412745+02:00 s972 xapsd[46491]  golang.org/x/net/http2/transport.go:561 +0x1c5
2023-04-27T11:15:32.412749+02:00 s972 xapsd[46491] golang.org/x/net/http2.(*Transport).RoundTrip(0xc022a4f500?, 0x8f5ac0?)
2023-04-27T11:15:32.412753+02:00 s972 xapsd[46491]  golang.org/x/net/http2/transport.go:513 +0x1b
2023-04-27T11:15:32.412756+02:00 s972 xapsd[46491] net/http.send(0xc022a4f400, {0x8f5ac0, 0xc0037a8750}, {0x8?, 0x8424a0?, 0xb570e0?})
2023-04-27T11:15:32.412758+02:00 s972 xapsd[46491]  net/http/client.go:252 +0x5f7
2023-04-27T11:15:32.412769+02:00 s972 xapsd[46491] net/http.(*Client).send(0xc006adb770, 0xc022a4f400, {0x100000044fed9?, 0x829178ec8?, 0xb570e0?})
2023-04-27T11:15:32.412773+02:00 s972 xapsd[46491]  net/http/client.go:176 +0x9b
2023-04-27T11:15:32.412780+02:00 s972 xapsd[46491] net/http.(*Client).do(0xc006adb770, 0xc022a4f400)
2023-04-27T11:15:32.412785+02:00 s972 xapsd[46491]  net/http/client.go:716 +0x8fb
2023-04-27T11:15:32.412789+02:00 s972 xapsd[46491] net/http.(*Client).Do(...)
2023-04-27T11:15:32.412791+02:00 s972 xapsd[46491]  net/http/client.go:582
2023-04-27T11:15:32.412796+02:00 s972 xapsd[46491] github.com/sideshow/apns2.(*Client).PushWithContext(0xc000031ea0, {0x8290ee5d8?, 0xc00002c150}, 0xc001cec200)
2023-04-27T11:15:32.412805+02:00 s972 xapsd[46491]  github.com/sideshow/apns2/client.go:186 +0x1e5
2023-04-27T11:15:32.412808+02:00 s972 xapsd[46491] github.com/sideshow/apns2.(*Client).Push(...)
2023-04-27T11:15:32.412812+02:00 s972 xapsd[46491]  github.com/sideshow/apns2/client.go:156
2023-04-27T11:15:32.412815+02:00 s972 xapsd[46491] github.com/freswa/dovecot-xaps-daemon/internal.(*Apns).SendNotification(0xc0000a63c0, {{0xc00bce8080, 0x40}, {0xc007600cc0, 0x24}}, 0x0?)
2023-04-27T11:15:32.412819+02:00 s972 xapsd[46491]  github.com/freswa/dovecot-xaps-daemon/internal/apns.go:169 +0x55b
2023-04-27T11:15:32.412823+02:00 s972 xapsd[46491] github.com/freswa/dovecot-xaps-daemon/internal.(*Apns).checkDelayed(0xc0000a63c0)
2023-04-27T11:15:32.412827+02:00 s972 xapsd[46491]  github.com/freswa/dovecot-xaps-daemon/internal/apns.go:138 +0x41b
2023-04-27T11:15:32.412834+02:00 s972 xapsd[46491] github.com/freswa/dovecot-xaps-daemon/internal.(*Apns).createDelayedNotificationThread.func1()
2023-04-27T11:15:32.412838+02:00 s972 xapsd[46491]  github.com/freswa/dovecot-xaps-daemon/internal/apns.go:120 +0x53
2023-04-27T11:15:32.412842+02:00 s972 xapsd[46491] created by github.com/freswa/dovecot-xaps-daemon/internal.(*Apns).createDelayedNotificationThread
2023-04-27T11:15:32.412848+02:00 s972 xapsd[46491]  github.com/freswa/dovecot-xaps-daemon/internal/apns.go:118 +0x8d
Fogelholk commented 1 year ago

Thought I might add that maybe you wanna know dependency versions as well (should be same as this github repo however): https://cgit.freebsd.org/ports/tree/net/xapsd/Makefile

freswa commented 1 year ago

Could you check https://github.com/freswa/dovecot-xaps-daemon/pull/27?

Fogelholk commented 1 year ago

Thanks, testing it now! It could sometimes take a few days before it was crashing before, so I'll get back in a week or so if there are no crashes, or sooner if there are some crashes :)

freswa commented 1 year ago

Please use the latest version in that PR. The former one created deadlocks and I forgot to run the tests before pushing -.-

Fogelholk commented 1 year ago

Shit happends :)

Testing with 1945a9f instead!

freswa commented 1 year ago

Better use 97d97d306f83e6fcead5b6d02e98200541f63668 :disappointed:

Fogelholk commented 1 year ago

Testing with 97d97d3 ! ^^

Fogelholk commented 1 year ago

Looks like 97d97d3 is working fine for me, no crashes due to "concurrent map read and map write" so I think this one might be solved!

I did however just now get another crash with this message:

2023-05-04T23:22:30.986628+02:00 s972 xapsd[18564] time="2023-05-04T23:22:30+02:00" level=fatal msg="Error:Post \"https://api.push.apple.com/3/device/XXX\": http2: client connection lost"