hslatman / caddy-crowdsec-bouncer

A Caddy module that blocks malicious traffic based on decisions made by CrowdSec.
139 stars 4 forks source link

Caddy Restarting: invalid memory address or nil pointer dereference #14

Closed tomcrawf90 closed 7 months ago

tomcrawf90 commented 2 years ago

Hi, over the last few weeks I've noticed short interruptions in connectivity when using application running on a Caddy server. After digging into the logs to figure out the issue I've come across the following. This appears to be causing Caddy to restart.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x19b1923]

goroutine 2211 [running]:
github.com/hslatman/caddy-crowdsec-bouncer/internal/bouncer.(*crowdSecStore).delete(0x0, 0xc002a8dfa0?)
  github.com/hslatman/caddy-crowdsec-bouncer@v0.2.0/internal/bouncer/store.go:77 +0xa3
github.com/hslatman/caddy-crowdsec-bouncer/internal/bouncer.(*Bouncer).delete(...)
  github.com/hslatman/caddy-crowdsec-bouncer@v0.2.0/internal/bouncer/bouncer.go:152
github.com/hslatman/caddy-crowdsec-bouncer/internal/bouncer.(*Bouncer).Run.func1()
  github.com/hslatman/caddy-crowdsec-bouncer@v0.2.0/internal/bouncer/bouncer.go:95 +0x228
created by github.com/hslatman/caddy-crowdsec-bouncer/internal/bouncer.(*Bouncer).Run
  github.com/hslatman/caddy-crowdsec-bouncer@v0.2.0/internal/bouncer/bouncer.go:86 +0x6a

I'm running Caddy 2.5.2 in Docker with the following plugins enabled:

Any assistance would be appreciated.

Thanks

hslatman commented 2 years ago

Thank you for your report. I'll take a look soon.

Do you observe some temporal pattern to it happening?

tomcrawf90 commented 2 years ago

Thanks for the response @hslatman

So this appears to reoccur on an hourly basis. However there seems to be the odd occasion it appears on half hour intervals also. I get the error twice with the exception being logged below it. Then what I believe is docker-compose restarting the service based on having restart: always configured on the container.

Below is output from my log collection showing the frequency of the error.

 Sep 13 08:28:16   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 08:28:19   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 09:28:19   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 09:28:22   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 10:28:22   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 10:28:26   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 11:28:25   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 11:28:29   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 12:27:59   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 12:28:02   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 12:29:02   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 12:29:06   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 13:28:05   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 13:28:09   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 14:28:09   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 14:28:11   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 15:28:12   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 15:28:15   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 16:28:16   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 16:28:18   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 17:28:18   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 17:28:22   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 17:57:21   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 17:57:24   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 18:28:24   panic: runtime error: invalid memory address or nil pointer dereference
 Sep 13 18:28:27   panic: runtime error: invalid memory address or nil pointer dereference
hslatman commented 2 years ago

So, I think I know what's going on, but I'll need to verify if my understanding of Caddy configuration reloads and how these affect module (re)instantiation is correct.

The https://github.com/lucaslorentz/caddy-docker-proxy module is capable of performing automatic Caddy configuration reloads. When the configuration is reloaded, I think the CrowdSec module is stopped, resulting in the storage being set to nil. The background process running in a goroutine, however, is not stopped, so when it gets a new decision from the LAPI, it'll break. Docker will then restart the Caddy instance, resulting in normal operation again. Gracefully stopping the background goroutines that are connected to the LAPI is probably what I'll implement to fix this.

What I don't understand yet is why it happens at an hourly interval (mostly). The above hypothesis may be wrong, but from a quick skim this looks to be the way the *crowdSecStore is set to nil.

arashsm79 commented 1 year ago

Are there any temporary workarounds for this? Having the same issue with caddy-docker-proxy, caddy 2.7.4, and bouncer 0.3.1. Caddy crashes with the following error message and restarts whenever a decision is made.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1738519]

goroutine 246 [running]:
github.com/hslatman/caddy-crowdsec-bouncer/internal/bouncer.(*crowdSecStore).delete(0x0, 0xc0007a50e0?)
        github.com/hslatman/caddy-crowdsec-bouncer@v0.3.1/internal/bouncer/store.go:77 +0x99
github.com/hslatman/caddy-crowdsec-bouncer/internal/bouncer.(*Bouncer).delete(...)
        github.com/hslatman/caddy-crowdsec-bouncer@v0.3.1/internal/bouncer/bouncer.go:152
github.com/hslatman/caddy-crowdsec-bouncer/internal/bouncer.(*Bouncer).Run.func1()
        github.com/hslatman/caddy-crowdsec-bouncer@v0.3.1/internal/bouncer/bouncer.go:95 +0x215
created by github.com/hslatman/caddy-crowdsec-bouncer/internal/bouncer.(*Bouncer).Run in goroutine 233
        github.com/hslatman/caddy-crowdsec-bouncer@v0.3.1/internal/bouncer/bouncer.go:86 +0x5f
tomcrawf90 commented 1 year ago

No, I run mine in docker so the container just restarts leaving me with very short downtime of a few seconds an hour. I can live with that but it's not really a workaround.

hslatman commented 12 months ago

A workaround that may be suitable is to disable the streaming mode (disable_streaming in the Caddyfile; "enable_streaming": true in the JSON format), which will result in Caddy making a request to the CrowdSec LAPI for every incoming request. It may not be the best solution if you get a lot of traffic, but at least it doesn't crash, since it doesn't rely on the *crowdSecStore.

That said, I'm working on some upgrades in https://github.com/hslatman/caddy-crowdsec-bouncer/pull/20 that improve the runtime. I think that's a good starting point for fixing this issue, so I'll do that after.

hslatman commented 10 months ago

In https://github.com/hslatman/caddy-crowdsec-bouncer/pull/23, which is part of https://github.com/hslatman/caddy-crowdsec-bouncer/releases/tag/v0.5.0, I've changed the behavior of the bouncer shutdown to not clean up the underlying store. This should prevent the nil pointer from happening when Caddy's configuration is reloaded. I've also changed the behavior of the goroutines that operate on the store, so they should be stopped in time now too.

I'm keeping the issue open a little longer to verify it's fixed and to have a more complete solution for when (parts of) the configuration change that affect the bouncer itself.

AskAlice commented 10 months ago

I had a similar issue here, but the trace looks different than what I see above

Nov 18 00:30:52 caddy caddy[4402]: panic: runtime error: invalid memory address or nil pointer dereference
Nov 18 00:30:52 caddy caddy[4402]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x1701b69]
Nov 18 00:30:52 caddy caddy[4402]: goroutine 820335 [running]:
Nov 18 00:30:53 caddy caddy[4402]: github.com/hslatman/caddy-crowdsec-bouncer/internal/bouncer.(*Bouncer).Run.func2()
Nov 18 00:30:53 caddy caddy[4402]:         github.com/hslatman/caddy-crowdsec-bouncer@v0.5.0/internal/bouncer/bouncer.go:123 +0xc9
Nov 18 00:30:53 caddy caddy[4402]: created by github.com/hslatman/caddy-crowdsec-bouncer/internal/bouncer.(*Bouncer).Run in goroutine 1591
Nov 18 00:30:53 caddy caddy[4402]:         github.com/hslatman/caddy-crowdsec-bouncer@v0.5.0/internal/bouncer/bouncer.go:114 +0xe5
hslatman commented 10 months ago

@AskAlice I've tagged https://github.com/hslatman/caddy-crowdsec-bouncer/releases/tag/v0.5.3. I couldn't reproduce the issue, but based solely on the version you used and the line numbers, I think this is the only cause for a nil pointer in this case.

hslatman commented 7 months ago

With the release of https://github.com/hslatman/caddy-crowdsec-bouncer/releases/tag/v0.6.0, which has more improvements related to configuration (re)loading and (re)instantiation of the module, this issue can be closed. While I haven't tested with https://github.com/lucaslorentz/caddy-docker-proxy specifically, any changes in configuration and (re)loading Caddy should now behave as expected. Feel free to reopen if the issue surfaces again.