Closed darkweaver87 closed 6 months ago
Hi,
Thanks for your interest in the plugin. We're discussing the issue you encountered with @maxlerebourg.
bouncer.go, line 280:
// Right here, if we cannot join the stream, we forbid the request from going on.
if bouncer.crowdsecMode == configuration.StreamMode || bouncer.crowdsecMode == configuration.AloneMode {
	if isCrowdsecStreamHealthy {
		handleNextServeHTTP(bouncer, remoteIP, rw, req)
	} else {
		bouncer.log.Debug(fmt.Sprintf("ServeHTTP isCrowdsecStreamHealthy:false ip:%s", remoteIP))
		handleBanServeHTTP(bouncer, rw)
	}
}
I'm thinking about an internal counter that allows the stream to be unhealthy X times in a row before requests start getting a 403.
The updateInterval multiplied by that counter would then give the grace period.
With some default variables, for example:
streamUnhealthyMaxTime=3
UpdateIntervalSeconds=60
So instead of blocking after 1 min if the LAPI is unreachable, requests would be blocked after 3 min.
A successful sync with the LAPI would reset that counter.
Hello,
Thank you for your feedback :-) Looks good to me :-)
Thanks :+1:
Rémi
Hi,
We're almost done implementing it. I tested the basic behavior yesterday.
We should merge and release a beta version very soon.
Is your feature request related to a problem? Please describe. 🐛
We implemented a PoC with your wonderful plugin and we would like to put it in production, but we still have one remaining issue when using stream mode (using another mode doesn't change anything).
A free CrowdSec deployment relies on agents sending their decisions to a local API. This LAPI can't be scaled by design, as that would mean agents could try to send their data to an LAPI they are not registered on.
Consequently, this means that, technically speaking, we can "lose" the LAPI for a given amount of time, and it can be unavailable during the cache refresh. If that's the case, Traefik returns a 403.
Even if I tend to agree that it's a good security practice to block when there is a doubt about some services, that's not really ideal. In my case, I need to allow users to access the service during such a failure.
Describe the solution you'd like ✨
Thus, I was thinking about either:
I will be happy to contribute, just let me know your thoughts on this :-)
Additional context