Closed delgado23 closed 2 months ago
Hello,
Do you have another script or program running simultaneously (even another instance of the bouncer) that would perform operations on the KV store? The error makes me think the KV namespace somehow got deleted (the bouncer will only try to delete it on shutdown or startup to clean up potential leftover resources).
If you go to the Cloudflare console when this error occurs, do you see the CROWDSECCFBOUNCERNS namespace?
@blotus Yes I have two servers protecting different sites. Next time this happens I can see if the CROWDSECCFBOUNCERNS namespace is gone or not. Currently, it is there because I restarted the services.
Are those two servers interacting with the same Cloudflare account? If so, that's probably the issue. I don't think we ever tested this configuration, and I would be very surprised if it worked.
The name of the KV store is currently hardcoded in the bouncer, meaning that when a new instance starts, it will see the existing namespace and try to delete it.
You can test this by just restarting one of the two bouncers, and the other one should start to misbehave the next time it tries to delete a decision.
@blotus It looks like it started happening again on one server and not the other one yet. These are indeed both on the same Cloudflare account.
server 1
systemctl status crowdsec-cloudflare-worker-bouncer.service
● crowdsec-cloudflare-worker-bouncer.service - CrowdSec bouncer for Cloudflare
Loaded: loaded (/usr/lib/systemd/system/crowdsec-cloudflare-worker-bouncer.service; enabled; preset: disabled)
Active: active (running) since Wed 2024-09-18 09:28:16 EDT; 6h ago
Process: 806863 ExecStartPre=/usr/bin/crowdsec-cloudflare-worker-bouncer -c /etc/crowdsec/bouncers/crowdsec-cloudflare-worker-bouncer.yaml -t (code=exited, status=0/SUCCESS)
Main PID: 806874 (crowdsec-cloudf)
Tasks: 14 (limit: 74593)
Memory: 65.7M
CPU: 1min 1.097s
CGroup: /system.slice/crowdsec-cloudflare-worker-bouncer.service
└─806874 /usr/bin/crowdsec-cloudflare-worker-bouncer -c /etc/crowdsec/bouncers/crowdsec-cloudflare-worker-bouncer.yaml
Sep 18 09:28:16 overwatch.garaventaville.com systemd[1]: Starting CrowdSec bouncer for Cloudflare...
Sep 18 09:28:16 overwatch.garaventaville.com systemd[1]: Started CrowdSec bouncer for Cloudflare.
level=error msg="account , unable to process deleted decisions: bulk remove keys: 'namespace not found' (10013)"
level=error msg="The internal cache of the bouncer is now likely out of sync, and likely needs a restart"
level=error msg="If this error persists, please open an issue on https://github.com/crowdsecurity/cs-cloudflare-worker-bouncer/issues"
Server 2
systemctl status crowdsec-cloudflare-worker-bouncer.service
● crowdsec-cloudflare-worker-bouncer.service - CrowdSec bouncer for Cloudflare
Loaded: loaded (/usr/lib/systemd/system/crowdsec-cloudflare-worker-bouncer.service; enabled; preset: disabled)
Active: active (running) since Wed 2024-09-18 12:20:53 EDT; 3h 24min ago
Process: 2624 ExecStartPre=/usr/bin/crowdsec-cloudflare-worker-bouncer -c /etc/crowdsec/bouncers/crowdsec-cloudflare-worker-bouncer.yaml -t (code=exited, status=0/SUCCESS)
Main PID: 2639 (crowdsec-cloudf)
Tasks: 10 (limit: 23106)
Memory: 76.2M
CPU: 44.371s
CGroup: /system.slice/crowdsec-cloudflare-worker-bouncer.service
└─2639 /usr/bin/crowdsec-cloudflare-worker-bouncer -c /etc/crowdsec/bouncers/crowdsec-cloudflare-worker-bouncer.yaml
time="2024-09-18T15:25:37-04:00" level=info msg="Received 150 deleted decisions"
time="2024-09-18T15:25:37-04:00" level=info msg="Deleting 150 decisions" account=
time="2024-09-18T15:25:37-04:00" level=info msg="Deleted 150 decisions" account=
I still see the CROWDSECCFBOUNCERNS KV as seen in the attached screenshot.
Running two Cloudflare worker bouncers against the same account is unsupported. You can already define multiple zones within one configuration file, so a second instance should not be needed. As @blotus pointed out, the second instance sees a namespace with the same name, so it deletes it and publishes a new one, which changes the generated namespace ID. When the first instance then uses the old ID it still holds in memory to update the namespace, the call fails because that namespace no longer exists.
The fix is to use one remediation and define multiple zones within the same configuration file.
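To make the failure mode concrete, here is a minimal toy model of the behavior described in this thread. It is only a sketch: `FakeKVAccount`, `ToyBouncer`, and `simulate` are hypothetical names invented for illustration. They are not the bouncer's code or the Cloudflare API. The model assumes only what was stated above: the namespace name is hardcoded, an instance deletes any existing namespace with that name on startup, and each instance keeps the namespace ID it created in memory.

```python
import itertools


class NamespaceNotFound(Exception):
    """Raised by the toy account when a namespace ID no longer exists."""


class FakeKVAccount:
    """Toy stand-in for one account's KV namespaces (NOT the real Cloudflare API)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.namespaces = {}  # namespace ID -> {"title": str, "keys": set}

    def find_by_title(self, title):
        for ns_id, ns in self.namespaces.items():
            if ns["title"] == title:
                return ns_id
        return None

    def create(self, title):
        ns_id = f"ns-{next(self._ids)}"  # every new namespace gets a fresh ID
        self.namespaces[ns_id] = {"title": title, "keys": set()}
        return ns_id

    def delete(self, ns_id):
        del self.namespaces[ns_id]

    def bulk_remove(self, ns_id, keys):
        if ns_id not in self.namespaces:
            raise NamespaceNotFound("namespace not found")
        self.namespaces[ns_id]["keys"] -= set(keys)


class ToyBouncer:
    """Hypothetical model of one bouncer instance."""

    NAMESPACE_TITLE = "CROWDSECCFBOUNCERNS"  # same hardcoded name for every instance

    def __init__(self, account):
        self.account = account
        self.namespace_id = None  # cached in memory after startup

    def start(self):
        # Startup cleanup: remove any leftover namespace with the hardcoded name.
        existing = self.account.find_by_title(self.NAMESPACE_TITLE)
        if existing is not None:
            self.account.delete(existing)
        self.namespace_id = self.account.create(self.NAMESPACE_TITLE)

    def delete_decisions(self, keys):
        # Uses the cached ID, which may be stale if another instance restarted.
        self.account.bulk_remove(self.namespace_id, keys)


def simulate():
    account = FakeKVAccount()
    server1, server2 = ToyBouncer(account), ToyBouncer(account)
    server1.start()
    server2.start()  # deletes server1's namespace and creates a new one

    # A namespace with the expected name is still visible in the account.
    still_visible = account.find_by_title(ToyBouncer.NAMESPACE_TITLE) is not None

    try:
        server1.delete_decisions(["192.0.2.1"])
        server1_ok = True
    except NamespaceNotFound:
        server1_ok = False

    server2.delete_decisions(["192.0.2.1"])  # works: its cached ID is current
    return still_visible, server1_ok
```

In this model, a namespace named CROWDSECCFBOUNCERNS is still present after the second instance starts, which is consistent with the console screenshot, yet the first instance fails because the ID it cached refers to the namespace that was deleted.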
Thank you @LaurenceJJones I will give that a try.
I started seeing this error in my logs a couple of days ago.
When I restart the bouncer, it works again for about a day, then it starts throwing this error message again.