Closed · ImranR98 closed this 1 week ago
@ImranR98: Thanks for opening an issue, it is currently awaiting triage.
If you haven't already, please provide the following information:
- kind: `bug`, `enhancement`, or `documentation`
- area: `agent`, `appsec`, `configuration`, `cscli`, `local-api`
@ImranR98: There is no 'kind' label on this issue. You need a 'kind' label to start the triage process, for example:
/kind bug
/kind documentation
/kind enhancement

/kind documentation /area local-api
Hi, the solution is to check in the chart whether replicas are enabled (more than 1) and, if so, suffix the `CUSTOM_HOSTNAME` env var with an index.
Discussed with @blotus.
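Roughly along these lines in the LAPI container template (a sketch only; the template fragment and exact condition are my assumptions, the actual PR may differ):

```yaml
# Hypothetical fragment of the LAPI deployment template's env list:
# only override CUSTOM_HOSTNAME when more than one replica is requested.
{{- if gt (int .Values.lapi.replicas) 1 }}
- name: CUSTOM_HOSTNAME
  valueFrom:
    fieldRef:
      # The pod's generated name is unique per replica, which acts as
      # the "index suffix" and avoids machine-name collisions.
      fieldPath: metadata.name
{{- end }}
```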
I'm not sure I understand, but glad to see there's a PR to fix it :rocket:
Just to clarify, does this mean that - even without the PR you made - Crowdsec is actually working as expected aside from `cscli` availability? I assumed the lack of `cscli` access meant there was something else wrong with the pod.
So, a not-so-short tl;dr:

When the LAPI pods come up, each one needs working credentials, so it executes a direct machine add command (`cscli machines add`), and by default the container chooses the name "localhost" as the default value of `CUSTOM_HOSTNAME`. Since both LAPIs use the same name, the startup script deletes the LAPI credentials that were registered just before it (each pod believes itself to be unique: if the name already exists, it assumes the old LAPI pod was deleted and its credentials lost). Hence you get one LAPI that works with `cscli` and another that does not.

The side effect is that one of the LAPIs will keep working for a couple of hours because its JWT token is still valid; once the token expires, that LAPI starts getting authentication errors, since the username and password it previously registered no longer exist in the database.
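In pseudo-shell, the collision looks roughly like this (the flags are real `cscli` flags, but this is a simplification of the effective startup behavior, not the literal entrypoint script):

```sh
# Every replica falls back to the same default machine name.
CUSTOM_HOSTNAME="${CUSTOM_HOSTNAME:-localhost}"

# Registers the machine and writes fresh credentials into this pod's
# local credentials file. Because the name "localhost" already exists,
# the registration made moments earlier by the other replica is
# overwritten, so that pod's saved password no longer matches what is
# in the shared database.
cscli machines add "$CUSTOM_HOSTNAME" --auto --force
```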
The fix: we now force each LAPI to have a unique name by using the pod's randomly generated name from the pod metadata, which stops the name collision.
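If you want to sanity-check it on your side (the pod name is a placeholder):

```sh
# With the fix, each replica should show up as its own validated
# machine, instead of both replicas fighting over a single
# "localhost" entry.
kubectl exec -it <lapi-pod> -- cscli machines list
```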
Okay that makes sense, thanks for the explanation!
I've been trying to get this to work in a small testing environment with Traefik. My current config seems to work fine with a single LAPI pod backed by a Postgres DB and connected to 2 agents on 2 nodes.
But if I try setting the `lapi.replicas` value to `2`, I get the following error in one of the two pods when I try to run a `cscli` command (like `cscli decisions list`):

```
level=fatal msg="unable to retrieve decisions: performing request: Get \"http://localhost:8080/v1/alerts?has_active_decision=true&include_capi=false&limit=100\": API error: incorrect Username or Password"
command terminated with exit code 1
```
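(For context, I'm running `cscli` through `kubectl exec`, along these lines; the pod name and namespace are placeholders:)

```sh
# "command terminated with exit code 1" above is kubectl exec relaying
# cscli's fatal exit status from inside the pod.
kubectl exec -it <crowdsec-lapi-pod> -n <namespace> -- cscli decisions list
```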
This is my `values.yaml`:

My assumption was that since I have disabled persistent volumes and configured a DB instead, both LAPI instances would connect to the same DB and have no issues. But I've clearly misunderstood how everything fits together. Would appreciate anyone pointing me in the right direction!