Hey @philipclaesson ,
I think we have a missing part in OPAL that allows the /healthcheck to fail.
As a workaround, you can set the OPAL_OPA_HEALTH_CHECK_POLICY_ENABLED environment variable to true in the Client's deployment.
This should make the /healthcheck endpoint work properly.
Lmk how it goes.
Thanks @RazcoDev! Setting OPAL_OPA_HEALTH_CHECK_POLICY_ENABLED helped fix the healthcheck endpoint.
However, the healthcheck itself still fails. Looking at opa_client.py, this seems to mean that either a data or a policy transaction did not succeed or did not happen.
Looking at the data in http://localhost:8181/v1/data/system/opal, it looks like there have been no data transactions:
{
  "result": {
    "healthy": false,
    "last_data_transaction": {},
    "last_failed_data_transaction": {},
    "last_failed_policy_transaction": {},
    "last_policy_transaction": {
      "actions": [
        "set_policies"
      ],
      "creation_time": "2023-02-14T07:55:58.901524",
      "end_time": "2023-02-14T07:55:58.955286",
      "error": "",
      "id": "b21a57c305783805bc28b4fc134cb5c27cda967b",
      "remotes_status": [
        {
          "error": null,
          "remote_url": "http://nv-authorization-server:7002/policy",
          "succeed": true
        }
      ],
      "success": true,
      "transaction_type": "policy"
    },
    "ready": false,
    "transaction_data_statistics": {
      "failed": 0,
      "successful": 0
    },
    "transaction_policy_statistics": {
      "failed": 0,
      "successful": 1
    }
  }
}
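For anyone else reading a status document like this, here is a minimal Python sketch of how one might list the reasons it looks unhealthy. It only encodes the reading given above (a policy or data transaction either failed or never happened); it is not OPAL's actual healthcheck policy, and the `diagnose` helper is a hypothetical name. The sample is a trimmed copy of the response shown above.

```python
import json
from typing import List


def diagnose(opal_status: dict) -> List[str]:
    """Return reasons the status document looks unhealthy.

    Assumption (not OPAL's real policy): a deployment is considered healthy
    only if the last policy AND the last data transaction exist and succeeded.
    """
    result = opal_status.get("result", {})
    reasons = []
    for kind in ("policy", "data"):
        last = result.get(f"last_{kind}_transaction") or {}
        if not last:
            # An empty object means no transaction of this kind was recorded.
            reasons.append(f"no {kind} transaction has been recorded")
        elif not last.get("success", False):
            reasons.append(f"last {kind} transaction failed: {last.get('error')}")
    return reasons


# Trimmed copy of the /v1/data/system/opal response from this comment.
sample = json.loads("""
{
  "result": {
    "healthy": false,
    "last_data_transaction": {},
    "last_policy_transaction": {"success": true, "transaction_type": "policy"},
    "ready": false
  }
}
""")

print(diagnose(sample))  # ['no data transaction has been recorded']
```

With the response above, the only reason reported is the missing data transaction, which matches the empty `last_data_transaction` object.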
I have not set any dataConfigSources yet, I'm assuming this is the problem?
dataConfigSources:
  config:
    entries: []
Setting OPAL_DATA_UPDATER_ENABLED to false made the healthcheck pass and deployment succeed!
I think we have a missing part in OPAL that allows the /healthcheck to fail.
Would this missing part be the empty dataConfigSources? Or something else?
Yes, the moment you set data config sources, these statistics will get updated. As for the failing healthcheck, it's actually because of a missing condition there. It's a pretty simple thing, and we'll take care of it.
Cool, thanks a lot for helping out!
Hey @RazcoDev!
I'm trying to deploy OPAL with default configuration using this helm chart v0.0.7 on AWS EKS. Kubernetes version is v1.21.5-eks-9017834.
This gives me three pods. The pgsql and server work fine but the client is not healthy.
Pulling the logs from the client pod shows me that it is crashing in the healthcheck method of client.py: https://github.com/permitio/opal/blob/master/packages/opal-client/opal_client/client.py#L212
I understand that the client is trying to query the healthcheck policy in OPA, and for some reason that data is not there.
OPA is up and running, and I can reach the web interface and also run curl /v1/data or /v1/policies. However, /v1/data/system/opal/healthy just times out.
Any ideas of what could be the error here?