permitio / opal

Policy and data administration, distribution, and real-time updates on top of Policy Agents (OPA, Cedar, ...)
https://opal.ac
Apache License 2.0

Healthcheck always returns false with option `OPAL_DATA_UPDATER_ENABLED=False` #181

Closed gdmarsh closed 2 years ago

gdmarsh commented 2 years ago

I was looking at enabling the healthcheck policy, since it is useful for configuring probes in a Kubernetes environment. In my setup there is currently no requirement to update data, only policy, so I disabled data updates with the `OPAL_DATA_UPDATER_ENABLED` option. It turns out that with data updates disabled, the healthcheck never returns true.

Steps to reproduce

  1. Update docker-compose-with-callbacks.yml to disable data updates (gdmarsh/opal@44e6db9)
  2. Start the stack with docker-compose
  3. Observe the client logs and/or `curl` `v1/data/system/opal` to see that `ready` and `healthy` both return false and never become true
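For reference, a minimal sketch of the change in step 1 (service name and file layout assumed from the example compose file, not copied from the linked commit):

```yaml
# docker-compose-with-callbacks.yml (excerpt, assumed layout)
services:
  opal_client:
    environment:
      # Disable the data updater; only policy updates are needed here
      - OPAL_DATA_UPDATER_ENABLED=False
```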
```
opal_client.opa.runner                  | INFO  | Running OPA initial start callbacks
opal_client.policy_store.opa_client     | INFO  | persisting health check policy: ready=false, healthy=false
opal_client.policy_store.opa_client     | INFO  | Policy and data statistics: policy: (successful 0, failed 0); data: (successful 0, failed 0)
opal_client.opa.logger                  | INFO  | Received request.    PUT /v1/policies/opa/healthcheck/opal.rego
opal_client.opa.logger                  | INFO  | Sent response.       PUT /v1/policies/opa/healthcheck/opal.rego -> 200
opal_client.policy.updater              | INFO  | Launching policy updater
opal_client.policy.updater              | INFO  | Subscribing to topics: ['policy:.']
fastapi_websocket_pubsub.pub_sub_client | INFO  | Trying to connect to Pub/Sub server - ws://opal_server:7002/ws
fastapi_websocket_rpc.websocket_rpc_c...| INFO  | Trying server - ws://opal_server:7002/ws
opal_client.policy.updater              | INFO  | Connected to server
opal_client.policy.updater              | INFO  | Refetching policy code (full bundle)
opal_client.policy.updater              | INFO  | Got policy bundle with 2 rego files, 1 data files, commit hash: '6d849b1ce92bae05c31146470e66e07c3c41b164'
opal_client.opa.logger                  | INFO  | Received request.    GET /v1/policies
opal_client.opa.logger                  | INFO  | Sent response.       GET /v1/policies -> 200
opal_client.opa.logger                  | INFO  | Received request.    PUT /v1/data
opal_client.opa.logger                  | INFO  | Sent response.       PUT /v1/data -> 204
opal_client.opa.logger                  | INFO  | Received request.    PUT /v1/policies/utils.rego
opal_client.opa.logger                  | INFO  | Sent response.       PUT /v1/policies/utils.rego -> 200
opal_client.opa.logger                  | INFO  | Received request.    PUT /v1/policies/rbac.rego
opal_client.opa.logger                  | INFO  | Sent response.       PUT /v1/policies/rbac.rego -> 200
opal_client.policy_store.opa_client     | INFO  | processing store transaction: {'id': '6d849b1ce92bae05c31146470e66e07c3c41b164', 'actions': ['set_policies'], 'transaction_type': <TransactionType.policy: 'policy'>, 'success': True, 'error': '', 'creation_time': '2021-11-10T13:44:07.012526', 'end_time': '2021-11-10T13:44:07.031347', 'remotes_status': [{'remote_url': 'http://opal_server:7002/policy', 'succeed': True, 'error': None}]}
opal_client.policy_store.opa_client     | INFO  | persisting health check policy: ready=false, healthy=false
opal_client.policy_store.opa_client     | INFO  | Policy and data statistics: policy: (successful 1, failed 0); data: (successful 0, failed 0)
opal_client.opa.logger                  | INFO  | Received request.    PUT /v1/policies/opa/healthcheck/opal.rego
opal_client.opa.logger                  | INFO  | Sent response.       PUT /v1/policies/opa/healthcheck/opal.rego -> 200
fastapi_websocket_pubsub.pub_sub_client | INFO  | Connected to PubSub server ws://opal_server:7002/ws
```
asafc commented 2 years ago

oh that is a good catch @gdmarsh :)

The way the healthcheck policy's `healthy` rule works, it requires at least one successful update from both the policy updater and the data updater - since the data updater is off, no data updates are ever made, so the rule can never be satisfied.
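As a minimal sketch (not OPAL's actual code), the gating logic and one possible fix look roughly like this - only require data-update success when the data updater is actually enabled:

```python
# Hypothetical sketch of the healthcheck's "healthy" condition.
# policy_ok / data_ok stand in for the successful-transaction counters
# seen in the client's statistics log lines.

def is_healthy(policy_ok: int, data_ok: int, data_updater_enabled: bool) -> bool:
    if not data_updater_enabled:
        # Proposed fix: ignore data updates when the updater is disabled
        return policy_ok > 0
    # Current behavior: both updaters must have succeeded at least once
    return policy_ok > 0 and data_ok > 0

# With the data updater disabled, one successful policy update suffices
print(is_healthy(policy_ok=1, data_ok=0, data_updater_enabled=False))  # True
```

This matches the log above: one successful policy transaction (`policy: (successful 1, failed 0)`) but zero data transactions, so the unpatched condition stays false forever.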

It's a super easy fix - will get to it this weekend :)