cvat-ai / cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
https://cvat.ai
MIT License

OPAHealthCheck Internal Server Error for url: http://opa:8181/health?bundles #8451

Closed Shadowfear36 closed 1 month ago

Shadowfear36 commented 1 month ago

Steps to Reproduce

CVAT Version: CVAT:DEV. I have been running this deployment for a while. I recently rebuilt the containers, and the error below has appeared ever since.

Error: OPAHealthCheck ... unknown error: 500 Server Error: Internal Server Error for url: http://opa:8181/health?bundles

Command used to build: sudo -E docker compose -f docker-compose.yml -f docker-compose.external_db.yml -f components/serverless/docker-compose.serverless.yml -f docker-compose.https.yml up --build --force-recreate

cvat_opa

{"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp: lookup cvat-server on 127.0.0.11:53: server misbehaving","name":"cvat","plugin":"bundle","time":"2024-09-17T21:15:00Z"}
{"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp: lookup cvat-server on 127.0.0.11:53: server misbehaving","name":"cvat","plugin":"bundle","time":"2024-09-17T21:15:00Z"}
{"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp: lookup cvat-server on 127.0.0.11:53: server misbehaving","name":"cvat","plugin":"bundle","time":"2024-09-17T21:15:00Z"}
{"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp: lookup cvat-server on 127.0.0.11:53: server misbehaving","name":"cvat","plugin":"bundle","time":"2024-09-17T21:15:00Z"}
{"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp: lookup cvat-server on 127.0.0.11:53: server misbehaving","name":"cvat","plugin":"bundle","time":"2024-09-17T21:15:01Z"}
{"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp 172.29.0.14:8080: connect: connection refused","name":"cvat","plugin":"bundle","time":"2024-09-17T21:15:02Z"}
{"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp 172.29.0.14:8080: connect: connection refused","name":"cvat","plugin":"bundle","time":"2024-09-17T21:15:03Z"}
{"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp 172.29.0.14:8080: connect: connection refused","name":"cvat","plugin":"bundle","time":"2024-09-17T21:15:05Z"}
{"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp 172.29.0.14:8080: connect: connection refused","name":"cvat","plugin":"bundle","time":"2024-09-17T21:15:07Z"}
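The log above contains two distinct failure modes: first the DNS lookup for cvat-server fails, then the name resolves but the connection is refused. As a rough aid for reading longer logs, the sketch below tallies bundle-load failures by cause. The function name `classify_bundle_errors` and the category labels are illustrative assumptions, not part of OPA or CVAT; the two sample lines are copied from the log above.

```python
import json
from collections import Counter

# Two lines copied from the cvat_opa log above (raw strings keep the \" escapes).
SAMPLE_LOG = "\n".join([
    r'{"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp: lookup cvat-server on 127.0.0.11:53: server misbehaving","name":"cvat","plugin":"bundle","time":"2024-09-17T21:15:00Z"}',
    r'{"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp 172.29.0.14:8080: connect: connection refused","name":"cvat","plugin":"bundle","time":"2024-09-17T21:15:02Z"}',
])


def classify_bundle_errors(log_text):
    """Count 'Bundle load failed' entries by cause (hypothetical helper)."""
    causes = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        msg = entry.get("msg", "")
        if not msg.startswith("Bundle load failed"):
            continue
        if "lookup " in msg and "server misbehaving" in msg:
            causes["dns_lookup_failed"] += 1
        elif "connection refused" in msg:
            causes["connection_refused"] += 1
        else:
            causes["other"] += 1
    return causes


if __name__ == "__main__":
    for cause, count in classify_bundle_errors(SAMPLE_LOG).items():
        print(f"{cause}: {count}")
```

To run it against the live container, the output of `docker logs cvat_opa` can be passed in as `log_text`.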

Health Check

python manage.py health_check
WARNING:HEALTH:getting Cache backend: default health status
WARNING:HEALTH:getting Cache backend: media health status
WARNING:HEALTH:gettin OPA health status
WARNING:HEALTH:DONE getting Cache backend: default health status
ERROR:health-check:unknown error: 500 Server Error: Internal Server Error for url: http://opa:8181/health?bundles
Traceback (most recent call last):
  File "/home/django/cvat/apps/health/backends.py", line 29, in check_status
    response.raise_for_status()
  File "/opt/venv/lib/python3.8/site-packages/requests/models.py", line 953, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http://opa:8181/health?bundles
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/health_check/backends.py", line 30, in run_check
    self.check_status()
  File "/home/django/cvat/apps/health/backends.py", line 31, in check_status
    raise HealthCheckException(str(e))
health_check.exceptions.HealthCheckException: unknown error: 500 Server Error: Internal Server Error for url: http://opa:8181/health?bundles
WARNING:HEALTH:DONE getting Cache backend: media health status
Cache backend: default   ... working
Cache backend: media     ... working
DatabaseBackend          ... working
DiskUsage                ... working
MigrationsHealthCheck    ... working
OPAHealthCheck           ... unknown error: 500 Server Error: Internal Server Error for url: http://opa:8181/health?bundles
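For context, the `?bundles` parameter makes OPA's health endpoint take bundle activation into account, so a 500 here is consistent with the bundle-load failures in the cvat_opa log rather than with OPA being down. The sketch below reproduces that probe with the standard library only; it is not CVAT's actual check (the traceback shows that one uses `requests`), and the function name `opa_bundles_ready` and its `opener` parameter are assumptions made so the probe can be exercised without a network.

```python
import urllib.error
import urllib.request


def opa_bundles_ready(base_url="http://opa:8181", opener=urllib.request.urlopen, timeout=5):
    """Return True if OPA reports healthy with bundle status included.

    Hypothetical helper: `opener` defaults to urllib's urlopen and exists
    only so a stand-in can be injected for testing.
    """
    url = base_url.rstrip("/") + "/health?bundles"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as exc:
        # A non-2xx answer, such as the 500 in the output above.
        print(f"OPA not ready: HTTP {exc.code} for {url}")
        return False
    except urllib.error.URLError as exc:
        # DNS failure, refused connection, and similar transport errors.
        print(f"OPA unreachable: {exc.reason}")
        return False


if __name__ == "__main__":
    print("ready" if opa_bundles_ready() else "not ready")
```

Run from inside a container on the same Docker network (for example cvat_server), so that the hostname `opa` resolves.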

OpenPolicyAgent appears to be running.

 docker ps
CONTAINER ID   IMAGE                                       COMMAND                  CREATED          STATUS                    PORTS                                                                                                                                                            NAMES
c8ed8eaafca0   cvat/ui:dev                                 "/docker-entrypoint.…"   13 minutes ago   Up 13 minutes             80/tcp                                                                                                                                                           cvat_ui
db4f33da87eb   cvat/server:dev                             "./backend_entrypoin…"   13 minutes ago   Up 13 minutes             8080/tcp                                                                                                                                                         cvat_worker_quality_reports
4ee86010847f   cvat/server:dev                             "./backend_entrypoin…"   13 minutes ago   Up 13 minutes             8080/tcp                                                                                                                                                         cvat_utils
5656f3bf0a32   cvat/server:dev                             "./backend_entrypoin…"   13 minutes ago   Up 13 minutes             8080/tcp                                                                                                                                                         cvat_worker_analytics_reports
2b335307065c   cvat/server:dev                             "./backend_entrypoin…"   13 minutes ago   Up 13 minutes             8080/tcp                                                                                                                                                         cvat_server
d7dbe243a0bd   cvat/server:dev                             "./backend_entrypoin…"   13 minutes ago   Up 13 minutes             8080/tcp                                                                                                                                                         cvat_worker_import
bbf0b8c3e148   timberio/vector:0.26.0-alpine               "/usr/local/bin/vect…"   13 minutes ago   Up 13 minutes                                                                                                                                                                              cvat_vector
6d15378648f2   cvat/server:dev                             "./backend_entrypoin…"   13 minutes ago   Up 13 minutes             8080/tcp                                                                                                                                                         cvat_worker_webhooks
40faf79a3111   cvat/server:dev                             "./backend_entrypoin…"   13 minutes ago   Up 13 minutes             8080/tcp                                                                                                                                                         cvat_worker_annotation
8f6e68500767   cvat/server:dev                             "./backend_entrypoin…"   13 minutes ago   Up 13 minutes             8080/tcp                                                                                                                                                         cvat_worker_export
c596d96f3c43   quay.io/nuclio/dashboard:1.13.0-amd64       "/docker-entrypoint.…"   13 minutes ago   Up 13 minutes (healthy)   80/tcp, 0.0.0.0:8070->8070/tcp, :::8070->8070/tcp                                                                                                                nuclio
7b8c0bffc8d4   grafana/grafana-oss:10.1.2                  "sh -euc 'mkdir -p /…"   13 minutes ago   Up 13 minutes             3000/tcp                                                                                                                                                         cvat_grafana
026b6421eca6   redis:7.2.3-alpine                          "docker-entrypoint.s…"   13 minutes ago   Up 13 minutes             6379/tcp                                                                                                                                                         cvat_redis_inmem
62db1ec657cb   clickhouse/clickhouse-server:23.11-alpine   "/entrypoint.sh"         13 minutes ago   Up 13 minutes             8123/tcp, 9000/tcp, 9009/tcp                                                                                                                                     cvat_clickhouse
307193fa0112   traefik:v2.9                                "/entrypoint.sh trae…"   13 minutes ago   Up 13 minutes             0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp, 0.0.0.0:8080->8080/tcp, :::8080->8080/tcp, 0.0.0.0:8090->8090/tcp, :::8090->8090/tcp   traefik
939fe31fa958   apache/kvrocks:2.7.0                        "kvrocks -c /var/lib…"   13 minutes ago   Up 13 minutes (healthy)   6666/tcp                                                                                                                                                         cvat_redis_ondisk
5f88ded404cb   openpolicyagent/opa:0.63.0                  "/opa run --server -…"   13 minutes ago   Up 13 minutes                                                                                                                                                                              cvat_opa
b9fb69b2b1f0   gcr.io/iguazio/alpine:3.17                  "/bin/sh -c '/bin/sl…"   23 minutes ago   Up 23 minutes                                                                                                                                                                              nuclio-local-storage-reader

Expected Behavior

I expected the rebuilt containers to work as before. I'm not pinned to a specific CVAT release, only CVAT:DEV, which may ultimately be my problem. I need to keep this setup, though, because I already have data in a volume and in the database that I need access to.

Possible Solution

No response

Context

No response

Environment

Docker version 24.0.5, build ced0996
Linux/Ubuntu

azhavoro commented 1 month ago

Please attach the full log from the cvat_server container: `docker logs cvat_server`

Shadowfear36 commented 1 month ago

`docker logs cvat_server
wait-for-it.sh: waiting for 192.111.0.11:5432 without a timeout
wait-for-it.sh: 192.111.0.11:5432 is available after 0 seconds
System check identified some issues:

WARNINGS:
?: (urls.W002) Your URL pattern '/' has a route beginning with a '/'. Remove this slash as it is unnecessary. If this pattern is targeted in an include(), ensure the include() pattern has a trailing '/'.
Operations to perform:
Apply all migrations: account, admin, analytics_report, auth, authtoken, contenttypes, dataset_repo, db, django_rq, engine, organizations, quality_control, sessions, sites, socialaccount, webhooks
Running migrations:
No migrations to apply.

163 static files copied to '/home/django/static'.
wait-for-it.sh: waiting for 192.111.0.11:5432 without a timeout
wait-for-it.sh: 192.111.0.11:5432 is available after 0 seconds
waiting for migrations to complete...
System check identified some issues:

WARNINGS:
?: (urls.W002) Your URL pattern '/' has a route beginning with a '/'. Remove this slash as it is unnecessary. If this pattern is targeted in an include(), ensure the include() pattern has a trailing '/'.
2024-09-17 21:35:50,531 INFO Creating socket unix:///tmp/uvicorn.sock
2024-09-17 21:35:50,531 INFO Closing socket unix:///tmp/uvicorn.sock
2024-09-17 21:35:50,532 INFO RPC interface 'supervisor' initialized
2024-09-17 21:35:50,532 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2024-09-17 21:35:50,532 INFO supervisord started with pid 1
2024-09-17 21:35:51,535 INFO spawned: 'clamav_update' with pid 221
2024-09-17 21:35:51,537 INFO spawned: 'nginx-0' with pid 222
2024-09-17 21:35:51,538 INFO spawned: 'smokescreen' with pid 223
2024-09-17 21:35:51,538 INFO Creating socket unix:///tmp/uvicorn.sock
2024-09-17 21:35:51,539 DEBG fd 20 closed, stopped monitoring <PInputDispatcher at 140408520553904 for <Subprocess at 140408520542528 with name uvicorn-0 in state STARTING> (stdin)>
2024-09-17 21:35:51,540 INFO spawned: 'uvicorn-0' with pid 224
2024-09-17 21:35:51,541 DEBG fd 24 closed, stopped monitoring <PInputDispatcher at 140408520554480 for <Subprocess at 140408520542384 with name uvicorn-1 in state STARTING> (stdin)>
2024-09-17 21:35:51,541 INFO spawned: 'uvicorn-1' with pid 225
2024-09-17 21:35:51,542 DEBG fd 7 closed, stopped monitoring <POutputDispatcher at 140408520094480 for <Subprocess at 140408520094384 with name clamav_update in state STARTING> (stdout)>
2024-09-17 21:35:51,542 DEBG fd 9 closed, stopped monitoring <POutputDispatcher at 140408520552320 for <Subprocess at 140408520094384 with name clamav_update in state STARTING> (stderr)>
2024-09-17 21:35:51,542 INFO exited: clamav_update (exit status 0; expected)
2024-09-17 21:35:51,542 DEBG received SIGCHLD indicating a child quit
2024-09-17 21:35:51,542 DEBG 'nginx-0' stderr output: nginx: [alert] could not open error log file: open() "/var/log/nginx/error.log" failed (13: Permission denied)

2024-09-17 21:35:51,546 DEBG 'smokescreen' stderr output: {"level":"info","msg":"starting","time":"2024-09-17T21:35:51Z"}

2024-09-17 21:35:51,549 DEBG 'uvicorn-0' stderr output: wait-for-it.sh: waiting for 192.111.0.11:5432 without a timeout

2024-09-17 21:35:51,550 DEBG fd 14 closed, stopped monitoring <POutputDispatcher at 140408520552560 for <Subprocess at 140408520541808 with name nginx-0 in state STARTING> (stderr)>
2024-09-17 21:35:51,552 DEBG 'uvicorn-0' stderr output: wait-for-it.sh: 192.111.0.11:5432 is available after 0 seconds

2024-09-17 21:35:51,553 DEBG 'uvicorn-1' stderr output: wait-for-it.sh: waiting for 192.111.0.11:5432 without a timeout

2024-09-17 21:35:51,556 DEBG 'uvicorn-1' stderr output: wait-for-it.sh: 192.111.0.11:5432 is available after 0 seconds

2024-09-17 21:35:51,558 DEBG 'uvicorn-0' stderr output: wait-for-it.sh: waiting for cvat_redis_inmem:6378 without a timeout

2024-09-17 21:35:51,560 DEBG 'uvicorn-1' stderr output: wait-for-it.sh: waiting for cvat_redis_inmem:6378 without a timeout

2024-09-17 21:35:52,562 INFO success: nginx-0 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-09-17 21:35:52,562 INFO success: smokescreen entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-09-17 21:35:52,562 INFO success: uvicorn-0 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-09-17 21:35:52,562 INFO success: uvicorn-1 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)`

Shadowfear36 commented 1 month ago

OK, I managed to fix this issue. OPA was failing because it was requesting a route that doesn't exist on cvat_server:

http://cvat-server:8080/api/auth/rules

The correct URL path in my setup is:

http://cvat-server:8080/cvat/api/auth/rules

In docker-compose.yml, under cvat_opa, change this:

 - --set=bundles.cvat.resource=/api/auth/rules

to this:

 - --set=bundles.cvat.resource=/cvat/api/auth/rules
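For anyone adapting this to a different setup, the sketch below shows how the flag and the URL that OPA ends up requesting relate to each other. It assumes the server is reachable under an optional path prefix (`cvat` in this thread); the helper names `bundle_resource`, `opa_flag`, and `bundle_url` are made up for illustration and are not part of CVAT or OPA.

```python
RULES_PATH = "/api/auth/rules"  # path taken from the OPA log and flag in this thread


def bundle_resource(prefix=""):
    """Return the resource path for OPA, with an optional URL prefix (assumption)."""
    cleaned = prefix.strip("/")
    return f"/{cleaned}{RULES_PATH}" if cleaned else RULES_PATH


def opa_flag(prefix=""):
    """Build the --set flag as it appears under cvat_opa in docker-compose.yml."""
    return f"--set=bundles.cvat.resource={bundle_resource(prefix)}"


def bundle_url(service_url="http://cvat-server:8080", prefix=""):
    """Full URL OPA would request for the bundle, useful for a manual check."""
    return service_url.rstrip("/") + bundle_resource(prefix)


if __name__ == "__main__":
    print(opa_flag("cvat"))
    print(bundle_url(prefix="cvat"))
```

Requesting the printed URL from inside the Docker network is a quick way to confirm which path the server actually answers on before editing the compose file.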