Closed UffeLauge closed 8 months ago
It does not look like your model was deployed in a healthy state. I would suggest checking the docker container and its logs.
Okay, I have no errors in the yolov8 docker logs. However I get the following errors in the cvat docker logs, do you have suggestions on what to change?
2024-02-21 10:38:52 traefik | {"level":"error","msg":"no valid entryPoint for this router","routerName":"grafana_https@file","time":"2024-02-21T09:38:52Z"}
2024-02-21 10:38:52 traefik | {"entryPointName":"websecure","level":"error","msg":"entryPoint \"websecure\" doesn't exist","routerName":"grafana_https@file","time":"2024-02-21T09:38:52Z"}
2024-02-21 10:38:52 traefik | {"level":"error","msg":"no valid entryPoint for this router","routerName":"grafana_https@file","time":"2024-02-21T09:38:52Z"}
2024-02-21 10:38:54 traefik | {"entryPointName":"websecure","level":"error","msg":"entryPoint \"websecure\" doesn't exist","routerName":"grafana_https@file","time":"2024-02-21T09:38:54Z"}
2024-02-21 10:38:54 traefik | {"level":"error","msg":"no valid entryPoint for this router","routerName":"grafana_https@file","time":"2024-02-21T09:38:54Z"}
2024-02-21 10:38:54 traefik | {"entryPointName":"websecure","level":"error","msg":"entryPoint \"websecure\" doesn't exist","routerName":"grafana_https@file","time":"2024-02-21T09:38:54Z"}
2024-02-21 10:38:54 traefik | {"level":"error","msg":"no valid entryPoint for this router","routerName":"grafana_https@file","time":"2024-02-21T09:38:54Z"}
2024-02-21 10:38:56 traefik | {"entryPointName":"websecure","level":"error","msg":"entryPoint \"websecure\" doesn't exist","routerName":"grafana_https@file","time":"2024-02-21T09:38:56Z"}
2024-02-21 10:38:56 traefik | {"level":"error","msg":"no valid entryPoint for this router","routerName":"grafana_https@file","time":"2024-02-21T09:38:56Z"}
2024-02-21 10:38:56 traefik | {"entryPointName":"websecure","level":"error","msg":"entryPoint \"websecure\" doesn't exist","routerName":"grafana_https@file","time":"2024-02-21T09:38:56Z"}
2024-02-21 10:38:52 cvat_vector | 2024-02-21T09:38:52.819587Z WARN http: vector::internal_events::http_client: HTTP error. error=error trying to connect: tcp connect error: Connection refused (os error 111) error_type="request_failed" stage="processing" internal_log_rate_limit=true
2024-02-21 10:38:52 cvat_vector | 2024-02-21T09:38:52.819739Z ERROR vector::topology::builder: msg="Healthcheck: Failed Reason." error=Failed to make HTTP(S) request: error trying to connect: tcp connect error: Connection refused (os error 111) component_kind="sink" component_type="clickhouse" component_id=clickhouse component_name=clickho
2024-02-21 10:39:08 cvat_server | nginx: [alert] could not open error log file: open() "/var/log/nginx/error.log" failed (13: Permission denied)
2024-02-21 10:39:08 cvat_server |
2024-02-21 10:39:08 cvat_server | 2024-02-21 09:39:08,754 DEBG 'smokescreen' stderr output:
2024-02-21 10:39:08 cvat_server | {"level":"info","msg":"starting","time":"2024-02-21T09:39:08Z"}
2024-02-21 10:39:08 cvat_server |
2024-02-21 10:39:08 cvat_server | 2024-02-21 09:39:08,757 DEBG 'uvicorn-0' stderr output:
2024-02-21 10:39:08 cvat_server | wait-for-it.sh: waiting for cvat_db:5432 without a timeout
1:8080: connect: connection refused","name":"cvat","plugin":"bundle","time":"2024-02-21T09:38:55Z"}
2024-02-21 10:38:55 cvat_opa | {"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp 172.23.0.11:8080: connect: connection refused","name":"cvat","plugin":"bundle","time":"2024-02-21T09:38:55Z"}
2024-02-21 10:38:55 cvat_opa | {"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp 172.23.0.11:8080: connect: connection refused","name":"cvat","plugin":"bundle","time":"2024-02-21T09:38:55Z"}
2024-02-21 10:38:55 cvat_opa | {"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp 172.23.0.11:8080: connect: connection refused","name":"cvat","plugin":"bundle","time":"2024-02-21T09:38:55Z"}
2024-02-21 10:38:55 cvat_opa | {"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp 172.23.0.11:8080: connect: connection refused","name":"cvat","plugin":"bundle","time":"2024-02-21T09:38:55Z"}
2024-02-21 10:49:08 cvat_grafana | logger=cleanup t=2024-02-21T09:49:08.361818152Z level=info msg="Completed cleanup jobs" duration=15.293741ms
2024-02-21 10:49:08 cvat_grafana | logger=sqlstore.transactions t=2024-02-21T09:49:08.369970581Z level=info msg="Database locked, sleeping then retrying" error="database is locked" retry=1 code="database is locked"
2024-02-21 10:49:08 cvat_grafana | logger=grafana.update.checker t=2024-02-21T09:49:08.436252928Z level=info msg="Update check succeeded" duration=37.765296ms
2024-02-21 10:49:08 cvat_grafana | logger=plugins.update.checker t=2024-02-21T09:49:08.520970967Z level=info msg="Update check succeeded" duration=81.64539ms
2024-02-21 10:54:06 cvat_grafana | logger=sqlstore.transactions t=2024-02-21T09:54:06.399300934Z level=info msg="Database locked, sleeping then retrying" error="database is locked" retry=0 code="database is locked"
You should start from checking this error:
nginx: [alert] could not open error log file: open() "/var/log/nginx/error.log" failed (13: Permission denied)
Go to cvat container and check permissions for that file.
You may login as root inside the container: docker exec -u root -it cvat_server bash
But I am not sure that is the reason.
For some reason, cvat_opa can't get the bundle from the cvat_server container. However, I do not see anything else suspicious in the provided cvat_server logs.
2024-02-21 10:38:55 cvat_opa | {"level":"error","msg":"Bundle load failed: request failed: Get \"http://cvat-server:8080/api/auth/rules\": dial tcp 172.23.0.11:8080: connect: connection refused","name":"cvat","plugin":"bundle","time":"2024-02-21T09:38:55Z"}
"Connection refused" usually means that the server is not listening on the specified port.
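One way to confirm this is to try opening a TCP connection to the address from the log line. The sketch below is only an illustration: the helper name `port_is_open` is made up here, and the host `cvat-server` and port 8080 are copied from the OPA error, so they resolve only from inside the CVAT docker network.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name resolution failures.
        return False


# Example call (assumption: run from a container on the CVAT network):
#   port_is_open("cvat-server", 8080)
```

If this returns False while the cvat_server container is running, the server process inside it has most likely not started listening yet, which would match the `wait-for-it.sh: waiting for cvat_db:5432` line in the logs.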
And finally I do not see any logs corresponding to this error:
But with code 500 there should be some exceptions in the docker logs.
I have investigated the permissions, with the following result:
Sadly, I have not been able to find any exceptions in the logs.
Try to remove this file maybe, it should be re-created with correct permissions
I tried deleting the file and restarting the CVAT Dockers, it just recreated the file with the same permissions.
And you still do not see any errors in docker logs cvat_server?
www-data and adm don't correspond to the django user and django group we use by default.
So, I may conclude you are using a modified version of CVAT.
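The ownership comparison above can also be checked from a Python shell inside the container. This is a minimal sketch under assumptions: the helper names `describe_owner` and `owned_by` are invented for illustration, and the path and the expected `django` user and group are taken from this thread.

```python
import grp
import os
import pwd
import stat


def describe_owner(path: str) -> dict:
    """Return the owning user, owning group and mode string of a file."""
    info = os.stat(path)
    try:
        user = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        user = str(info.st_uid)  # uid has no passwd entry
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group = str(info.st_gid)  # gid has no group entry
    return {"user": user, "group": group, "mode": stat.filemode(info.st_mode)}


def owned_by(path: str, expected_user: str, expected_group: str) -> bool:
    """Check whether a file belongs to the expected user and group."""
    owner = describe_owner(path)
    return owner["user"] == expected_user and owner["group"] == expected_group


# Example call (assumption: run inside the cvat_server container):
#   owned_by("/var/log/nginx/error.log", "django", "django")
```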
That sounds weird to me. All I have done is follow your installation guide at https://opencv.github.io/cvat/docs/administration/basics/installation/ and then this guide: https://opencv.github.io/cvat/docs/administration/advanced/installation_automatic_annotation/
I still get the same errors as in my initial comment. Removing the /var/log/nginx/error.log didn't seem to have any effect.
Okay, never mind.
To suggest something, I need the full log from cvat_server, not just a short fragment.
Not sure this will help. For me, I just ran chmod 777 on that file (bad practice). After that I could see the error better, and it was related to my serverless container. After fixing the serverless code, there was no issue anymore.
This is just my assumption: the new error log probably got a different username because the server process that hit the 500 error runs as www-data (maybe).
@bsekachev You marked this thread as completed. @UffeLauge Did you manage to solve the issue?
I am trying to use a custom yolov8n-seg.pt model for auto-annotation in CVAT. I installed nuclio and the serverless functions for CVAT. Segment Anything (SAM) already works fine and I can use it in CVAT. My custom yolov8n-seg.pt is recognized by CVAT and I can select it for auto-annotation. When I click on task -> auto annotation, the auto-annotation process starts, but I get an error message. The model runs and detects something during inference, but somehow the output format does not work.
Error message (docker logs -f 1dcb9483a72a):
0: 2112x2496 39 Potatos, 1427.3ms Speed: 24.6ms preprocess, 1427.3ms inference, 931.1ms postprocess per image at shape (1, 3, 2112, 2496) 24.05.21 11:49:40.483 (E) sor.http.w0.python.logger Exception caught in handler {"worker_id": "0", "exc": "'tuple' object has no attribute 'xyxy'", "traceback": "Traceback (most recent call last):\n File \"/opt/nuclio/_nuclio_wrapper.py\", line 151, in serve_requests\n await self._handle_event(event)\n File \"/opt/nuclio/_nuclio_wrapper.py\", line 439, in _handle_event\n entrypoint_output = self._entrypoint(self._context, event)\n File \"/opt/nuclio/main.py\", line 50, in handler\n xyxy = detection.xyxy\nAttributeError: 'tuple' object has no attribute 'xyxy'\n"}
This is my "main.py" script:
`
import json
import base64
from PIL import Image
import io

import numpy as np
from ultralytics import YOLO
import supervision as sv
from skimage.measure import approximate_polygon, find_contours


def to_cvat_mask(box: list, mask):
    xtl, ytl, xbr, ybr = box
    flattened = mask[ytl:ybr + 1, xtl:xbr + 1].flat[:].tolist()
    flattened.extend([xtl, ytl, xbr, ybr])
    return flattened


def init_context(context):
    context.logger.info("Init context... 0%")
    model_path = "yolov8n-seg.pt"
    model = YOLO(model_path, task="segment")
    # Read the DL model
    context.user_data.model = model
    context.logger.info("Init context...100%")


def handler(context, event):
    context.logger.info("Run yolo-v8 model")
    data = event.body
    buf = io.BytesIO(base64.b64decode(data["image"]))
    threshold = float(data.get("threshold", 0.5))
    context.user_data.model.conf = threshold
    image = Image.open(buf)
    yolo_results = context.user_data.model(image, conf=threshold)[0]
    labels = yolo_results.names
    detections: sv.Detections = sv.Detections.from_ultralytics(yolo_results)
    detections = detections[detections.confidence > threshold]
    results = []
    if len(detections) > 0:
        for detection in detections:
            xyxy = detection.xyxy
            mask = detection.mask
            confidence = detection.confidence
            class_id = detection.class_ide
            mask = mask.astype(np.uint8)
            xtl = int(xyxy[0])
            ytl = int(xyxy[1])
            xbr = int(xyxy[2])
            ybr = int(xyxy[3])
            label = int(class_id)
            cvat_mask = to_cvat_mask((xtl, ytl, xbr, ybr), mask)
            contours = find_contours(mask, 0.5)
            contour = contours[0]
            contour = np.flip(contour, axis=1)
            polygons = approximate_polygon(contour, tolerance=2.5)
            results.append({
                "confidence": str(confidence),
                "label": labels.get(class_id, "unknown"),
                "type": "mask",
                "points": polygons.ravel().tolist(),
                "mask": cvat_mask,
            })
    return context.Response(body=json.dumps(results), headers={},
                            content_type='application/json', status_code=200)
`
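For readers unfamiliar with the list that `to_cvat_mask` builds, here is a small worked example. It is a pure-Python restatement using nested lists instead of NumPy arrays (the name `to_cvat_mask_lists` is made up for this sketch), so it only illustrates the layout: the mask values inside the box, row by row, followed by the four box coordinates.

```python
def to_cvat_mask_lists(box, mask):
    """Same layout as to_cvat_mask in the script, but for nested lists."""
    xtl, ytl, xbr, ybr = box
    # Crop the mask to the box (both ends inclusive) and flatten it row by row.
    flattened = [value
                 for row in mask[ytl:ybr + 1]
                 for value in row[xtl:xbr + 1]]
    # Append the box itself so the crop can be placed back in the image.
    flattened.extend([xtl, ytl, xbr, ybr])
    return flattened


mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
encoded = to_cvat_mask_lists((1, 1, 2, 2), mask)
print(encoded)  # [1, 1, 0, 1, 1, 1, 2, 2]
```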
Solved it myself. The issue was that I did not handle the Detections object correctly. I was treating detection as if it has the xyxy attribute directly, but detection is actually a tuple.
This is the corrected handler function:
def handler(context, event):
    context.logger.info("Run yolo-v8 model")
    data = event.body
    buf = io.BytesIO(base64.b64decode(data["image"]))
    threshold = float(data.get("threshold", 0.5))
    context.user_data.model.conf = threshold
    image = Image.open(buf)
    yolo_results = context.user_data.model(image, conf=threshold)[0]
    labels = yolo_results.names
    detections = sv.Detections.from_ultralytics(yolo_results)
    detections = detections[detections.confidence > threshold]
    results = []
    if len(detections) > 0:
        for i in range(len(detections)):
            xyxy = detections.xyxy[i]
            mask = detections.mask[i]
            confidence = detections.confidence[i]
            class_id = detections.class_id[i]
            mask = mask.astype(np.uint8)
            xtl = int(xyxy[0])
            ytl = int(xyxy[1])
            xbr = int(xyxy[2])
            ybr = int(xyxy[3])
            label = int(class_id)
            cvat_mask = to_cvat_mask((xtl, ytl, xbr, ybr), mask)
            contours = find_contours(mask, 0.5)
            contour = contours[0]
            contour = np.flip(contour, axis=1)
            polygons = approximate_polygon(contour, tolerance=2.5)
            results.append({
                "confidence": str(confidence),
                "label": labels.get(class_id, "unknown"),
                "type": "mask",
                "points": polygons.ravel().tolist(),
                "mask": cvat_mask,
            })
    return context.Response(body=json.dumps(results), headers={},
                            content_type='application/json', status_code=200)
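The difference between the failing loop and the working one can be reproduced without supervision installed. The sketch below uses a hypothetical stand-in class, `MiniDetections`, that is not the real `sv.Detections`: it only mimics the behavior described above, where iterating yields plain tuples. The real class yields longer tuples (including the mask and other fields), and their exact layout may depend on the supervision version.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class MiniDetections:
    """Hypothetical stand-in for sv.Detections, for illustration only."""
    xyxy: List[List[float]]
    confidence: List[float]
    class_id: List[int]

    def __len__(self) -> int:
        return len(self.xyxy)

    def __iter__(self) -> Iterator[Tuple[List[float], float, int]]:
        # Iteration yields one plain tuple per detection, not an object
        # with .xyxy / .confidence / .class_id attributes.
        for i in range(len(self)):
            yield self.xyxy[i], self.confidence[i], self.class_id[i]


detections = MiniDetections(
    xyxy=[[10.0, 20.0, 30.0, 40.0], [5.0, 5.0, 15.0, 25.0]],
    confidence=[0.9, 0.6],
    class_id=[0, 1],
)

# The failing pattern: each item is a tuple, so item.xyxy does not exist.
first = next(iter(detections))
has_xyxy = hasattr(first, "xyxy")

# Working pattern 1: index the per-field sequences, as in the fixed handler.
by_index = [(detections.xyxy[i], detections.class_id[i])
            for i in range(len(detections))]

# Working pattern 2: unpack the tuple produced by iteration.
by_unpack = [(xyxy, class_id) for xyxy, _confidence, class_id in detections]
```

Indexing by position, as the fixed handler does, has the advantage of not depending on how many fields the iteration tuple contains.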
I created a repository: https://github.com/felixkarevo/CVAT-custom-yolov8-segmentation-auto-annotation
Context
I am trying to use my custom-trained model for auto-annotation in CVAT Serverless. I have placed my model files in cvat/serverless/pytorch/ultralytics/yolov8. In this folder I have placed function.yaml, function_cpu.yaml, main.py and best.pt (the YOLOv8 weight file).
main.py:
function.yaml: