Unable to Import COCO, PASCAL, YOLO annotations - Error 500 [Urgent Support Requested]

GitarthVaishnav commented 1 year ago

My actions before raising this issue

[x] Read/searched the docs
[x] Searched past issues

When I try to import annotations for a small dataset, both CVAT online and the locally hosted version return error 500, and the console shows:

datumaro.components.errors.datasetnotfounderror: failed to find dataset at '/home/django/data/projects/1/tmp/tmptsnzb3r6'\n"..

Steps to Reproduce (for bugs)

Create a Project "XYZ" with 5 labels.
Create a Task and upload images directly (no archive) with local storage option and 100% image quality.
Open the job and upload annotations (an archive containing only annotation files, or a JSON file). I have COCO JSON, YOLO, and PASCAL VOC XML annotations that need to be verified and updated.
- The YOLO data has this layout: images | 1.jpg | 2.jpg ... labels | 1.txt | 2.txt ... data.yaml data.names obj.data train.txt
- The COCO JSON data has the same layout, except that the images folder contains a file named _annotations.json, and there is no labels folder, data.yaml, or any of the other files.
- The PASCAL VOC XML data has the same layout, except that the images folder contains 1.xml, 2.xml, etc., and there is no labels folder, data.yaml, or any of the other files.
CVAT accepts the annotations in every format, then after some time throws the same error and asks me to check the console.
This happens with many datasets, although it sometimes works for 5-7 images.
The annotations are not flawed: models train on them, and they load correctly in Roboflow, makesense, and labelImg. I want to use CVAT, as always, for better processing.

Expected Behaviour

When I used CVAT a couple of months ago with the same dataset, it accepted an archive of the old YOLO-format labels folder together with data.yaml. This no longer works.
The expected behaviour is that, given a correct dataset in any of these formats, CVAT accepts and displays the annotations.

Current Behaviour

error 500

datumaro.components.errors.datasetnotfounderror: failed to find dataset at '/home/django/data/projects/1/tmp/tmptsnzb3r6'\n"..

Possible Solution

No Idea!

Context

Annotating data for a project, need urgent support.

Your Environment

Git hash commit (git log -1):
commit 2709802b7fd365554188a2b7c7c75bbd42a6e55b (HEAD -> develop, origin/develop, origin/HEAD)
Author: Maxim Zhiltsov zhiltsov.max35@gmail.com
Date: Wed Apr 5 01:59:25 2023 +0300
Fix cloud storage permissions (#5956)

Docker version docker version (e.g. Docker 17.0.05):

Client:
Cloud integration: v1.0.31
Version:           20.10.23
API version:       1.41
Go version:        go1.18.10
Git commit:        7155243
Built:             Thu Jan 19 17:35:19 2023
OS/Arch:           darwin/amd64
Context:           default
Experimental:      true
Server: Docker Desktop 4.17.0 (99724)
Engine:
Version:          20.10.23
API version:      1.41 (minimum version 1.12)
Go version:       go1.18.10
Git commit:       6051f14
Built:            Thu Jan 19 17:32:04 2023
OS/Arch:          linux/amd64
Experimental:     false
containerd:
Version:          1.6.18
GitCommit:        2456e983eb9e37e47538f59ea18f2043c9a73640
runc:
Version:          1.1.4
GitCommit:        v1.1.4-0-g5fd4c4d
docker-init:
Version:          0.19.0
GitCommit:        de40ad0

Are you using Docker Swarm or Kubernetes? Docker
Operating System and version (e.g. Linux, Windows, MacOS): MacOS 13.3 - 2.6 GHz 6-Core Intel Core i7
Code example or link to GitHub repo or gist to reproduce problem: NA
Other diagnostic information / logs:

```
[2023-04-05 05:40:33,643] WARNING django.request: Unauthorized: /api/events
[2023-04-05 05:40:54,471] WARNING django.request: Unauthorized: /api/auth/password/change
[2023-04-05 05:40:55,150] WARNING django.request: Not Found: /api/user-agreements
[2023-04-05 05:41:16,658] WARNING django.request: Not Found: /api/functions/requests/
[2023-04-05 05:43:36,643] INFO cvat.server.project_1: label:create Label id:1 for spec:OrderedDict([('name', 'Ready'), ('type', 'any'), ('color', '#689981')]) with sublabels:[], parent_label:None
[2023-04-05 05:43:36,654] INFO cvat.server.project_1: label:create Label id:2 for spec:OrderedDict([('name', 'empty_pod'), ('type', 'any'), ('color', '#a38aca')]) with sublabels:[], parent_label:None
[2023-04-05 05:43:36,666] INFO cvat.server.project_1: label:create Label id:3 for spec:OrderedDict([('name', 'germination'), ('type', 'any'), ('color', '#dff59f')]) with sublabels:[], parent_label:None
[2023-04-05 05:43:36,678] INFO cvat.server.project_1: label:create Label id:4 for spec:OrderedDict([('name', 'pod'), ('type', 'any'), ('color', '#4a8cf4')]) with sublabels:[], parent_label:None
[2023-04-05 05:43:36,690] INFO cvat.server.project_1: label:create Label id:5 for spec:OrderedDict([('name', 'young'), ('type', 'any'), ('color', '#7ea09e')]) with sublabels:[], parent_label:None
[2023-04-05 05:43:37,276] WARNING django.request: Not Found: /api/functions/requests/
[2023-04-05 05:44:48,373] INFO cvat.server: create task #1
[2023-04-05 05:46:17,244] INFO cvat.server: Found frames 1510 for Data #1
[2023-04-05 05:46:17,245] INFO cvat.server: New segment for task #1: idx = 0, start_frame = 0, stop_frame = 1509
[2023-04-05 05:46:18,390] INFO cvat.server.task_1: get repository request
[2023-04-05 05:46:28,773] WARNING django.request: Not Found: /api/functions/requests/
[2023-04-05 05:46:31,671] INFO cvat.server.task_1: get repository request
[2023-04-05 05:47:26,321] ERROR django.request: Internal Server Error: /api/jobs/1/annotations
[2023-04-05 05:47:51,966] WARNING django.request: Not Found: /api/user-agreements
```

zhiltsov-max commented 1 year ago

Hi, please make sure the uploaded annotations and datasets use the file layouts described here.

GitarthVaishnav commented 1 year ago

It does work with the exact, word-for-word file structure and naming. Does it require the exact naming convention (obj_train_data, and so on)?

Isn't there a generalised file structure?

zhiltsov-max commented 1 year ago

Image file names can be arbitrary, provided they are listed in the corresponding subset file (train.txt, valid.txt). Everything else should match the description. This file layout was borrowed from the YOLO training framework, so it should be ready to use in practical cases. Could you please specify which elements of the file structure you would like to be optional?
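
For readers landing here with a plain folder of YOLO label files, the following is a minimal sketch of repacking them into an archive with a subset list file and an obj_train_data folder, as mentioned in this thread. It is not taken from CVAT itself. The function name `build_yolo_archive`, the obj.names file name, the `data/` path prefix in train.txt, the keys written to obj.data, and the default `.jpg` extension are all assumptions for illustration and should be checked against the format description.

```python
import zipfile
from pathlib import Path


def build_yolo_archive(labels_dir, out_zip, class_names, subset="train", image_ext=".jpg"):
    """Pack YOLO .txt label files into a zip with obj.data, obj.names,
    a <subset>.txt image list and an obj_<subset>_data folder.

    Hypothetical helper: the exact file names and path prefixes are
    assumptions and must match the layout the importer documents.
    """
    labels_dir = Path(labels_dir)
    label_files = sorted(labels_dir.glob("*.txt"))
    data_dir = f"obj_{subset}_data"

    with zipfile.ZipFile(out_zip, "w") as zf:
        # One class name per line, in class-id order.
        zf.writestr("obj.names", "\n".join(class_names) + "\n")

        # Small key = value config pointing at the other files (assumed keys).
        zf.writestr(
            "obj.data",
            f"classes = {len(class_names)}\n"
            f"names = obj.names\n"
            f"{subset} = {subset}.txt\n",
        )

        # Subset file: one image path per label file (assumed "data/" prefix).
        listing = [f"data/{data_dir}/{p.stem}{image_ext}" for p in label_files]
        zf.writestr(f"{subset}.txt", "\n".join(listing) + "\n")

        # The label files themselves go into obj_<subset>_data/.
        for p in label_files:
            zf.write(p, f"{data_dir}/{p.name}")

    return [p.name for p in label_files]
```

A call such as `build_yolo_archive("labels", "annotations.zip", ["Ready", "empty_pod", "germination", "pod", "young"])` would write the archive next to the script; the class order has to match the class ids used in the label files.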

elhabchiali commented 1 year ago

If anyone has the same problem: I was able to solve it for the YOLO case by including the images alongside the annotation files.
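
Before uploading, it can help to confirm that every label file has an image next to it and that each label line is well formed. The sketch below is a standalone check, not part of CVAT or Datumaro. The function name `find_yolo_problems` and the list of image extensions are assumptions, and it assumes a flat folder containing only images and per-image label files (no train.txt or similar). It relies on the standard YOLO label line: a class id followed by four normalized values (x center, y center, width, height).

```python
from pathlib import Path

# Assumed set of image extensions; extend as needed.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_yolo_problems(folder, num_classes):
    """Return a list of human-readable problems found in a flat folder
    of images and YOLO .txt label files (hypothetical helper)."""
    folder = Path(folder)
    image_stems = {p.stem for p in folder.iterdir() if p.suffix.lower() in IMAGE_EXTS}
    problems = []

    for label in sorted(folder.glob("*.txt")):
        # Each label file should sit next to an image with the same base name.
        if label.stem not in image_stems:
            problems.append(f"{label.name}: no image with the same base name")

        for n, line in enumerate(label.read_text().splitlines(), start=1):
            if not line.strip():
                continue  # ignore blank lines
            parts = line.split()
            if len(parts) != 5:
                problems.append(f"{label.name}:{n}: expected 5 fields, got {len(parts)}")
                continue
            try:
                cls = int(parts[0])
                coords = [float(v) for v in parts[1:]]
            except ValueError:
                problems.append(f"{label.name}:{n}: non-numeric field")
                continue
            if not 0 <= cls < num_classes:
                problems.append(f"{label.name}:{n}: class id {cls} out of range")
            if any(not 0.0 <= v <= 1.0 for v in coords):
                problems.append(f"{label.name}:{n}: coordinates not normalized to [0, 1]")

    return problems
```

An empty list means the check found nothing to report; it does not guarantee that the import will succeed.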

rafaelgildin commented 8 months ago

@elhabchiali did you export the images in YOLO format from Roboflow? If not, any advice on how to export a Roboflow dataset to CVAT? Thanks.
