cvat-ai / cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
https://cvat.ai
MIT License

Modifying cvat_worker_export not reflecting in dataset export #8364

Closed ilyasofficial1617 closed 2 months ago

ilyasofficial1617 commented 2 months ago

Actions before raising this issue

Steps to Reproduce

  1. Made the following test change to confirm that writes to the export cache directory show up in the exported archive, before implementing the actual split by rearranging the files and folders:

    # Test change inside the export code: write a marker file into the
    # export cache directory (cache_dir is provided by the surrounding code)
    test_name = 'test.txt'
    test_content = "asdfasdffsadfsad"
    test_path = os.path.join(cache_dir, test_name)
    with open(test_path, 'w') as file:
        file.write(test_content)

    (Screenshots attached: 2024-08-28 11-10-27 and 2024-08-28 11-11-08.)

  2. Deployed the changes using docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build

Expected Behavior

The downloaded ZIP file should contain test.txt to confirm the changes are being applied.
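
For completeness, a quick way to check this, with export.zip standing in for whatever name the download is saved under:

    # Check whether test.txt made it into the exported archive
    # ("export.zip" is a placeholder for the downloaded file's path)
    import zipfile

    with zipfile.ZipFile("export.zip") as archive:
        found = any(name.endswith("test.txt") for name in archive.namelist())
        print("test.txt present:", found)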

Possible Solution

I suspect the problem might be related to an improper Docker deployment command or cached files not being properly updated. Here’s what I've done so far:

  1. Rebuilt all Docker images and ensured the changes were deployed using the provided command.
  2. Cleared out all existing Docker containers, images, and volumes to start from a clean state.

However, the modifications still do not appear in the exported dataset. I would appreciate any advice on how to troubleshoot this, or on any deployment steps I might be missing.
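
A related check that might help narrow this down: confirm the modified source actually ends up inside the running export worker, for example with docker compose exec cvat_worker_export grep -rn "test.txt" /home/django/cvat/apps/dataset_manager/ (the service name and in-container path are assumptions based on the default compose files and the /home/django paths visible in the worker log below).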

Context

Hi, I am attempting to modify the code run by the cvat_worker_export worker so that exported data is automatically split 80/10/10 into training/test/validation subsets. However, changes to the code do not seem to affect the output of the exported dataset.
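
For reference, the kind of split I have in mind looks roughly like the sketch below; the helper name and the idea of shuffling a flat list of items are my own illustration, not CVAT's internal API:

    # Hypothetical helper illustrating an 80/10/10 train/test/val split;
    # a standalone sketch, not CVAT's actual export code.
    import random

    def split_80_10_10(items, seed=42):
        items = list(items)
        random.Random(seed).shuffle(items)
        n_train = int(len(items) * 0.8)
        n_test = int(len(items) * 0.1)
        train = items[:n_train]
        test = items[n_train:n_train + n_test]
        val = items[n_train + n_test:]
        return train, test, val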

Environment

OS: Ubuntu 22.04
GPU: GeForce RTX 3060
Branch: develop
Docker Version: 24.0.7
Docker Logs (cvat_worker_export):

2024-08-28 04:29:04,489 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:29:04,489] DEBUG rq.worker: Sent heartbeat to prevent worker timeout. Next one should arrive in 480 seconds.

2024-08-28 04:29:04,489 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:29:04,489] DEBUG rq.worker: Dequeueing jobs on queues export and timeout 405

2024-08-28 04:29:04,690 DEBG 'rqworker-export-0' stderr output:
[2024-08-28 04:29:04,690] DEBUG rq.worker: Sent heartbeat to prevent worker timeout. Next one should arrive in 480 seconds.

2024-08-28 04:29:04,690 DEBG 'rqworker-export-0' stderr output:
[2024-08-28 04:29:04,690] DEBUG rq.worker: Dequeueing jobs on queues export and timeout 405

2024-08-28 04:33:02,111 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,111] DEBUG rq.worker: Dequeued job export:task-1-dataset-in-YOLOv8_Segmentation_1@0-format-by-1 from export

2024-08-28 04:33:02,111 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,111] INFO rq.worker: export: cvat.apps.dataset_manager.views.export_task_as_dataset(1, 'YOLOv8 Segmentation 1.0', 'http://localhost:8080') (export:task-1-dataset-in-YOLOv8_Segmentation_1@0-format-by-1)

2024-08-28 04:33:02,112 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,111] DEBUG rq.worker: Sent heartbeat to prevent worker timeout. Next one should arrive in 480 seconds.

2024-08-28 04:33:02,117 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,117] DEBUG rq.worker: Started Job Registry set.

2024-08-28 04:33:02,117 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,117] DEBUG rq.worker: Preparing for execution of Job ID export:task-1-dataset-in-YOLOv8_Segmentation_1@0-format-by-1

2024-08-28 04:33:02,118 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,118] DEBUG rq.worker: Sent heartbeat to prevent worker timeout. Next one should arrive in 90 seconds.

2024-08-28 04:33:02,119 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,119] DEBUG rq.worker: Job preparation finished.

2024-08-28 04:33:02,119 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,119] DEBUG rq.worker: Performing Job...

2024-08-28 04:33:02,132 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,132] DEBUG rq.worker: Finished performing Job ID export:task-1-dataset-in-YOLOv8_Segmentation_1@0-format-by-1

2024-08-28 04:33:02,132 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,132] DEBUG rq.worker: Handling successful execution of job export:task-1-dataset-in-YOLOv8_Segmentation_1@0-format-by-1

2024-08-28 04:33:02,133 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,133] DEBUG rq.worker: Saving job export:task-1-dataset-in-YOLOv8_Segmentation_1@0-format-by-1's successful execution result

2024-08-28 04:33:02,133 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,133] DEBUG rq.worker: Removing job export:task-1-dataset-in-YOLOv8_Segmentation_1@0-format-by-1 from StartedJobRegistry

2024-08-28 04:33:02,134 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,134] DEBUG rq.worker: Finished handling successful execution of job export:task-1-dataset-in-YOLOv8_Segmentation_1@0-format-by-1

2024-08-28 04:33:02,134 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,134] INFO rq.worker: export: Job OK (export:task-1-dataset-in-YOLOv8_Segmentation_1@0-format-by-1)

2024-08-28 04:33:02,134 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,134] DEBUG rq.worker: Result: '/home/django/data/tasks/1/export_cache/dataset-instance1724328745.996826-yolov8-_segmentation-10.ZIP'

2024-08-28 04:33:02,134 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,134] INFO rq.worker: Result is kept for 86400 seconds

2024-08-28 04:33:02,141 DEBG 'rqworker-export-1' stderr output:
[2024-08-28 04:33:02,141] DEBUG rq.worker: Sent heartbeat to prevent worker timeout. Next one should arrive in 480 seconds.

2024-08-28 04:33:02,141 DEBG 'rqworker-export-1
ilyasofficial1617 commented 2 months ago

@Gaurav9812 I think that's malware.

azhavoro commented 2 months ago

The command you mentioned is correct. Could you please clarify how we can help you with this issue?

In any case, I don't think this is a bug; I would suggest debugging your code before building the images: https://docs.cvat.ai/docs/contributing/development-environment/
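
(For example, once the development environment is set up, the export worker can be run and debugged outside Docker, e.g. with python manage.py rqworker export; this command is an assumption based on django-rq, which CVAT uses, with the queue name taken from the worker log above.)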

ilyasofficial1617 commented 2 months ago

Thank you for your guidance. I followed the instructions in https://docs.cvat.ai/docs/contributing/development-environment/, but I ran into an error at this step: pip install -r cvat/requirements/development.txt. Log:

Collecting datumaro@ git+https://github.com/cvat-ai/datumaro.git@125840fc6b28875cce4c85626a5c36bb9e0d2a83 (from -r cvat/requirements/base.txt (line 60))
  Cloning https://github.com/cvat-ai/datumaro.git (to revision 125840fc6b28875cce4c85626a5c36bb9e0d2a83) to /tmp/pip-install-nbnwll_r/datumaro_569af9dfb211469bb4058e328300e276
  Running command git clone --filter=blob:none --quiet https://github.com/cvat-ai/datumaro.git /tmp/pip-install-nbnwll_r/datumaro_569af9dfb211469bb4058e328300e276
  Running command git rev-parse -q --verify 'sha^125840fc6b28875cce4c85626a5c36bb9e0d2a83'
  Running command git fetch -q https://github.com/cvat-ai/datumaro.git 125840fc6b28875cce4c85626a5c36bb9e0d2a83
  Running command git checkout -q 125840fc6b28875cce4c85626a5c36bb9e0d2a83
  Resolved https://github.com/cvat-ai/datumaro.git to commit 125840fc6b28875cce4c85626a5c36bb9e0d2a83
  Running command git submodule update --init --recursive -q
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting av==9.2.0 (from -r cvat/requirements/../../utils/dataset_manifest/requirements.txt (line 8))
  Using cached av-9.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.5 kB)
Collecting natsort==8.0.0 (from -r cvat/requirements/../../utils/dataset_manifest/requirements.txt (line 10))
  Using cached natsort-8.0.0-py3-none-any.whl.metadata (21 kB)
Collecting numpy==1.22.4 (from -r cvat/requirements/../../utils/dataset_manifest/requirements.txt (line 12))
  Using cached numpy-1.22.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.0 kB)
Collecting opencv-python-headless==4.10.0.84 (from -r cvat/requirements/../../utils/dataset_manifest/requirements.txt (line 14))
  Using cached opencv_python_headless-4.10.0.84-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (20 kB)
Collecting pillow==10.4.0 (from -r cvat/requirements/../../utils/dataset_manifest/requirements.txt (line 16))
  Using cached pillow-10.4.0-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (9.2 kB)
Collecting tqdm==4.66.5 (from -r cvat/requirements/../../utils/dataset_manifest/requirements.txt (line 18))
  Using cached tqdm-4.66.5-py3-none-any.whl.metadata (57 kB)
Collecting asgiref==3.8.1 (from -r cvat/requirements/base.txt (line 10))
  Using cached asgiref-3.8.1-py3-none-any.whl.metadata (9.3 kB)
Collecting async-timeout==4.0.3 (from -r cvat/requirements/base.txt (line 12))
  Using cached async_timeout-4.0.3-py3-none-any.whl.metadata (4.2 kB)
Collecting attrs==21.4.0 (from -r cvat/requirements/base.txt (line 14))
  Using cached attrs-21.4.0-py2.py3-none-any.whl.metadata (9.8 kB)
Collecting azure-core==1.30.2 (from -r cvat/requirements/base.txt (line 19))
  Using cached azure_core-1.30.2-py3-none-any.whl.metadata (37 kB)
Collecting azure-storage-blob==12.13.0 (from -r cvat/requirements/base.txt (line 23))
  Using cached azure_storage_blob-12.13.0-py3-none-any.whl.metadata (25 kB)
Collecting boto3==1.17.61 (from -r cvat/requirements/base.txt (line 25))
  Using cached boto3-1.17.61-py2.py3-none-any.whl.metadata (6.2 kB)
Collecting botocore==1.20.112 (from -r cvat/requirements/base.txt (line 27))
  Using cached botocore-1.20.112-py2.py3-none-any.whl.metadata (5.6 kB)
Collecting cachetools==5.5.0 (from -r cvat/requirements/base.txt (line 31))
  Using cached cachetools-5.5.0-py3-none-any.whl.metadata (5.3 kB)
Collecting certifi==2024.7.4 (from -r cvat/requirements/base.txt (line 33))
  Using cached certifi-2024.7.4-py3-none-any.whl.metadata (2.2 kB)
Collecting cffi==1.17.0 (from -r cvat/requirements/base.txt (line 38))
  Using cached cffi-1.17.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting charset-normalizer==3.3.2 (from -r cvat/requirements/base.txt (line 40))
  Using cached charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (33 kB)
Collecting click==8.1.7 (from -r cvat/requirements/base.txt (line 42))
  Using cached click-8.1.7-py3-none-any.whl.metadata (3.0 kB)
Collecting clickhouse-connect==0.6.8 (from -r cvat/requirements/base.txt (line 44))
  Using cached clickhouse_connect-0.6.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.8 kB)
Collecting contourpy==1.2.1 (from -r cvat/requirements/base.txt (line 46))
  Using cached contourpy-1.2.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.8 kB)
Collecting coreapi==2.3.3 (from -r cvat/requirements/base.txt (line 48))
  Using cached coreapi-2.3.3-py2.py3-none-any.whl.metadata (1.0 kB)
Collecting coreschema==0.0.4 (from -r cvat/requirements/base.txt (line 50))
  Using cached coreschema-0.0.4.tar.gz (10 kB)
  Preparing metadata (setup.py) ... done
Collecting crontab==1.0.1 (from -r cvat/requirements/base.txt (line 52))
  Using cached crontab-1.0.1.tar.gz (19 kB)
  Preparing metadata (setup.py) ... done
Collecting cryptography==43.0.0 (from -r cvat/requirements/base.txt (line 54))
  Using cached cryptography-43.0.0-cp39-abi3-manylinux_2_28_x86_64.whl.metadata (5.4 kB)
Collecting cycler==0.12.1 (from -r cvat/requirements/base.txt (line 58))
  Using cached cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting defusedxml==0.7.1 (from -r cvat/requirements/base.txt (line 62))
  Using cached defusedxml-0.7.1-py2.py3-none-any.whl.metadata (32 kB)
Collecting deprecated==1.2.14 (from -r cvat/requirements/base.txt (line 66))
  Using cached Deprecated-1.2.14-py2.py3-none-any.whl.metadata (5.4 kB)
Collecting dj-pagination==2.5.0 (from -r cvat/requirements/base.txt (line 68))
  Using cached dj_pagination-2.5.0-py3-none-any.whl.metadata (2.7 kB)
Collecting dj-rest-auth==5.0.2 (from dj-rest-auth[with-social]==5.0.2->-r cvat/requirements/base.txt (line 70))
  Using cached dj-rest-auth-5.0.2.tar.gz (217 kB)
  Preparing metadata (setup.py) ... done
Collecting django==4.2.15 (from -r cvat/requirements/base.txt (line 72))
  Using cached Django-4.2.15-py3-none-any.whl.metadata (4.1 kB)
Collecting django-allauth==0.57.2 (from django-allauth[saml]==0.57.2->-r cvat/requirements/base.txt (line 87))
  Using cached django-allauth-0.57.2.tar.gz (858 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting django-appconf==1.0.6 (from -r cvat/requirements/base.txt (line 91))
  Using cached django_appconf-1.0.6-py3-none-any.whl.metadata (5.4 kB)
Collecting django-auth-ldap==2.2.0 (from -r cvat/requirements/base.txt (line 93))
  Using cached django_auth_ldap-2.2.0-py3-none-any.whl.metadata (7.2 kB)
Collecting django-compressor==4.3.1 (from -r cvat/requirements/base.txt (line 95))
  Using cached django_compressor-4.3.1-py2.py3-none-any.whl.metadata (5.0 kB)
Collecting django-cors-headers==3.5.0 (from -r cvat/requirements/base.txt (line 97))
  Using cached django_cors_headers-3.5.0-py3-none-any.whl.metadata (14 kB)
Collecting django-crum==0.7.9 (from -r cvat/requirements/base.txt (line 99))
  Using cached django_crum-0.7.9-py2.py3-none-any.whl.metadata (3.6 kB)
Collecting django-filter==2.4.0 (from -r cvat/requirements/base.txt (line 101))
  Using cached django_filter-2.4.0-py3-none-any.whl.metadata (4.1 kB)
Collecting django-health-check==3.18.3 (from -r cvat/requirements/base.txt (line 103))
  Using cached django_health_check-3.18.3-py2.py3-none-any.whl.metadata (10 kB)
Collecting django-rq==2.8.1 (from -r cvat/requirements/base.txt (line 105))
  Using cached django_rq-2.8.1-py2.py3-none-any.whl.metadata (18 kB)
Collecting django-sendfile2==0.7.0 (from -r cvat/requirements/base.txt (line 107))
  Using cached django_sendfile2-0.7.0-py3-none-any.whl.metadata (3.1 kB)
Collecting djangorestframework==3.14.0 (from -r cvat/requirements/base.txt (line 109))
  Using cached djangorestframework-3.14.0-py3-none-any.whl.metadata (10 kB)
Collecting drf-spectacular==0.26.2 (from -r cvat/requirements/base.txt (line 114))
  Using cached drf_spectacular-0.26.2-py3-none-any.whl.metadata (13 kB)
Collecting easyprocess==1.1 (from -r cvat/requirements/base.txt (line 116))
  Using cached EasyProcess-1.1-py3-none-any.whl.metadata (855 bytes)
Collecting entrypoint2==1.1 (from -r cvat/requirements/base.txt (line 118))
  Using cached entrypoint2-1.1-py2.py3-none-any.whl.metadata (1.0 kB)
Collecting fonttools==4.53.1 (from -r cvat/requirements/base.txt (line 120))
  Using cached fonttools-4.53.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (162 kB)
Collecting freezegun==1.5.1 (from -r cvat/requirements/base.txt (line 122))
  Using cached freezegun-1.5.1-py3-none-any.whl.metadata (11 kB)
Collecting furl==2.1.0 (from -r cvat/requirements/base.txt (line 124))
  Using cached furl-2.1.0-py2.py3-none-any.whl.metadata (1.1 kB)
Collecting google-api-core==2.19.1 (from -r cvat/requirements/base.txt (line 126))
  Using cached google_api_core-2.19.1-py3-none-any.whl.metadata (2.7 kB)
Collecting google-auth==2.34.0 (from -r cvat/requirements/base.txt (line 130))
  Using cached google_auth-2.34.0-py2.py3-none-any.whl.metadata (4.7 kB)
Collecting google-cloud-core==2.4.1 (from -r cvat/requirements/base.txt (line 135))
  Using cached google_cloud_core-2.4.1-py2.py3-none-any.whl.metadata (2.7 kB)
Collecting google-cloud-storage==1.42.0 (from -r cvat/requirements/base.txt (line 137))
  Using cached google_cloud_storage-1.42.0-py2.py3-none-any.whl.metadata (5.5 kB)
Collecting google-crc32c==1.5.0 (from -r cvat/requirements/base.txt (line 139))
  Using cached google_crc32c-1.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.3 kB)
Collecting google-resumable-media==2.7.2 (from -r cvat/requirements/base.txt (line 141))
  Using cached google_resumable_media-2.7.2-py2.py3-none-any.whl.metadata (2.2 kB)
Collecting googleapis-common-protos==1.63.2 (from -r cvat/requirements/base.txt (line 143))
  Using cached googleapis_common_protos-1.63.2-py2.py3-none-any.whl.metadata (1.5 kB)
Collecting h5py==3.11.0 (from -r cvat/requirements/base.txt (line 145))
  Using cached h5py-3.11.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.5 kB)
Collecting idna==3.7 (from -r cvat/requirements/base.txt (line 147))
  Using cached idna-3.7-py3-none-any.whl.metadata (9.9 kB)
Collecting importlib-metadata==8.2.0 (from -r cvat/requirements/base.txt (line 149))
  Using cached importlib_metadata-8.2.0-py3-none-any.whl.metadata (4.7 kB)
Collecting importlib-resources==6.4.3 (from -r cvat/requirements/base.txt (line 151))
  Using cached importlib_resources-6.4.3-py3-none-any.whl.metadata (3.9 kB)
Collecting inflection==0.5.1 (from -r cvat/requirements/base.txt (line 153))
  Using cached inflection-0.5.1-py2.py3-none-any.whl.metadata (1.7 kB)
Collecting isodate==0.6.1 (from -r cvat/requirements/base.txt (line 155))
  Using cached isodate-0.6.1-py2.py3-none-any.whl.metadata (9.6 kB)
Collecting itypes==1.2.0 (from -r cvat/requirements/base.txt (line 159))
  Using cached itypes-1.2.0-py2.py3-none-any.whl.metadata (4.9 kB)
Collecting jinja2==3.1.4 (from -r cvat/requirements/base.txt (line 161))
  Using cached jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting jmespath==0.10.0 (from -r cvat/requirements/base.txt (line 163))
  Using cached jmespath-0.10.0-py2.py3-none-any.whl.metadata (8.0 kB)
Collecting jsonschema==4.17.3 (from -r cvat/requirements/base.txt (line 167))
  Using cached jsonschema-4.17.3-py3-none-any.whl.metadata (7.9 kB)
Collecting kiwisolver==1.4.5 (from -r cvat/requirements/base.txt (line 169))
  Using cached kiwisolver-1.4.5-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.metadata (6.4 kB)
Collecting limits==3.13.0 (from -r cvat/requirements/base.txt (line 171))
  Using cached limits-3.13.0-py3-none-any.whl.metadata (7.2 kB)
Collecting lxml==5.3.0 (from -r cvat/requirements/base.txt (line 173))
  Using cached lxml-5.3.0.tar.gz (3.7 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [4 lines of output]
      <string>:67: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
      Building lxml version 5.3.0.
      Building without Cython.
      Error: Please make sure the libxml2 and libxslt development packages are installed.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
azhavoro commented 2 months ago

@ilyasofficial1617

From the log you provided:

Please make sure the libxml2 and libxslt development packages are installed.
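
On Ubuntu 22.04 these headers are typically provided by the libxml2-dev and libxslt1-dev packages (package names assumed for the stock Ubuntu repositories), e.g. sudo apt-get install libxml2-dev libxslt1-dev, after which pip install -r cvat/requirements/development.txt can be re-run.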

azhavoro commented 2 months ago

Is there anything else I can help with?