haiminh2001 opened 6 months ago
What do you have in these json files? Are they big?
Label Studio Community doesn't have background workers; all background processes run on uWSGI web workers, so their time is limited to 90 seconds.
Hi @makseq, thank you for your fast response. The json files are small; they only contain the URI links of the images and the classification labels.
You can try setting `UWSGI_WORKER_HARAKIRI=0` to avoid the timeout.
> harakiri: A feature of uWSGI that aborts workers that are serving requests for an excessively long time. Configured using the harakiri family of options. Every request that will take longer than the seconds specified in the harakiri timeout will be dropped and the corresponding worker recycled.
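For a Helm deployment like the one in this issue, that variable has to reach the app pod through the chart values. A minimal sketch, assuming the official chart exposes a `global.extraEnvironmentVars` map (the key name may differ between chart versions, so verify against your chart's values):

```yaml
# values.yaml (sketch): pass UWSGI_WORKER_HARAKIRI into the Label Studio pod.
# global.extraEnvironmentVars is an assumption; check your chart version.
global:
  extraEnvironmentVars:
    UWSGI_WORKER_HARAKIRI: "0"
```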
Hmm, so this means it will disable the timeout not only for syncing but for every other request, won't it? If so, it is a little bit dangerous. As I mentioned, I have the resources to scale up (CPU and memory); can scaling up be the solution?
Update: after setting `UWSGI_WORKER_HARAKIRI=0` in the environment variables, I still get the Gateway Timeout. Is it supposed to be set in the environment variables?
```
Traceback (most recent call last):
  File "/label-studio/label_studio/./io_storages/base_models.py", line 456, in sync
    import_sync_background(self.__class__, self.id)
  File "/label-studio/label_studio/./io_storages/base_models.py", line 485, in import_sync_background
    storage.scan_and_create_links()
  File "/label-studio/label_studio/./io_storages/s3/models.py", line 148, in scan_and_create_links
    return self._scan_and_create_links(S3ImportStorageLink)
  File "/label-studio/label_studio/./io_storages/base_models.py", line 364, in _scan_and_create_links
    self.info_set_in_progress()
  File "/label-studio/label_studio/./io_storages/base_models.py", line 85, in info_set_in_progress
    raise ValueError(f'Storage status ({self.status}) must be QUEUED to move it IN_PROGRESS')
ValueError: Storage status (initialized) must be QUEUED to move it IN_PROGRESS

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/rest_framework/views.py", line 506, in dispatch
    response = handler(request, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/django/utils/decorators.py", line 43, in _wrapper
    return bound_method(*args, **kwargs)
  File "/label-studio/label_studio/./io_storages/api.py", line 110, in post
    storage.sync()
  File "/label-studio/label_studio/./io_storages/base_models.py", line 458, in sync
    storage_background_failure(self)
  File "/label-studio/label_studio/./io_storages/base_models.py", line 515, in storage_background_failure
    storage.info_set_failed()
  File "/label-studio/label_studio/./io_storages/base_models.py", line 117, in info_set_failed
    self.meta['duration'] = (time_failure - self.time_in_progress).total_seconds()
  File "/label-studio/label_studio/./io_storages/base_models.py", line 96, in time_in_progress
    return datetime.fromisoformat(self.meta['time_in_progress'])
KeyError: 'time_in_progress'
```
I just ran into the "time in progress" error while syncing again; the stack trace above is from that run, hope it helps. (It shows the sync failing because the storage status is still `initialized` rather than `QUEUED`, and the failure handler then crashing on the missing `meta['time_in_progress']` key.)
> Hmm, so this means it will disable the timeout not only for syncing but for every other request, won't it? If so, it is a little bit dangerous.

Correct, it may eat all your resources.
> Update: after setting `UWSGI_WORKER_HARAKIRI=0` in the environment variables, I still get the Gateway Timeout.

Probably you have some balancer like nginx in front, and it throws the timeouts.
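If the balancer in front is a Kubernetes ingress-nginx controller, its per-route timeouts can be raised with the standard ingress-nginx annotations. A sketch with hypothetical host and service names:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: label-studio                   # hypothetical name
  annotations:
    # ingress-nginx timeout annotations; values are in seconds
    nginx.ingress.kubernetes.io/proxy-read-timeout: "180"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "180"
spec:
  ingressClassName: nginx
  rules:
    - host: label-studio.example.com   # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: label-studio-ls-app   # hypothetical service name
                port:
                  number: 80
```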
@haiminh2001 were you able to get the syncing to external storage working without the gateway timeouts?
No, I have not. Storage syncing is so terribly slow that my workaround is to keep each folder in MinIO storage to no more than 1000 tasks.
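For anyone following the same workaround, a quick way to check how many task files sit under a prefix before syncing it is to count objects with the MinIO client (the alias, bucket, and prefix below are hypothetical):

```bash
# Count objects under a single prefix to keep each storage under ~1000 tasks
mc ls --recursive myminio/label-studio-bucket/batch-001/ | wc -l
```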
I ran into this problem about a year ago when my project grew to over a few thousand tasks. Here is how I increased the timeout.
In `deploy/uwsgi.ini`, replace this line:

```ini
http-timeout = 300
```

with this:

```ini
if-env = UWSGI_HTTP_TIMEOUT
http-timeout = $(UWSGI_HTTP_TIMEOUT)
endif =

if-not-env = UWSGI_HTTP_TIMEOUT
http-timeout = 300
endif =
```
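For context, `if-env` and `if-not-env` are uWSGI's configuration-logic options: the block makes the timeout overridable through the environment without editing the file again. For example, with a hypothetical value:

```bash
# uWSGI evaluates if-env against the process environment at startup
UWSGI_HTTP_TIMEOUT=600 uwsgi --ini deploy/uwsgi.ini
```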
In `deploy/default.conf`, set `proxy_read_timeout` to whatever you want the timeout to be (for this example I'll use 180):

```nginx
proxy_read_timeout 180;
```
Add the following lines to the `docker-compose.yml`:

```yaml
services:
  nginx:
    volumes:
      - ./deploy/default.conf:/etc/nginx/nginx.conf
  app:
    environment:
      - UWSGI_HTTP_TIMEOUT=180
      - UWSGI_WORKER_HARAKIRI=181
    volumes:
      - ./deploy/uwsgi.ini:/label-studio/deploy/uwsgi.ini
```
Set `UWSGI_HTTP_TIMEOUT` equal to `proxy_read_timeout` in `deploy/default.conf`, and set `UWSGI_WORKER_HARAKIRI` to that number plus 1. The changes should take effect the next time you run `docker compose up`.
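One way to verify that the longer timeouts actually took effect is to time a sync call against the storage API. A sketch with a hypothetical host, storage id, and token (the endpoint path follows the Label Studio storage API):

```bash
# Trigger an S3 storage sync and measure how long the request is allowed to run
time curl -X POST \
  -H "Authorization: Token $LABEL_STUDIO_TOKEN" \
  http://localhost:8080/api/storages/s3/1/sync
```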
@WillieMaddox thank you so much. I'll give it a try.
Thank you very much @WillieMaddox, it worked well for me.
Describe the bug
I deployed Label Studio on our internal K8s cluster using the official Helm chart (version 14.0.10). When I sync a data storage (a MinIO cluster on the same K8s cluster) containing about 1000 json files representing 1000 tasks (each task's data is a URI pointing into the MinIO cluster), after a while a Gateway Timeout error and, occasionally, the "time in progress" error pop up. The sync request is not moved to "Failed" status but gets stuck in "Queued" status, and the project is completely frozen: I cannot do anything while it is syncing.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The sync is executed in the background; there is no need to return the sync task's result immediately.
Environment (please complete the following information):
Additional context
This question may be out of scope, but I am having issues with Label Studio's performance: the UI is quite laggy, and syncing data consistently gives me errors as in this issue. How can I scale up my Label Studio on K8s, and what should I scale up? Hardware resources are not really my concern.