Koed00 / django-q

A multiprocessing distributed task queue for Django
https://django-q.readthedocs.org
MIT License

I run the Django project in Docker and use Django Q for queue processing. The server has a 16 GB GPU, but the error below occurs after only 1.9 GB is allocated. How can I solve this error? #704

Open randompaga opened 1 year ago

randompaga commented 1 year ago

Process Process-1:1:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/site-packages/django_q/cluster.py", line 415, in worker
    for task in iter(task_queue.get, "STOP"):
  File "/usr/local/lib/python3.8/site-packages/django_q/queues.py", line 71, in get
    x = super(Queue, self).get(*args, **kwargs)
  File "/usr/local/lib/python3.8/multiprocessing/queues.py", line 116, in get
    return _ForkingPickler.loads(res)
  File "/app/textimg/webui.py", line 677, in <module>
    model, modelCS, modelFS, device, config = load_SD_model()
  File "/app/textimg/webui.py", line 647, in load_SD_model
    model.cuda()
  File "/usr/local/lib/python3.8/site-packages/pytorch_lightning/core/mixins/device_dtype_mixin.py", line 132, in cuda
    return super().cuda(device=device)
  File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 688, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 578, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 578, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 578, in _apply
    module._apply(fn)
  [Previous line repeated 4 more times]
  File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 601, in _apply
    param_applied = fn(param)
  File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 688, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA out of memory. Tried to allocate 114.00 MiB (GPU 0; 15.75 GiB total capacity; 1.91 GiB already allocated; 12.62 MiB free; 1.97 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.
See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
02:22:48 [Q] ERROR reincarnated worker Process-1:1 after death
02:22:49 [Q] INFO Process-1:4 ready for work at 158
Not Found: /
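The traceback shows the worker crashing while unpickling the task: `_ForkingPickler.loads` imports `webui.py`, and `load_SD_model()` runs (and calls `model.cuda()`) at import time, so every worker, including each reincarnated one, re-loads the model onto the GPU. One common workaround (a sketch with hypothetical names; `get_model` stands in for a wrapper around `load_SD_model()`) is to pass only lightweight, picklable arguments to `async_task` and cache the heavyweight model once per worker process instead of loading it at module import:

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_model(model_name):
    # Load the heavyweight model on first use, once per worker process;
    # subsequent tasks in the same worker reuse the cached instance.
    # In the real project this is where load_SD_model() / model.cuda() would go.
    print(f"loading {model_name}")
    return {"name": model_name}  # placeholder for the real model object

def generate_image(model_name, prompt):
    # Task function: receives only small, picklable arguments.
    model = get_model(model_name)
    return f"{model['name']}:{prompt}"

# Enqueue with strings, not model objects (sketch):
# async_task("myapp.tasks.generate_image", "stable-diffusion", "a cat")
```

This keeps the multiprocessing queue free of GPU tensors and avoids the load happening inside `queue.get()`.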

GDay commented 1 year ago

RuntimeError: CUDA out of memory. Tried to allocate 114.00 MiB (GPU 0; 15.75 GiB total capacity; 1.91 GiB already allocated; 12.62 MiB free; 1.97 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

It looks like your app is running out of GPU memory (only ~12 MiB free). I don't think this is an issue with Django Q.

Maybe the Docker container doesn't have access to enough memory.
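The error text itself suggests trying the `max_split_size_mb` allocator option. A sketch of passing it into the container (the environment variable is PyTorch's documented knob; the 128 MiB value and image name are illustrative, not recommendations):

```shell
# Pass PyTorch's allocator hint into the container running the qcluster;
# --gpus all exposes the host GPU to the container.
docker run --gpus all \
  -e PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 \
  myimage python manage.py qcluster
```

This only mitigates fragmentation; if the model genuinely needs more memory than is free, the load will still fail.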

randompaga commented 1 year ago

> RuntimeError: CUDA out of memory. Tried to allocate 114.00 MiB (GPU 0; 15.75 GiB total capacity; 1.91 GiB already allocated; 12.62 MiB free; 1.97 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
>
> It looks like your app is running out of memory (only 12 MiB free). I don't think this is an issue with Django Q.
>
> Maybe Docker doesn't have enough memory.

System check identified no issues (0 silenced).
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, django_q, sessions
Running migrations:
  No migrations to apply.
November 13, 2022 - 12:02:47
Django version 3.2, using settings 'azz.settings'
Starting development server at http://0.0.0.0:5608/
Quit the server with CONTROL-C.
12:02:50 [Q] INFO Q Cluster cat-kansas-triple-johnny starting.
12:02:50 [Q] INFO Process-1:1 ready for work at 206
12:02:50 [Q] INFO Process-1:2 monitoring at 207
12:02:50 [Q] INFO 206 will use cpu [0]
12:02:50 [Q] INFO Process-1 guarding cluster cat-kansas-triple-johnny
12:02:50 [Q] INFO Process-1:3 pushing tasks at 208
12:02:50 [Q] INFO Q Cluster cat-kansas-triple-johnny running.
root@VM-0-3-ubuntu:/home/ubuntu#

Does Django Q support running on the GPU? From the line `12:02:50 [Q] INFO 206 will use cpu [0]` it looks like it is running on the CPU. How can I change the configuration to run it on the GPU?

GDay commented 1 year ago

Nope, that's not supported at the moment.
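For context: the `will use cpu [0]` line appears to come from django-q's `cpu_affinity` handling, i.e. which CPU cores the worker processes are pinned to; it says nothing about where task code computes, and a task can still use the GPU through PyTorch, as the traceback above demonstrates. On a single GPU, what tends to help is limiting the cluster to one worker so only one copy of the model sits in GPU memory at a time. A minimal sketch of such a `Q_CLUSTER` configuration (the setting names are django-q's; the values are illustrative, not recommendations):

```python
# settings.py -- sketch: single-worker cluster for GPU-bound tasks
Q_CLUSTER = {
    "name": "azz",
    "workers": 1,      # one worker: one CUDA context, one model copy on the GPU
    "recycle": 50,     # restart the worker after 50 tasks to release leaked memory
    "timeout": 600,    # allow long-running GPU tasks (seconds)
    "retry": 700,      # must be larger than timeout
    "orm": "default",  # use the Django ORM as the broker
}
```

With `workers: 1`, a reincarnated worker still re-loads the model, but two workers never compete for the same 16 GB at once.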