Closed brainwater closed 5 months ago
@brainwater Just thought I'd say I saw your post. I'd been having the same problem. Seeing your solution made me want to try again. This time starting with a fresh pull it worked immediately, something I'd never seen before. Running under WSL2. Just a bit of fiddling with WSL2 to get the port working beyond the local machine. No other changes necessary. So I can't say the issue is closed as I don't see any updates in the repository, but it's definitely working for me now.
I did actually just rebuild the image and pushed it. It may have picked up some new stuff from the base image. I meant to post here but I forgot.
Actually it seems I was mistaken when I said it was working. I neglected to add --gpus all to the docker run command initially. So it was only operating in CPU mode. When I added it the YOLOv5 startup lists the GPU instead of the CPU but exits without an error.
YOLOv5 🚀 2024-1-1 Python-3.8.10 torch-2.1.2+cu121 CUDA:0 (NVIDIA GeForce GTX 970, 4096MiB)
So mine is probably a different issue at this point but I'd love to see it working with the current CUDA. Not sure how to proceed with troubleshooting.
Apologies if this is a newbie question, but what is the minimum required compute capability for running this? I didn't see anything listed. My test case is currently 5.2. If it needs significantly higher I may need to rethink my ideas. I'm using it in the context of Home Assistant and I'm not sure what would satisfy the requirements.
@keyboarderror honestly, I don't know what it requires... I don't do much with the Nvidia GPU side of things. I wouldn't think it requires a very high one, as the model it uses is old but good. I really can't tell you for sure.
This is the version the container has currently:
root@7ebde0b3c926:/opt/doods# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
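One way to see what compute capability the GPU reports from inside the container is to ask PyTorch directly. This is only a minimal sketch: the helper name gpu_compute_capability is made up for illustration, and it assumes PyTorch is importable in the container (the YOLOv5 startup line suggests it is). It does not tell you the minimum that doods2 itself needs.

```python
def gpu_compute_capability(device_index=0):
    """Return (major, minor) for a CUDA device, or None if none is usable.

    Hypothetical helper for illustration; not part of doods2.
    """
    try:
        import torch  # imported lazily so the script still runs without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        # Typical when the container was started without --gpus all
        return None
    return torch.cuda.get_device_capability(device_index)


if __name__ == "__main__":
    capability = gpu_compute_capability()
    if capability is None:
        print("No CUDA device visible to PyTorch")
    else:
        print("Compute capability: %d.%d" % capability)
```

If this prints that no CUDA device is visible, the container is most likely running in CPU mode, which matches the symptom of forgetting --gpus all.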
Okay, I tried building with updated tensorflow quite a few times but something was wrong with Docker hub. I just finally tried again and it took. Maybe try now. I believe this will have updated Cuda.
OK. It fails with or without enabling the GPU, but at least there's an error. It's the same trying to run on CPU. Doesn't appear to be a Cuda problem.
sudo docker run --gpus all -it -p 8080:8080 snowzach/doods2:amd64-gpu
Traceback (most recent call last):
File "/opt/doods/main.py", line 5, in <module>
from api import API
File "/opt/doods/api.py", line 8, in <module>
from fastapi import status, FastAPI, WebSocket, WebSocketDisconnect
File "/usr/local/lib/python3.11/dist-packages/fastapi/__init__.py", line 7, in <module>
from .applications import FastAPI as FastAPI
File "/usr/local/lib/python3.11/dist-packages/fastapi/applications.py", line 3, in <module>
from fastapi import routing
File "/usr/local/lib/python3.11/dist-packages/fastapi/routing.py", line 22, in <module>
from fastapi.dependencies.models import Dependant
File "/usr/local/lib/python3.11/dist-packages/fastapi/dependencies/models.py", line 3, in <module>
from fastapi.security.base import SecurityBase
File "/usr/local/lib/python3.11/dist-packages/fastapi/security/__init__.py", line 1, in <module>
from .api_key import APIKeyCookie as APIKeyCookie
File "/usr/local/lib/python3.11/dist-packages/fastapi/security/api_key.py", line 3, in <module>
from fastapi.openapi.models import APIKey, APIKeyIn
File "/usr/local/lib/python3.11/dist-packages/fastapi/openapi/models.py", line 103, in <module>
class Schema(BaseModel):
File "/usr/local/lib/python3.11/dist-packages/pydantic/main.py", line 369, in __new__
cls.__signature__ = ClassAttribute('__signature__', generate_model_signature(cls.__init__, fields, config))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/pydantic/utils.py", line 231, in generate_model_signature
merged_params[param_name] = Parameter(
^^^^^^^^^^
File "/usr/lib/python3.11/inspect.py", line 2715, in __init__
raise ValueError('{!r} is not a valid parameter name'.format(name))
ValueError: 'not' is not a valid parameter name
I think the fix is to update pydantic within requirements.txt
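To check which pydantic version a requirements.txt pins, something like the following sketch could be used. The helper names and the MINIMUM value are assumptions for illustration only; the version that actually contains the fix should be confirmed against the pydantic changelog.

```python
import re


def pinned_version(requirements_text, package):
    """Return the version pinned with '==' for package, or None if not pinned."""
    pattern = re.compile(
        r"^\s*" + re.escape(package) + r"\s*==\s*([^\s;#]+)",
        re.IGNORECASE | re.MULTILINE,
    )
    match = pattern.search(requirements_text)
    return match.group(1) if match else None


def version_tuple(version):
    """Convert '1.8.2' to (1, 8, 2); non-numeric parts are ignored."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


# Placeholder threshold, NOT a confirmed fix version: adjust after checking
# the pydantic release notes.
MINIMUM = (1, 10)

if __name__ == "__main__":
    with open("requirements.txt") as handle:
        pin = pinned_version(handle.read(), "pydantic")
    if pin is None:
        print("pydantic is not pinned with ==")
    elif version_tuple(pin) < MINIMUM:
        print("pydantic %s is older than the assumed minimum" % pin)
    else:
        print("pydantic %s looks recent enough" % pin)
```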
I'm getting the same issue, ValueError: 'not' is not a valid parameter name.
Here is a comment about it https://github.com/tiangolo/fastapi/issues/5048#issuecomment-1170204100
The issue looks like it's occurring on line 231 of pydantic/utils.py https://github.com/pydantic/pydantic/blob/v1.8.2/pydantic/utils.py#L231
Pydantic is pinned at an old (1.8.2) version within requirements.txt.
The problem was identified in pydantic in April 2022, and a fix was merged in August 2022. Pydantic v1.8.2 is three years old and doesn't have that fix, so updating pydantic to a recent version should fix the issue.
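For anyone wondering why the name 'not' is rejected at all: it is a reserved Python keyword, so it can't be used as a parameter name when pydantic builds the model's signature. The sketch below is a simplified approximation of that kind of check, not a copy of the code in inspect.py, and the function name and sample names are made up for illustration.

```python
import keyword


def is_valid_parameter_name(name):
    """Approximate the name check behind the ValueError in the traceback."""
    # A parameter name must be an identifier and must not be a reserved word.
    return name.isidentifier() and not keyword.iskeyword(name)


# "not" is the name from the error message; the other two are illustrative.
for candidate in ("not", "not_", "title"):
    print(candidate, is_valid_parameter_name(candidate))
```

Only the first name fails the check, which is why a library has to skip or rename such fields when generating a signature.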
$ sudo docker run -it -p 8080:8080 --gpus all snowzach/doods2:amd64-gpu
Traceback (most recent call last):
File "/opt/doods/main.py", line 5, in <module>
from api import API
File "/opt/doods/api.py", line 8, in <module>
from fastapi import status, FastAPI, WebSocket, WebSocketDisconnect
File "/usr/local/lib/python3.11/dist-packages/fastapi/__init__.py", line 7, in <module>
from .applications import FastAPI as FastAPI
File "/usr/local/lib/python3.11/dist-packages/fastapi/applications.py", line 3, in <module>
from fastapi import routing
File "/usr/local/lib/python3.11/dist-packages/fastapi/routing.py", line 22, in <module>
from fastapi.dependencies.models import Dependant
File "/usr/local/lib/python3.11/dist-packages/fastapi/dependencies/models.py", line 3, in <module>
from fastapi.security.base import SecurityBase
File "/usr/local/lib/python3.11/dist-packages/fastapi/security/__init__.py", line 1, in <module>
from .api_key import APIKeyCookie as APIKeyCookie
File "/usr/local/lib/python3.11/dist-packages/fastapi/security/api_key.py", line 3, in <module>
from fastapi.openapi.models import APIKey, APIKeyIn
File "/usr/local/lib/python3.11/dist-packages/fastapi/openapi/models.py", line 103, in <module>
class Schema(BaseModel):
File "/usr/local/lib/python3.11/dist-packages/pydantic/main.py", line 369, in __new__
cls.__signature__ = ClassAttribute('__signature__', generate_model_signature(cls.__init__, fields, config))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/pydantic/utils.py", line 231, in generate_model_signature
merged_params[param_name] = Parameter(
^^^^^^^^^^
File "/usr/lib/python3.11/inspect.py", line 2715, in __init__
raise ValueError('{!r} is not a valid parameter name'.format(name))
ValueError: 'not' is not a valid parameter name
$
Okay, I just updated everything to tensorflow 2.14 which should have updated the CUDA version. Try it now.
It's back to exiting without any errors. CPU mode works.
But GPU does not?
No. It just returns to the command prompt a couple moments after the message Fusing layers... No error message. In CPU mode it starts showing server messages and servicing requests.
I'm getting the same using the gpu
blake@srv-docker:~$ sudo docker pull snowzach/doods2:amd64-gpu
<snipped>
Digest: sha256:d439e0c4d43d50d023fae5e8f3056ad20c68e086ccdfd1d61d6201ee8df843fa
Status: Downloaded newer image for snowzach/doods2:amd64-gpu
docker.io/snowzach/doods2:amd64-gpu
blake@srv-docker:~$ sudo docker run -it -p 8080:8080 --gpus all snowzach/doods2:amd64-gpu
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
2024-01-30 16:28:01,935 - doods.doods - INFO - Registered detector type:tflite name:default
2024-01-30 16:28:03,518 - doods.doods - INFO - Registered detector type:tensorflow name:tensorflow
/usr/local/lib/python3.11/dist-packages/torch/hub.py:294: UserWarning: You are about to download and run code from an untrusted repository. In a future release, this won't be allowed. To add the repository to your trusted list, change the command to {calling_fn}(..., trust_repo=False) and a command prompt will appear asking for an explicit confirmation of trust, or load(..., trust_repo=True), which will assume that the prompt is to be answered with 'yes'. You can also use load(..., trust_repo='check') which will only prompt for confirmation if the repo is not already trusted. This will eventually be the default behaviour
warnings.warn(
Downloading: "https://github.com/ultralytics/yolov5/zipball/master" to /root/.cache/torch/hub/master.zip
YOLOv5 🚀 2024-1-30 Python-3.11.0rc1 torch-2.1.2+cu121 CUDA:0 (NVIDIA GeForce GTX 1080, 8112MiB)
Downloading https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt to yolov5s.pt...
100%|█████████████████████████████████████████████████████████████████| 14.1M/14.1M [00:00<00:00, 62.9MB/s]
Fusing layers...
blake@srv-docker:~$
Tonight I'll see if I can debug it to get more details on exactly where it had an error.
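When a process returns to the prompt with no error message, the exit status is often the first clue. Here is a rough sketch of decoding it, assuming (not confirmed anywhere in this thread) that the process might have been killed by a signal. The helper name describe_exit and the script name in the usage comment are hypothetical.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a process return code into a human-readable description."""
    if returncode == 0:
        return "exited cleanly (status 0)"
    # subprocess reports death-by-signal as a negative number; shells and
    # container runtimes conventionally report it as 128 + signal number.
    if returncode < 0:
        signum = -returncode
    elif returncode > 128:
        signum = returncode - 128
    else:
        return "exited with status %d" % returncode
    try:
        return "killed by %s" % signal.Signals(signum).name
    except ValueError:
        return "exited with status %d" % returncode


if __name__ == "__main__":
    # Hypothetical usage: python3 check_exit.py python3 main.py api
    command = sys.argv[1:] or [sys.executable, "-c", "raise SystemExit(3)"]
    result = subprocess.run(command)
    print(describe_exit(result.returncode))
```

If the status does point at a crash, starting the app with python3 -X faulthandler main.py api may print a Python traceback at the point of failure.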
The problem is due to out-of-date apt packages. I can work around the issue by running apt update && apt upgrade -y within the container before running doods2.
There are 58 out-of-date apt packages and 67 out-of-date pip packages.
blake@srv-docker:~$ sudo docker run --entrypoint=bash -it -p 8081:8080 --gpus all snowzach/doods2:amd64-gpu
<snipped>
root@e915b583aeee:/opt/doods# apt update
<snipped>
root@e915b583aeee:/opt/doods# apt upgrade
<snipped>
root@e915b583aeee:/opt/doods# python3 main.py api
<doods2 is now running>
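The pip side of the out-of-date count can be reproduced with pip's own JSON report. This is a small sketch, with helper names chosen for illustration; it needs network access inside the container, and it only lists packages, it doesn't upgrade anything.

```python
import json
import subprocess
import sys


def parse_outdated(pip_json):
    """Map package name -> (installed, latest) from pip's JSON report."""
    return {
        entry["name"]: (entry["version"], entry["latest_version"])
        for entry in json.loads(pip_json)
    }


def list_outdated():
    """Ask pip which installed packages have newer releases available."""
    output = subprocess.run(
        [sys.executable, "-m", "pip", "list", "--outdated", "--format=json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_outdated(output)


if __name__ == "__main__":
    outdated = list_outdated()
    print("%d out-of-date pip packages" % len(outdated))
    for name, (installed, latest) in sorted(outdated.items()):
        print("%s: %s -> %s" % (name, installed, latest))
```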
Confirmed that fixes it here too. Excellent. And thanks @brainwater for the --entrypoint=bash switch. I'm still pretty new to docker and couldn't figure out how to get a persistent shell if the container didn't want to run. Now I can poke around.
Awesome! Thanks for tracking that down. I updated the Docker builds and pushed everything out. I even dug out my GTX970 and verified it runs now. Closing this issue. LMK if there are still problems.
It's working for me now. Thanks for your work @snowzach !
Yes, I pulled the update and it works immediately. Thank you very much @snowzach!
The image snowzach/doods2:amd64-gpu is out of date, and I believe it isn't compatible with CUDA 12.2.
When running
docker run -it -p 8080:8080 --gpus all snowzach/doods2:amd64-gpu
I got the following error:
The image snowzach/doods2:amd64 worked fine on the same machine. This is a new installation of Ubuntu Server 22.04, with docker-engine installed via the instructions on the Docker website (i.e. not the Ubuntu Docker snap, since the snap is not compatible with GPU acceleration of containers).
I ran the following from within a container using the base image snowzach/doods2:amd64:
At this point, I tested it and it ran much better and faster, presumably indicating that it successfully used the GPU.
I assume the image would work if the image build process were run again, but I was unable to find any instructions on the build process. I'd also appreciate instructions on building doods2 images locally.
nvidia-smi output: