Closed VRichardJP closed 5 months ago
@VRichardJP Thank you for the report. Could you please help us fix the issue?
I can help to fix. I have tested the following 2 solutions:
pip install -r SiamMask/requirements.txt
by individual packages from https://github.com/foolwood/SiamMask/blob/master/requirements.txt but with torch==1.9.0
Both solutions fix the build:
$ ./serverless/deploy_gpu.sh serverless/pytorch/foolwood/siammask/
22.03.17 09:02:54.519 nuctl (I) Project created {"Name": "cvat", "Namespace": "nuclio"}
Deploying serverless/pytorch/foolwood/siammask function...
22.03.17 09:02:54.894 nuctl (I) Deploying function {"name": ""}
22.03.17 09:02:54.894 nuctl (I) Building {"builderKind": "docker", "versionInfo": "Label: 1.7.11, Git commit: afc97384b92e3dd2c75c9ec18b069cff986427e0, OS: linux, Arch: amd64, Go version: go1.17.5", "name": ""}
22.03.17 09:02:55.076 nuctl (I) Cleaning up before deployment {"functionName": "pth-foolwood-siammask"}
22.03.17 09:02:55.100 nuctl (I) Staging files and preparing base images
22.03.17 09:02:55.178 nuctl (W) Python 3.6 runtime is deprecated and will soon not be supported. Please migrate your code and use Python 3.7 runtime (`python:3.7`) or higher
22.03.17 09:02:55.178 nuctl (I) Building processor image {"registryURL": "", "imageName": "cvat/pth.foolwood.siammask:latest"}
22.03.17 09:02:55.178 nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/handler-builder-python-onbuild:1.7.11-amd64"}
22.03.17 09:03:04.446 nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/uhttpc:0.0.1-amd64"}
22.03.17 09:03:15.338 nuctl.platform (I) Building docker image {"image": "cvat/pth.foolwood.siammask:latest"}
22.03.17 09:03:15.953 nuctl.platform (I) Pushing docker image into registry {"image": "cvat/pth.foolwood.siammask:latest", "registry": ""}
22.03.17 09:03:15.953 nuctl.platform (I) Docker image was successfully built and pushed into docker registry {"image": "cvat/pth.foolwood.siammask:latest"}
22.03.17 09:03:15.953 nuctl (I) Build complete {"result": {"Image":"cvat/pth.foolwood.siammask:latest","UpdatedFunctionConfig":{"metadata":{"name":"pth-foolwood-siammask","namespace":"nuclio","labels":{"nuclio.io/project-name":"cvat"},"annotations":{"framework":"pytorch","name":"SiamMask","spec":"","type":"tracker"}},"spec":{"description":"Fast Online Object Tracking and Segmentation","handler":"main:handler","runtime":"python:3.6","env":[{"name":"PYTHONPATH","value":"/opt/nuclio/SiamMask:/opt/nuclio/SiamMask/experiments/siammask_sharp"}],"resources":{"limits":{"nvidia.com/gpu":"1"}},"image":"cvat/pth.foolwood.siammask:latest","targetCPU":75,"triggers":{"myHttpTrigger":{"class":"","kind":"http","name":"myHttpTrigger","maxWorkers":2,"workerAvailabilityTimeoutMilliseconds":10000,"attributes":{"maxRequestBodySize":33554432}}},"volumes":[{"volume":{"name":"volume-1","hostPath":{"path":"/home/vrichard/ML/cvat/serverless/common"}},"volumeMount":{"name":"volume-1","mountPath":"/opt/nuclio/common"}}],"build":{"functionConfigPath":"serverless/pytorch/foolwood/siammask//nuclio/function-gpu.yaml","image":"cvat/pth.foolwood.siammask","baseImage":"nvidia/cuda:11.1-devel-ubuntu20.04","directives":{"preCopy":[{"kind":"ENV","value":"PATH=\"/root/miniconda3/bin:${PATH}\""},{"kind":"ARG","value":"PATH=\"/root/miniconda3/bin:${PATH}\""},{"kind":"RUN","value":"apt update && apt install -y --no-install-recommends wget git ca-certificates libglib2.0-0 libsm6 libxrender1 libxext6 && rm -rf /var/lib/apt/lists/*"},{"kind":"RUN","value":"wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && chmod +x Miniconda3-latest-Linux-x86_64.sh && ./Miniconda3-latest-Linux-x86_64.sh -b && rm -f Miniconda3-latest-Linux-x86_64.sh"},{"kind":"WORKDIR","value":"/opt/nuclio"},{"kind":"RUN","value":"conda create -y -n siammask python=3.7"},{"kind":"SHELL","value":"[\"conda\", \"run\", \"-n\", \"siammask\", \"/bin/bash\", \"-c\"]"},{"kind":"RUN","value":"git clone 
https://github.com/VRichardJP/SiamMask.git"},{"kind":"RUN","value":"pip install -r SiamMask/requirements.txt jsonpickle"},{"kind":"RUN","value":"pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html"},{"kind":"RUN","value":"conda install -y gcc_linux-64"},{"kind":"RUN","value":"cd SiamMask && bash make.sh && cd -"},{"kind":"RUN","value":"wget -P SiamMask/experiments/siammask_sharp http://www.robots.ox.ac.uk/~qwang/SiamMask_DAVIS.pth"},{"kind":"ENTRYPOINT","value":"[\"conda\", \"run\", \"-n\", \"siammask\"]"}]},"codeEntryType":"image"},"platform":{"attributes":{"mountMode":"volume","restartPolicy":{"maximumRetryCount":3,"name":"always"}}},"readinessTimeoutSeconds":60,"securityContext":{},"eventTimeout":"30s"}}}}
22.03.17 09:03:24.104 nuctl.platform (I) Waiting for function to be ready {"timeout": 60}
22.03.17 09:03:28.750 nuctl (I) Function deploy complete {"functionName": "pth-foolwood-siammask", "httpPort": 49160, "internalInvocationURLs": ["172.17.0.4:8080"], "externalInvocationURLs": []}
NAMESPACE | NAME | PROJECT | STATE | REPLICAS | NODE PORT
nuclio | openvino-dextr | cvat | ready | 1/1 | 49157
nuclio | pth-foolwood-siammask | cvat | ready | 1/1 | 49160
nuclio | pth-saic-vul-hrnet | cvat | ready | 1/1 | 49159
However, this does not make SiamMask work in CVAT. I have tried several times to track an object with SiamMask. Whenever I jump to the next frame, I get a notification that the tracker is being initialized, and then this error appears:
Tracking Error
TypeError: Cannot read properties of undefined (reading 'map')
The problem occurs with both the CPU and GPU versions.
From a quick investigation, the problem comes from here: https://github.com/openvinotoolkit/cvat/blob/93ccf2177f560d037fdfad732b98292abfec5944/cvat-ui/src/components/annotation-page/standard-workspace/controls-side-bar/tools-control.tsx#L737-L743
The `response` is undefined after the call, so `response.shapes.map()` raises an error, which is caught here:
https://github.com/openvinotoolkit/cvat/blob/93ccf2177f560d037fdfad732b98292abfec5944/cvat-ui/src/components/annotation-page/standard-workspace/controls-side-bar/tools-control.tsx#L760-L764
Unfortunately, I am not familiar at all with either the framework or TypeScript, so it is difficult for me to figure out exactly what is going on. If you can help, it would be much appreciated.
@VRichardJP, I vote for the first variant. Let's freeze the dependencies.
I'm seeing the same thing, where the response from the request to invoke siammask is `[]`. @VRichardJP, were you able to successfully invoke the function, either via CVAT or manually via `nuctl invoke`? What nuclio version have you tried?
I first tried version 1.5.16, which is the version recommended in the documentation. Since I couldn't make it work, I tried updating to a newer version. Currently I am using 1.7.11.
I have tried to test the function with `nuctl invoke`, but I am not familiar with nuclio and don't know how to send the input data to the function, so I get this error:
$ nuctl invoke pth-foolwood-siammask
22.03.19 13:26:37.383 nuctl.platform.invoker (I) Executing function {"method": "GET", "url": "http://:49161", "bodyLength": 0, "headers": {"Content-Type":["text/plain"],"X-Nuclio-Log-Level":["info"],"X-Nuclio-Target":["pth-foolwood-siammask"]}}
22.03.19 13:26:38.767 nuctl.platform.invoker (I) Got response {"status": "500 Internal Server Error"}
22.03.19 13:26:38.767 nuctl (I) >>> Start of function logs
22.03.19 13:26:38.767 pth-foolwood-siammask (I) Run SiamMask model {"time": 1647663998491.4946, "worker_id": "0"}
22.03.19 13:26:38.768 pth-foolwood-siammask (E) Exception caught in handler {"exc": "'NoneType' object is not subscriptable", "traceback": "Traceback (most recent call last):\n File \"/opt/nuclio/_nuclio_wrapper.py\", line 118, in serve_requests\n await self._handle_event(event)\n File \"/opt/nuclio/_nuclio_wrapper.py\", line 312, in _handle_event\n entrypoint_output = self._entrypoint(self._context, event)\n File \"/opt/nuclio/main.py\", line 19, in handler\n buf = io.BytesIO(base64.b64decode(data[\"image\"]))\nTypeError: 'NoneType' object is not subscriptable\n", "worker_id": "0", "time": 1647663998749.482}
22.03.19 13:26:38.768 nuctl (I) <<< End of function logs
> Response headers:
Content-Type = text/plain
Content-Length = 497
Server = nuclio
Date = Sat, 19 Mar 2022 04:26:38 GMT
> Response body:
Exception caught in handler - "'NoneType' object is not subscriptable": Traceback (most recent call last):
File "/opt/nuclio/_nuclio_wrapper.py", line 118, in serve_requests
await self._handle_event(event)
File "/opt/nuclio/_nuclio_wrapper.py", line 312, in _handle_event
entrypoint_output = self._entrypoint(self._context, event)
File "/opt/nuclio/main.py", line 19, in handler
buf = io.BytesIO(base64.b64decode(data["image"]))
TypeError: 'NoneType' object is not subscriptable
@VRichardJP you can invoke it from the nuclio dashboard using the "test" feature. The easiest way to see what the payload needs to be is to inspect the network request coming from CVAT in the browser dev tools. You'll see that it's a JSON object with `shapes` and `states` keys (and some metadata, I think). What's confusing to me is that `main.handler` seems to be getting the image information from the context instead of the event, but I haven't figured out how that's being sent over. Perhaps it isn't being sent over, and that's why the function is returning `[]`? At any rate, I'm not sure it makes sense to PR the changes if we can't invoke the function...
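For reference, here is a minimal Python sketch of assembling such a test payload. The `shapes` and `states` keys are the ones mentioned above, and `image` is the key that `main.py` reads in the traceback earlier in this thread (the `base64.b64decode` call there suggests a base64-encoded image). The helper name `build_payload` and the example box coordinates are hypothetical, and the exact format CVAT uses for shapes and states is an assumption that should be checked against the real request in the browser dev tools.

```python
import base64
import json


def build_payload(image_bytes, shapes, states):
    """Build a JSON request body for a manual test of the tracker function.

    Assumption: the function expects a base64-encoded image under "image",
    plus "shapes" and "states" as seen in the request sent by CVAT.
    """
    return json.dumps({
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "shapes": shapes,
        "states": states,
    })


if __name__ == "__main__":
    # Placeholder bytes; a real test would read an actual frame from disk.
    fake_image = b"not a real image"
    # Hypothetical shape: one box given as a flat list of coordinates.
    body = build_payload(fake_image, shapes=[[10, 20, 110, 220]], states=[])
    print(body)
```

The printed JSON could then be pasted into the dashboard's "test" feature as the request body.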
Any update on this? The ability to track objects across frames is pretty critical for my video annotation workflow.
I have been reading the thread and testing it myself as well. With the new changes added to the PR, I am able to run the SiamMask CPU version with the command
serverless/deploy_cpu.sh serverless/pytorch/foolwood/siammask/
If I run the `serverless/deploy_gpu.sh` command, the script gets stuck without returning any error. This should be addressed, but I was unable to retrieve any output from the script itself. To still get the GPU version up and running, I adapted the command for deploying directly with nuctl from mask_rcnn to the SiamMask GPU function as follows:
nuctl deploy --project-name cvat \
--path serverless/pytorch/foolwood/siammask/nuclio/ \
--platform local --base-image nvidia/cuda:11.1-devel-ubuntu20.04 \
--desc "GPU based implementation of SIAM mask on Python 3, pytorch." \
--image cvat/pth.foolwood.siammask \
--triggers '{"myHttpTrigger": {"maxWorkers": 1}}' \
--resource-limit nvidia.com/gpu=1
This command does not get stuck and runs on nuctl version 1.5.16 for the GPU. However, when running the tracker I get the 500 error that other people have also been getting. When invoking the function with `nuctl invoke pth-foolwood-siammask`, I get the same error as @VRichardJP.
> Response headers:
Content-Length = 491
Server = nuclio
Date = Tue, 05 Apr 2022 11:26:39 GMT
Content-Type = text/plain
> Response body:
Exception caught in handler - "'NoneType' object is not subscriptable": Traceback (most recent call last):
File "/opt/nuclio/_nuclio_wrapper.py", line 114, in serve_requests
self._handle_event(event)
File "/opt/nuclio/_nuclio_wrapper.py", line 262, in _handle_event
entrypoint_output = self._entrypoint(self._context, event)
File "/opt/nuclio/main.py", line 19, in handler
buf = io.BytesIO(base64.b64decode(data["image"]))
TypeError: 'NoneType' object is not subscriptable
Has anybody had more success past this point? Or any pointers to where I should adapt the code to properly load the image needed in the `buf` variable? Or how I can debug or print the value of the `event` variable in the nuclio Docker container?
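On the debugging question, one possible approach is to log a short summary of the incoming event at the top of the handler. The sketch below is only an illustration: `describe_event` is a hypothetical helper, and the attribute names `body` and `content_type` are assumptions about the event object (they are read with `getattr`, so a missing attribute does not raise). The stand-in event built with `SimpleNamespace` only mimics an invocation without input.

```python
from types import SimpleNamespace


def describe_event(event):
    """Summarize an incoming event without dumping a large image payload."""
    body = getattr(event, "body", None)
    summary = {
        "content_type": getattr(event, "content_type", None),
        "body_type": type(body).__name__,
    }
    if isinstance(body, dict):
        # Only the key names are logged; the base64 image itself is skipped.
        summary["body_keys"] = sorted(body.keys())
    elif isinstance(body, (bytes, str)):
        summary["body_length"] = len(body)
    return summary


if __name__ == "__main__":
    # Stand-in for an invocation without a body, as with a bare `nuctl invoke`.
    empty_event = SimpleNamespace(body=None, content_type="text/plain")
    print(describe_event(empty_event))
```

Inside the handler, the summary could be written with the same logger that produces the "Run SiamMask model" line, for example `context.logger.info(str(describe_event(event)))`, so that it shows up between the "Start of function logs" and "End of function logs" markers.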
To invoke the function from the nuclio cli you're going to need to pass it some input. The example in the docs shows how to pass input and the browser dev tools can show you what the input is supposed to look like. Having said that, I gave up temporarily trying to get Siammask working on GPU myself. If you do please report back!
@ztyree42 I managed to get the GPU version working with the help of #3059 and some trial and error.
The main issue I faced was that the `serverless/deploy_gpu.sh` script seems to get stuck most of the time. After a couple of tries it is able to finish building, but I was only able to build it with the following command.
nuctl -v deploy --project-name cvat \
--path serverless/pytorch/foolwood/siammask/nuclio/ \
--platform local --base-image nvidia/cuda:11.1-devel-ubuntu20.04 \
--desc "GPU based implementation of siammask on Python 3, pytorch." \
--image cvat/pth.foolwood.siammask \
--triggers '{"myHttpTrigger": {"maxWorkers": 1}}' \
--resource-limit nvidia.com/gpu=1
Note that to run this command one would need to replace `function.yaml` with `function-gpu.yaml`.
Also note the `-v` flag, the verbose option, which helps keep track of what it is actually doing. I do not like this approach and would prefer to use the `serverless/deploy_gpu.sh` script. If somebody knows why this is happening, I would love to hear it.
In addition, issue #3059 describes that the Docker update changes something about the ports, and therefore with nuclio version 1.5.6 the `handler` function in `main.py` gets the wrong input (or something like that). This is the reason the error message states that the input is of type `NoneType`. Changing the nuclio version did work for me; however, it is currently not running smoothly. When tracking at the moment, my whole system is having a hard time, and the actual result is slower than when running on my CPU. I will check whether it is possible to allocate more GPU, memory or CPU capacity to the Docker container and whether that solves the lagging issue.
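To make this failure mode easier to spot, the handler could check its input before decoding the image. The following is a sketch under assumptions, not the actual `main.py`: `check_tracker_input` is a hypothetical helper, and only the `image` key is checked because it is the only one the traceback confirms is read unconditionally.

```python
def check_tracker_input(data):
    """Return an error message, or None when the body looks usable."""
    if data is None:
        # This is the case behind "'NoneType' object is not subscriptable".
        return "request body is empty; expected a JSON object"
    if not isinstance(data, dict):
        return "expected a JSON object, got " + type(data).__name__
    if "image" not in data:
        return "missing key: image"
    return None


if __name__ == "__main__":
    for candidate in (None, [], {"shapes": []}, {"image": "aGVsbG8="}):
        print(repr(candidate), "->", check_tracker_input(candidate))
```

With such a check, the function could return a clear client error instead of a 500 with a traceback. How the response is built (for example via `context.Response` with a 400 status code) is an assumption to verify against the nuclio version in use.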
@ztyree42, I tried but got the following error:
Reading package lists...
W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
E: The repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease' is not signed.
Closed as outdated
My actions before raising this issue
Expected Behaviour
The command `./serverless/deploy_cpu.sh ./serverless/pytorch/foolwood/siammask/` or `./serverless/deploy_gpu.sh ./serverless/pytorch/foolwood/siammask/` successfully deploys the SiamMask network.
Current Behaviour
The build fails because of this line: https://github.com/openvinotoolkit/cvat/blob/be334fdee95563b54c011290ffa6b4bbf9fd4296/serverless/pytorch/foolwood/siammask/nuclio/function.yaml#L44
The `requirements.txt` file requires `torch==0.4.1`, but the package does not exist anymore.
Full logs:
Possible Solution
Possible solutions:
- `./serverless/pytorch/foolwood/siammask/`. Example: https://github.com/VRichardJP/cvat
- Replace the `pip install -r SiamMask/requirements.txt` line in `function.yaml` and `function-gpu.yaml` with the list of required packages
Steps to Reproduce (for bugs)
Run `./serverless/deploy_cpu.sh ./serverless/pytorch/foolwood/siammask/`
Context
Your Environment
- `git log -1`: be334fdee95563b54c011290ffa6b4bbf9fd4296
- `docker version` (e.g. Docker 17.0.05): 20.10.9
Logs from `cvat` container