blakeblackshear / frigate

NVR with realtime local object detection for IP cameras
https://frigate.video
MIT License

[Support]: TRT Development environment/devcontainer fails to setup with "pull access denied" error #7745

Open kevin-david opened 1 year ago

kevin-david commented 1 year ago

Describe the problem you are having

When I open the devcontainer with target: devcontainer-trt, the build fails.

Using target: devcontainer works fine, but doesn't allow for TensorRT dev.

Not sure if this needs to be updated? https://github.com/blakeblackshear/frigate/blob/5658e5a4cc7376504af9de5e1eff178939a13e7f/docker-compose.yml#L15C1-L15C1

Version

dev as of this moment

docker-compose file

version: "3"
services:
  devcontainer:
    container_name: frigate-devcontainer
    privileged: true # this may not be necessary for all setups
    # add groups from host for render, plugdev, video
    group_add:
      - "109" # render
      - "110" # render
      - "44"  # video
      - "46"  # plugdev
      - "120"  # apex
      - "1004"  # apex
    shm_size: "768mb"
    build:
      context: .
      dockerfile: docker/main/Dockerfile
      # Use target devcontainer-trt for TensorRT dev
      target: devcontainer-trt
    deploy:
          resources:
              reservations:
                  devices:
                      - driver: nvidia
                        count: 1
                        capabilities: [gpu]
    environment:
      YOLO_MODELS: yolov7-320
      NVIDIA_VISIBLE_DEVICES: all
      NVIDIA_DRIVER_CAPABILITIES: compute,utility,video
    devices:
      - /dev/bus/usb:/dev/bus/usb
      # - /dev/dri:/dev/dri # for intel hwaccel, needs to be updated for your hardware
      - /dev/apex_0:/dev/apex_0
    volumes:
      - .:/workspace/frigate:cached
      - ./web/dist:/opt/frigate/web:cached
      - /etc/localtime:/etc/localtime:ro
      - ./config:/config
      - ./debug:/media/frigate
      # Create the trt-models folder using the documented method of generating TRT models
      - ./docker/trt-models:/trt-models
      - /dev/bus/usb:/dev/bus/usb
  mqtt:
    container_name: mqtt
    image: eclipse-mosquitto:1.6
    ports:
      - "1883:1883"

Relevant log output

[2023-09-09T21:51:52.514Z]  => [devcontainer] resolve image config for docker.io/docker/dockerfile:1  0.3s
 => CACHED [devcontainer] docker-image://docker.io/docker/dockerfile:1.4@  0.0s
[2023-09-09T21:51:52.638Z] [+] Building 0.6s (6/6)                                                         
 => [devcontainer internal] load build definition from Dockerfile-with-fe  0.0s
 => => transferring dockerfile: 12.29kB                                    0.0s
 => [devcontainer internal] load .dockerignore                             0.0s
[2023-09-09T21:51:52.639Z]  => => transferring context: 157B                                          0.0s
 => [devcontainer] resolve image config for docker.io/docker/dockerfile:1  0.3s
 => CACHED [devcontainer] docker-image://docker.io/docker/dockerfile:1.4@  0.0s
 => [devcontainer internal] load metadata for docker.io/library/dev_conta  0.0s
 => ERROR [devcontainer internal] load metadata for docker.io/library/dev  0.1s
[2023-09-09T21:51:52.695Z] [+] Building 0.6s (6/6) FINISHED                                                
 => [devcontainer internal] load build definition from Dockerfile-with-fe  0.0s
 => => transferring dockerfile: 12.29kB                                    0.0s
 => [devcontainer internal] load .dockerignore                             0.0s
 => => transferring context: 157B                                          0.0s
[2023-09-09T21:51:52.695Z]  => [devcontainer] resolve image config for docker.io/docker/dockerfile:1  0.3s
 => CACHED [devcontainer] docker-image://docker.io/docker/dockerfile:1.4@  0.0s
 => [devcontainer internal] load metadata for docker.io/library/dev_conta  0.0s
 => ERROR [devcontainer internal] load metadata for docker.io/library/dev  0.1s
------
 > [devcontainer internal] load metadata for docker.io/library/devcontainer-trt:latest:
------
failed to solve: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
[2023-09-09T21:51:52.700Z] Stop (770 ms): Run: docker compose --project-name frigate -f /home/kevin/dev/frigate/docker-compose.yml -f /tmp/devcontainercli-kevin/docker-compose/docker-compose.devcontainer.build-1694296311930.yml build
[2023-09-09T21:51:52.635Z] Error: Command failed: docker compose --project-name frigate -f /home/kevin/dev/frigate/docker-compose.yml -f /tmp/devcontainercli-kevin/docker-compose/docker-compose.devcontainer.build-1694296311930.yml build
[2023-09-09T21:51:52.635Z]     at Tw (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:427:409)
[2023-09-09T21:51:52.636Z]     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
[2023-09-09T21:51:52.636Z]     at async tAA (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:427:2378)
[2023-09-09T21:51:52.636Z]     at async eAA (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:409:3167)
[2023-09-09T21:51:52.636Z]     at async FAA (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:479:3833)
[2023-09-09T21:51:52.636Z]     at async GC (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:479:4775)
[2023-09-09T21:51:52.636Z]     at async VeA (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:611:12240)
[2023-09-09T21:51:52.636Z]     at async WeA (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:611:11981)
[2023-09-09T21:51:52.643Z] Stop (2604 ms): Run in Host: /home/kevin/.vscode-server/bin/8b617bd08fd9e3fc94d14adb8d358b56e3f72314/node /home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js up --container-session-data-folder /tmp/devcontainers-19b71912-5704-4da9-9e4b-0712e1337b211694296307737 --workspace-folder /home/kevin/dev/frigate --workspace-mount-consistency cached --id-label devcontainer.local_folder=/home/kevin/dev/frigate --id-label devcontainer.config_file=/home/kevin/dev/frigate/.devcontainer/devcontainer.json --log-level debug --log-format json --config /home/kevin/dev/frigate/.devcontainer/devcontainer.json --default-user-env-probe loginInteractiveShell --mount type=volume,source=vscode,target=/vscode,external=true --skip-post-create --update-remote-user-uid-default on --mount-workspace-git-root true
[2023-09-09T21:51:52.644Z] Exit code 1
[2023-09-09T21:51:52.648Z] Command failed: /home/kevin/.vscode-server/bin/8b617bd08fd9e3fc94d14adb8d358b56e3f72314/node /home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js up --container-session-data-folder /tmp/devcontainers-19b71912-5704-4da9-9e4b-0712e1337b211694296307737 --workspace-folder /home/kevin/dev/frigate --workspace-mount-consistency cached --id-label devcontainer.local_folder=/home/kevin/dev/frigate --id-label devcontainer.config_file=/home/kevin/dev/frigate/.devcontainer/devcontainer.json --log-level debug --log-format json --config /home/kevin/dev/frigate/.devcontainer/devcontainer.json --default-user-env-probe loginInteractiveShell --mount type=volume,source=vscode,target=/vscode,external=true --skip-post-create --update-remote-user-uid-default on --mount-workspace-git-root true
[2023-09-09T21:51:52.648Z] Exit code 1
NickM-27 commented 1 year ago

There have been some changes with the restructuring for the community supported boards framework. At this point, the dockerfile also needs to be updated to docker/tensorrt/Dockerfile.amd64
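
A quick way to confirm which Dockerfile actually defines a stage is to search for its `FROM ... AS <stage>` line. When a build targets a name that is not a stage in the Dockerfile, the name ends up being resolved as an image reference, which would explain the attempted pull of docker.io/library/devcontainer-trt in the log above. The helper below is only a sketch: `has_stage` is a hypothetical name, and it does a simple text match rather than parsing the Dockerfile.

```shell
#!/bin/sh
# has_stage <dockerfile> <stage>
# Succeeds if the Dockerfile contains a "FROM ... AS <stage>" line.
# This is a plain text match (case-insensitive), not a real Dockerfile parser.
has_stage() {
    dockerfile="$1"
    stage="$2"
    grep -Eiq "^[[:space:]]*FROM[[:space:]]+.*[[:space:]]+AS[[:space:]]+${stage}[[:space:]]*\$" "$dockerfile"
}

# Example (run from the repository root):
#   has_stage docker/main/Dockerfile devcontainer-trt || echo "stage not defined here"
#   has_stage docker/tensorrt/Dockerfile.amd64 devcontainer-trt && echo "stage found"
```

If the stage is only found in docker/tensorrt/Dockerfile.amd64, the dockerfile entry in docker-compose.yml has to point there for target: devcontainer-trt to resolve.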

kevin-david commented 1 year ago

Thanks. With the build section changed to:

    build:
      context: .
      dockerfile: docker/tensorrt/Dockerfile.amd64
      # Use target devcontainer-trt for TensorRT dev
      target: devcontainer-trt

I unfortunately still get an error:

[2023-09-09T22:29:01.669Z] Start: Run: docker inspect --type image devcontainer
[2023-09-09T22:29:01.685Z] Stop (16 ms): Run: docker inspect --type image devcontainer
[2023-09-09T22:29:01.781Z] Start: Run: docker-credential-secret get
[2023-09-09T22:29:01.784Z] Stop (3 ms): Run: docker-credential-secret get
[2023-09-09T22:29:01.787Z] Stop (6 ms): Run: docker-credential-secret get
[2023-09-09T22:29:02.131Z] Start: Run: docker-credential-secret get
[2023-09-09T22:29:02.135Z] Stop (4 ms): Run: docker-credential-secret get
[2023-09-09T22:29:02.137Z] Stop (6 ms): Run: docker-credential-secret get
[2023-09-09T22:29:02.298Z] Error fetching image details: No manifest found for docker.io/library/devcontainer.
[2023-09-09T22:29:02.298Z] Start: Run: docker pull devcontainer
[2023-09-09T22:29:02.311Z] Using default tag: latest
[2023-09-09T22:29:02.678Z] Error response from daemon: pull access denied for devcontainer, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
[2023-09-09T22:29:02.679Z] Stop (381 ms): Run: docker pull devcontainer
[2023-09-09T22:29:02.679Z] []
[2023-09-09T22:29:02.679Z] Error response from daemon: No such image: devcontainer:latest

[2023-09-09T22:29:02.602Z] Error: Command failed: docker inspect --type image devcontainer
[2023-09-09T22:29:02.602Z]     at eAA (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:409:3569)
[2023-09-09T22:29:02.602Z]     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
[2023-09-09T22:29:02.603Z]     at async FAA (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:479:3833)
[2023-09-09T22:29:02.603Z]     at async GC (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:479:4775)
[2023-09-09T22:29:02.603Z]     at async VeA (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:611:12240)
[2023-09-09T22:29:02.603Z]     at async WeA (/home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js:611:11981)
[2023-09-09T22:29:02.608Z] Stop (1747 ms): Run in Host: /home/kevin/.vscode-server/bin/8b617bd08fd9e3fc94d14adb8d358b56e3f72314/node /home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js up --container-session-data-folder /tmp/devcontainers-441c2909-2af6-4e7a-a328-6f15a869f13d1694298539224 --workspace-folder /home/kevin/dev/frigate --workspace-mount-consistency cached --id-label devcontainer.local_folder=/home/kevin/dev/frigate --id-label devcontainer.config_file=/home/kevin/dev/frigate/.devcontainer/devcontainer.json --log-level debug --log-format json --config /home/kevin/dev/frigate/.devcontainer/devcontainer.json --default-user-env-probe loginInteractiveShell --mount type=volume,source=vscode,target=/vscode,external=true --skip-post-create --update-remote-user-uid-default on --mount-workspace-git-root true
[2023-09-09T22:29:02.609Z] Exit code 1
[2023-09-09T22:29:02.613Z] Command failed: /home/kevin/.vscode-server/bin/8b617bd08fd9e3fc94d14adb8d358b56e3f72314/node /home/kevin/.vscode-remote-containers/dist/dev-containers-cli-0.309.0/dist/spec-node/devContainersSpecCLI.js up --container-session-data-folder /tmp/devcontainers-441c2909-2af6-4e7a-a328-6f15a869f13d1694298539224 --workspace-folder /home/kevin/dev/frigate --workspace-mount-consistency cached --id-label devcontainer.local_folder=/home/kevin/dev/frigate --id-label devcontainer.config_file=/home/kevin/dev/frigate/.devcontainer/devcontainer.json --log-level debug --log-format json --config /home/kevin/dev/frigate/.devcontainer/devcontainer.json --default-user-env-probe loginInteractiveShell --mount type=volume,source=vscode,target=/vscode,external=true --skip-post-create --update-remote-user-uid-default on --mount-workspace-git-root true
[2023-09-09T22:29:02.613Z] Exit code 1
NickM-27 commented 1 year ago

Right, it looks like things might need to be reworked a bit so that docker compose knows to use the bake file.

knoffelcut commented 12 months ago

I also experienced this issue.

With the restructuring, the tensorrt backend was moved to another dockerfile. I could not find any way to build the target images (frigate-tensorrt or devcontainer-trt) directly, neither with docker compose (specifying multiple dockerfiles) nor by specifying multiple dockerfiles in a docker build command.

I'm quite sure this is not a valid structure without using some external repository to keep intermediate images. (Edit: this is incorrect; see the response from NickM-27 below.)

As a stopgap solution, I built my target image by manually building and tagging the intermediate images at the end of each dockerfile. It can be used as a temporary fix until this is properly resolved.

docker build \
    -f docker/main/Dockerfile \
    --target deps \
    -t deps:latest \
    .

docker build \
    -f docker/main/Dockerfile \
    --target rootfs \
    -t rootfs:latest \
    .

docker build \
    -f docker/main/Dockerfile \
    --target wheels \
    -t wheels:latest \
    .

docker build \
    -f docker/tensorrt/Dockerfile.base \
    --target tensorrt-base \
    -t tensorrt-base:latest \
    .

docker build \
    -f docker/tensorrt/Dockerfile.amd64 \
    --target frigate-tensorrt \
    -t frigate-tensorrt:latest \
    .
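
The same sequence can be written as a loop that stops at the first failed build, which avoids continuing with a stale intermediate image. This is only a sketch of the commands above: `build_stages` is a hypothetical helper name, and the `DOCKER` variable is an addition so the commands can be previewed with `DOCKER=echo` before running them for real.

```shell
#!/bin/sh
# Set DOCKER=echo to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Build and tag each intermediate image in order.
# Each entry is "<dockerfile>:<target>"; the order matches the manual steps above,
# because later Dockerfiles refer to the images tagged by earlier builds.
build_stages() {
    for entry in \
        "docker/main/Dockerfile:deps" \
        "docker/main/Dockerfile:rootfs" \
        "docker/main/Dockerfile:wheels" \
        "docker/tensorrt/Dockerfile.base:tensorrt-base" \
        "docker/tensorrt/Dockerfile.amd64:frigate-tensorrt"
    do
        dockerfile="${entry%%:*}"   # part before the colon
        target="${entry##*:}"       # part after the colon
        "$DOCKER" build -f "$dockerfile" --target "$target" -t "${target}:latest" . || return 1
    done
}

# Usage (from the repository root):
#   build_stages
```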

In my docker compose file I then simply specify the tagged image:

<snip>
  frigate:
    container_name: frigate
    privileged: false # this may not be necessary for all setups
    restart: unless-stopped
    image: frigate-tensorrt:latest
    shm_size: "64mb" # update for your cameras based on calculation above
    devices:
<snip>
NickM-27 commented 12 months ago

> I'm quite sure this is not a valid structure without using some external repository to keep intermediate images.

That is incorrect, this is a valid structure using docker buildx bake

the docker compose just needs to be updated to use the bake file instead for the trt container

knoffelcut commented 12 months ago

>> I'm quite sure this is not a valid structure without using some external repository to keep intermediate images.
>
> That is incorrect, this is a valid structure using docker buildx bake
>
> the docker compose just needs to be updated to use the bake file instead for the trt container

Thanks, great. I failed to find that during my searches. Will post a build command using docker buildx bake when I get the time.

knoffelcut commented 12 months ago

The correct command to build seems to be:

docker buildx bake --file=docker/tensorrt/trt.hcl tensorrt

Using this command, the build gets close to the end but fails here:

#89 [tensorrt frigate-tensorrt 1/5] RUN python3 --version
#89 0.706 Python 3.8.10
#89 DONE 1.2s

#92 [tensorrt frigate-tensorrt 2/5] RUN --mount=type=bind,from=trt-wheels,source=/trt-wheels,target=/deps/trt-wheels     pip3 install -U /deps/trt-wheels/*.whl &&     ldconfig
#92 1.161 ERROR: Cython-0.29.36-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl is not a supported wheel on this platform.
#92 ERROR: process "/bin/sh -c pip3 install -U /deps/trt-wheels/*.whl &&     ldconfig" did not complete successfully: exit code: 1
------
 > [tensorrt frigate-tensorrt 2/5] RUN --mount=type=bind,from=trt-wheels,source=/trt-wheels,target=/deps/trt-wheels     pip3 install -U /deps/trt-wheels/*.whl &&     ldconfig:
1.161 ERROR: Cython-0.29.36-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl is not a supported wheel on this platform.
------
Dockerfile.amd64:18
--------------------
  17 |     RUN python3 --version
  18 | >>> RUN --mount=type=bind,from=trt-wheels,source=/trt-wheels,target=/deps/trt-wheels \
  19 | >>>     pip3 install -U /deps/trt-wheels/*.whl && \
  20 | >>>     ldconfig
  21 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c pip3 install -U /deps/trt-wheels/*.whl &&     ldconfig" did not complete successfully: exit code: 1

From my investigations it seems that the Python version in the tensorrt base image is Python 3.8, whereas the wheels were downloaded expecting Python 3.9. There was some work done to fix this for ARM64, but not for AMD64.
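
The mismatch is visible in the wheel filename itself: the third field from the right is the Python tag (cp39 here), and pip rejects the wheel when it does not fit the interpreter. A rough check along those lines is sketched below. The function names are hypothetical, and the comparison is deliberately simplified (it accepts an exact match or a pure-Python py3 tag, and ignores abi3 wheels).

```shell
#!/bin/sh
# Print the Python tag (e.g. cp39) from a wheel filename.
# Wheel names end in -{python tag}-{abi tag}-{platform tag}.whl
wheel_python_tag() {
    name="$(basename "$1" .whl)"
    rest="${name%-*}"            # drop the platform tag
    rest="${rest%-*}"            # drop the ABI tag
    printf '%s\n' "${rest##*-}"  # what remains after the last hyphen is the Python tag
}

# Print the tag of the python3 on PATH, e.g. cp38 for Python 3.8.
interpreter_tag() {
    python3 -c 'import sys; print("cp%d%d" % sys.version_info[:2])'
}

# wheel_matches <wheel file> <interpreter tag>
# Simplified: exact tag match, or a pure-Python wheel.
wheel_matches() {
    tag="$(wheel_python_tag "$1")"
    case "$tag" in
        "$2"|py3|py2.py3) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage inside the image:
#   for w in /deps/trt-wheels/*.whl; do
#       wheel_matches "$w" "$(interpreter_tag)" || echo "mismatch: $w"
#   done
```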

NickM-27 commented 12 months ago

> The correct command to build seems to be
>
> docker buildx bake --file=docker/tensorrt/trt.hcl tensorrt

We are trying to use the dev container here, so the command should be: docker buildx bake --file=docker/tensorrt/trt.hcl devcontainer-trt

https://github.com/blakeblackshear/frigate/blob/6aedc39a9a421cf48000a727f36b4c1495848a1d/docker/tensorrt/trt.hcl#L84
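
One possible way to make the baked image usable from docker compose is to give it a local tag and load it into the local image store, sketched below. This is untested against the repo: the tag frigate-devcontainer-trt:local and the helper name are made up for illustration, and the `DOCKER` variable is only there so the command can be previewed with `DOCKER=echo`. The `--set` and `--load` options are standard docker buildx bake flags.

```shell
#!/bin/sh
# Set DOCKER=echo to print the command instead of running it.
DOCKER="${DOCKER:-docker}"

# bake_devcontainer_trt [tag]
# Builds the devcontainer-trt bake target, overrides its tag, and loads the
# result into the local image store so compose can refer to it by name.
# The default tag is a hypothetical local name, not something the repo defines.
bake_devcontainer_trt() {
    tag="${1:-frigate-devcontainer-trt:local}"
    "$DOCKER" buildx bake \
        --file=docker/tensorrt/trt.hcl \
        --set "devcontainer-trt.tags=${tag}" \
        --load \
        devcontainer-trt
}

# Usage (from the repository root):
#   bake_devcontainer_trt
```

The devcontainer service in docker-compose.yml could then reference that tag with image: in place of its build: section. Whether the dev container tooling is happy with that arrangement is an assumption that would need checking.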

NickM-27 commented 12 months ago

There shouldn't be any issue building the tensorrt target; that is done automatically on every PR to dev.

https://github.com/blakeblackshear/frigate/blob/6aedc39a9a421cf48000a727f36b4c1495848a1d/.github/workflows/ci.yml#L40-L49

n1ght-hunter commented 4 months ago

I'm just trying to build it locally and am still running into this issue. Any chance we could fix it and close the issue?

Building is fine, but how do we set it up for local dev?