Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

Build the Docker Container from Xilinx Recipes #1157

Closed shoayi closed 1 year ago

shoayi commented 1 year ago

When I run ./docker_build.sh -t gpu -f opt_pytorch, I get the following error:

fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
docker build --network=host --build-arg TARGET_FRAMEWORK=pytorch --build-arg DOCKER_TYPE=gpu --build-arg XRT_URL=https://www.xilinx.com/bin/public/openDownload?filename=xrt_202220.2.14.354_20.04-amd64-xrt.deb --build-arg XRM_URL=https://www.xilinx.com/bin/public/openDownload?filename=xrm_202220.1.5.212_20.04-x86_64.deb --build-arg VAI_BASE=xiinx/vitis-ai-gpu-base --build-arg PETALINUX_URL= --build-arg VAI_CONDA_CHANNEL=https://www.xilinx.com/bin/public/openDownload?filename=conda-channel-3.0.tar.gz --build-arg VAI_WEGO_CONDA_CHANNEL=https://www.xilinx.com/bin/public/openDownload?filename=conda-channel-wego-3.0.tar.gz --build-arg VAI_DEB_CHANNEL=https://www.xilinx.com/bin/public/openDownload?filename=vairuntime-3.0.tar.gz --build-arg VERSION=3.0.0.001 --build-arg GIT_HASH= --build-arg CACHEBUST=1675731297 --build-arg BUILD_DATE=2023-02-07 -f dockerfiles/ubuntu-vai/vitis-ai-cpu.Dockerfile -t xilinx/vitis-ai-pytorch-gpu:3.0.0.001 ./
[+] Building 1.9s (11/12)
=> [internal] load .dockerignore 0.5s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from vitis-ai-cpu.Dockerfile 0.4s
=> => transferring dockerfile: 1.38kB 0.0s
=> [internal] load metadata for docker.io/xiinx/vitis-ai-gpu-base:latest 0.0s
=> [1/8] FROM docker.io/xiinx/vitis-ai-gpu-base 0.0s
=> [internal] load build context 0.1s
=> => transferring context: 12.07kB 0.0s
=> CACHED [2/8] WORKDIR /workspace 0.0s
=> CACHED [3/8] ADD ./common/ . 0.0s
=> CACHED [4/8] ADD ./conda /scratch 0.0s
=> CACHED [5/8] ADD conda/banner.sh /etc/ 0.0s
=> CACHED [6/8] ADD conda/gpu_conda/bashrc /etc/bash.bashrc 0.0s
=> ERROR [7/8] RUN if [[ -n "pytorch" ]]; then bash ./install_pytorch.sh; fi 1.3s


[7/8] RUN if [[ -n "pytorch" ]]; then bash ./install_pytorch.sh; fi:
#0 0.226 + sudo chmod 777 /scratch
#0 0.229 + [[ gpu == \c\p\u ]]
#0 0.229 + ./install_torch.sh
#0 0.229 + [[ https://www.xilinx.com/bin/public/openDownload?filename=conda-channel-3.0.tar.gz =~ .*tar.gz ]]
#0 0.229 + cd /scratch/
#0 0.229 + wget -O conda-channel.tar.gz --progress=dot:mega 'https://www.xilinx.com/bin/public/openDownload?filename=conda-channel-3.0.tar.gz'
#0 0.230 --2023-02-06 17:54:59-- https://www.xilinx.com/bin/public/openDownload?filename=conda-channel-3.0.tar.gz
#0 0.231 Resolving www.xilinx.com (www.xilinx.com)... 69.192.13.140
#0 0.241 Connecting to www.xilinx.com (www.xilinx.com)|69.192.13.140|:443... connected.
#0 0.298 HTTP request sent, awaiting response... 302 Moved Temporarily
#0 1.077 Location: https://xilinx-ax-dl.entitlenow.com/dl/ul/2023/01/12/R210778902/conda-channel-3.0.tar.gz?hash=wGqhgsdXcnBom3pJI-tFKA&expires=1675763699&filename=conda-channel-3.0.tar.gz [following]
#0 1.077 --2023-02-06 17:54:59-- https://xilinx-ax-dl.entitlenow.com/dl/ul/2023/01/12/R210778902/conda-channel-3.0.tar.gz?hash=wGqhgsdXcnBom3pJI-tFKA&expires=1675763699&filename=conda-channel-3.0.tar.gz
#0 1.077 Resolving xilinx-ax-dl.entitlenow.com (xilinx-ax-dl.entitlenow.com)... 23.77.214.56
#0 1.085 Connecting to xilinx-ax-dl.entitlenow.com (xilinx-ax-dl.entitlenow.com)|23.77.214.56|:443... connected.
#0 1.121 Unable to establish SSL connection.


vitis-ai-cpu.Dockerfile:32
30 | ADD conda/banner.sh /etc/
31 | ADD conda/${DOCKER_TYPE}_conda/bashrc /etc/bash.bashrc
32 | >>> RUN if [[ -n "${TARGET_FRAMEWORK}" ]]; then bash ./install_${TARGET_FRAMEWORK}.sh; fi
33 | USER root
34 | RUN mkdir -p ${VAI_ROOT}/conda/pkgs && chmod 777 ${VAI_ROOT}/conda/pkgs && ./install_vairuntime.sh && rm -fr ./*

ERROR: failed to solve: process "/bin/bash -c if [[ -n \"${TARGET_FRAMEWORK}\" ]]; then bash ./install_${TARGET_FRAMEWORK}.sh; fi" did not complete successfully: exit code: 4
Error response from daemon: No such image: xilinx/vitis-ai-pytorch-gpu:3.0.0.001

I hope you can help me. Thank you very much!

janifer112x commented 1 year ago

Hi @shoayi, I noticed that the docker build command could not get the git hash:
--build-arg GIT_HASH= --build-arg CACHEBUST=1675731297

Could you share how you downloaded the Vitis-AI GitHub code? It seems something went wrong during the source download, because a basic git command cannot proceed. I also searched for the error message; I hope this helps: https://blog.csdn.net/blueheart20/article/details/81117564
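One way to confirm whether a source tree is a real git checkout is sketched below. How docker_build.sh actually derives GIT_HASH is not shown in this thread, so the helper name git_hash_or_unknown and the "unknown" fallback are assumptions made for illustration only.

```shell
# Hypothetical helper (not part of the Vitis-AI scripts): print the short
# commit hash of a directory, or "unknown" when it is not a git checkout
# (for example, a source tree downloaded as an archive).
git_hash_or_unknown() {
    dir="${1:-.}"
    if hash_value=$(git -C "$dir" rev-parse --verify --short HEAD 2>/dev/null); then
        printf '%s\n' "$hash_value"
    else
        printf 'unknown\n'
    fi
}

# Usage from the repository root (not run here):
#   git_hash_or_unknown .
```

If this prints "unknown" in the Vitis-AI directory, the tree has no git metadata, which would be consistent with the empty GIT_HASH build argument.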

Another error in the log suggests that the build cannot download this link. Could you also try this command from your Docker host?
wget -O conda-channel.tar.gz --progress=dot:mega 'https://www.xilinx.com/bin/public/openDownload?filename=conda-channel-3.0.tar.gz'
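The "exit code: 4" at the end of the build log is consistent with the wget step failing, since the GNU wget manual lists 4 as "network failure". The lookup below is a small sketch of those documented statuses; the function name wget_exit_meaning is hypothetical.

```shell
# Map a GNU wget exit status to the meaning listed in the wget manual.
# The function name is illustrative, not part of any Vitis-AI script.
wget_exit_meaning() {
    case "$1" in
        0) echo "no problems occurred" ;;
        1) echo "generic error" ;;
        2) echo "parse error" ;;
        3) echo "file I/O error" ;;
        4) echo "network failure" ;;
        5) echo "SSL verification failure" ;;
        6) echo "username/password authentication failure" ;;
        7) echo "protocol error" ;;
        8) echo "server issued an error response" ;;
        *) echo "unknown exit status" ;;
    esac
}

# Usage on the Docker host (not run here):
#   wget -O conda-channel.tar.gz 'https://www.xilinx.com/bin/public/openDownload?filename=conda-channel-3.0.tar.gz'
#   wget_exit_meaning "$?"
```

Running the same download on the host and comparing the status with the one seen inside the build helps separate a host network problem from a container-only problem.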

shoayi commented 1 year ago

Hi @janifer112x, I downloaded it manually from GitHub. Have you successfully installed the GPU Docker image of Vitis AI 3.0?

janifer112x commented 1 year ago

Hi @shoayi, yes, I can build the Vitis AI GPU Docker image by following the instructions.

shoayi commented 1 year ago

Hi @janifer112x, excuse me, are you in China? When I run ./docker_build.sh -t gpu -f opt_pytorch, I get the following error:

#0 3.201 Connecting to xilinx-ax-dl.entitlenow.com (xilinx-ax-dl.entitlenow.com)|23.39.0.57|:443... connected.
#0 3.415 Unable to establish SSL connection.

Do you use a network proxy?

janifer112x commented 1 year ago

Hi @shoayi, I ran the command from the USA and no problem was reported. Do you have a problem downloading the package from this link directly on your computer?
wget -O conda-channel.tar.gz --progress=dot:mega 'https://www.xilinx.com/bin/public/openDownload?filename=conda-channel-3.0.tar.gz'

shoayi commented 1 year ago

Hi @janifer112x, I executed the command and got the following results:

--2023-02-08 11:39:59-- https://www.xilinx.com/bin/public/openDownload?filename=conda-channel-3.0.tar.gz
Resolving www.xilinx.com (www.xilinx.com)... 23.2.130.223
Connecting to www.xilinx.com (www.xilinx.com)|23.2.130.223|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://xilinx-ax-dl.entitlenow.com/dl/ul/2023/01/12/R210778902/conda-channel-3.0.tar.gz?hash=s6JN6h75vdgF5G6K4YwGgA&expires=1675860002&filename=conda-channel-3.0.tar.gz [following]
--2023-02-08 11:40:02-- https://xilinx-ax-dl.entitlenow.com/dl/ul/2023/01/12/R210778902/conda-channel-3.0.tar.gz?hash=s6JN6h75vdgF5G6K4YwGgA&expires=1675860002&filename=conda-channel-3.0.tar.gz
Resolving xilinx-ax-dl.entitlenow.com (xilinx-ax-dl.entitlenow.com)... 23.77.214.56
Connecting to xilinx-ax-dl.entitlenow.com (xilinx-ax-dl.entitlenow.com)|23.77.214.56|:443... connected.
Unable to establish SSL connection.

janifer112x commented 1 year ago

Hi @shoayi, I tried with my laptop on a Chinese network, and wget can download the package there as well. Could you double-check with your local company IT staff?

--2023-02-08 12:31:01-- https://xilinx-ax-dl.entitlenow.com/dl/ul/2023/01/12/R210778902/conda-channel-3.0.tar.gz?hash=sKBT2V8Vd40MyYKN3DH6gw&expires=1675863061&filename=conda-channel-3.0.tar.gz
Resolving xilinx-ax-dl.entitlenow.com (xilinx-ax-dl.entitlenow.com)... 23.77.214.56
Connecting to xilinx-ax-dl.entitlenow.com (xilinx-ax-dl.entitlenow.com)|23.77.214.56|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 674834728 (644M) [application/x-gzip]
Saving to: ‘conda-channel.tar.gz’

shoayi commented 1 year ago

Hi @janifer112x, thank you very much for your help. I solved the network problem and ran ./docker_build.sh -t gpu -f opt_pytorch again, but this time another error occurred:

#0 1878.3 + export VAI_CONDA_CHANNEL=file:///scratch/conda-channel
#0 1878.3 + VAI_CONDA_CHANNEL=file:///scratch/conda-channel
#0 1878.3 + VAI_ROOT=/opt/vitis_ai
#0 1878.3 + mkdir -p /opt/vitis_ai/
#0 1878.3 + [[ gpu == \r\o\c\m ]]
#0 1878.3 + command_opt=' mamba env create -v -f /scratch/gpu_conda/vitis-ai-optimizer_pytorch.yml '
#0 1878.3 + . /opt/vitis_ai/conda/etc/profile.d/conda.sh
#0 1878.3 ./install_opt_pytorch.sh: line 37: /opt/vitis_ai/conda/etc/profile.d/conda.sh: No such file or directory
#0 1878.3 + true
#0 1878.3 + conda config --env --append channels file:///scratch/conda-channel
#0 1878.3 ./install_opt_pytorch.sh: line 42: conda: command not found
#0 1878.3 + true
#0 1878.3 + mamba env create -v -f /scratch/gpu_conda/vitis-ai-optimizer_pytorch.yml
#0 1878.3 ./install_opt_pytorch.sh: line 45: mamba: command not found
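
In the log above, sourcing conda.sh fails and conda is not found, but each failure is followed by "+ true", so the script keeps going until mamba is missing as well. A fail-fast check would surface the missing tools at the first step. The following is a minimal sketch with a hypothetical helper named require_cmds; it is not taken from install_opt_pytorch.sh.

```shell
# Hypothetical pre-flight check: report every required command that is not
# on PATH and return non-zero if any is missing.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf 'missing command: %s\n' "$cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage inside the container before creating the environment (not run here):
#   require_cmds conda mamba || exit 1
```

If conda and mamba are both reported missing, the base image most likely lacks the conda installation under /opt/vitis_ai/conda, so the base image is worth inspecting before the framework step.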

HelderJustino commented 1 year ago

Hi guys, I am also having some issues with the recipes. They seem to have stopped working for me in the last week, and I can't build the Docker image in any form. Below is the error that I got:

#11 430.6 + conda config --env --add channels file:///scratch/conda-channel-wego/wegotf1
#11 430.6 + local cmd=config
#11 430.6 + case "$cmd" in
#11 430.6 + __conda_exe config --env --add channels file:///scratch/conda-channel-wego/wegotf1
#11 430.6 + __add_sys_prefix_to_path
#11 430.6 + '[' -n '' ']'
#11 430.6 ++ dirname /opt/vitis_ai/conda/bin/conda
#11 430.6 + SYSP=/opt/vitis_ai/conda/bin
#11 430.6 ++ dirname /opt/vitis_ai/conda/bin
#11 430.6 + SYSP=/opt/vitis_ai/conda
#11 430.6 + '[' -n '' ']'
#11 430.6 + PATH=/opt/vitis_ai/conda/bin:/opt/vitis_ai/conda/condabin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
#11 430.6 + export PATH
#11 430.6 + /opt/vitis_ai/conda/bin/conda config --env --add channels file:///scratch/conda-channel-wego/wegotf1
#11 430.8 + conda config --remove channels defaults
#11 430.8 + local cmd=config
#11 430.8 + case "$cmd" in
#11 430.8 + __conda_exe config --remove channels defaults
#11 430.8 + __add_sys_prefix_to_path
#11 430.8 + '[' -n '' ']'
#11 430.8 ++ dirname /opt/vitis_ai/conda/bin/conda
#11 430.8 + SYSP=/opt/vitis_ai/conda/bin
#11 430.8 ++ dirname /opt/vitis_ai/conda/bin
#11 430.8 + SYSP=/opt/vitis_ai/conda
#11 430.8 + '[' -n '' ']'
#11 430.8 + PATH=/opt/vitis_ai/conda/bin:/opt/vitis_ai/conda/condabin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
#11 430.8 + export PATH
#11 430.8 + /opt/vitis_ai/conda/bin/conda config --remove channels defaults
#11 431.0 + cat /home/vitis-ai-user/.condarc
#11 431.0 channels:
#11 431.0   - file:///scratch/conda-channel-wego/wegotf1
#11 431.0 + mamba env create -v -f /scratch/gpu_conda/vitis-ai-tensorflow.yml
#11 431.3 Traceback (most recent call last):
#11 431.3   File "/opt/vitis_ai/conda/condabin/mamba", line 7, in <module>
#11 431.3     from mamba.mamba import main
#11 431.3   File "/opt/vitis_ai/conda/lib/python3.9/site-packages/mamba/mamba.py", line 51, in <module>
#11 431.3     from mamba import repoquery as repoquery_api
#11 431.3   File "/opt/vitis_ai/conda/lib/python3.9/site-packages/mamba/repoquery.py", line 9, in <module>
#11 431.3     from mamba.utils import init_api_context, load_channels
#11 431.3   File "/opt/vitis_ai/conda/lib/python3.9/site-packages/mamba/utils.py", line 17, in <module>
#11 431.3     from conda.core.index import _supplement_index_with_system, check_allowlist
#11 431.3 ImportError: cannot import name 'check_allowlist' from 'conda.core.index' (/opt/vitis_ai/conda/lib/python3.9/site-packages/conda/core/index.py)

executor failed running [/bin/bash -c if [[ -n "${TARGET_FRAMEWORK}" ]]; then bash ./install_${TARGET_FRAMEWORK}.sh; fi]: exit code: 1
Error response from daemon: No such image: xilinx/vitis-ai-tensorflow-gpu:3.0.0.001

shoayi commented 1 year ago

Hi @HelderJustino, sorry, I did not encounter this error, but you can try the following command again:
./docker_build.sh -t gpu -f opt_pytorch

HelderJustino commented 1 year ago

Hi @shoayi, I get the error when I try any of the following commands:
./docker_build.sh -t cpu -f tf1
./docker_build.sh -t gpu -f tf1
./docker_build.sh -t gpu -f tf2

This command works:
./docker_build.sh -t gpu -f pytorch

I am trying to work with a TensorFlow workflow.

lecramdev commented 1 year ago

Hi, I have the same problem as @HelderJustino. I am on the v3.0 tag and ran "./docker_build.sh -t gpu -f opt_tf2". The same happens with tf2 instead of opt_tf2:

#0 51.87 + mamba env create -f /scratch/gpu_conda/vitis-ai-optimizer_tensorflow2.yml
#0 51.99 Traceback (most recent call last):
#0 51.99   File "/opt/vitis_ai/conda/condabin/mamba", line 7, in <module>
#0 51.99     from mamba.mamba import main
#0 51.99   File "/opt/vitis_ai/conda/lib/python3.9/site-packages/mamba/mamba.py", line 51, in <module>
#0 51.99     from mamba import repoquery as repoquery_api
#0 51.99   File "/opt/vitis_ai/conda/lib/python3.9/site-packages/mamba/repoquery.py", line 9, in <module>
#0 51.99     from mamba.utils import init_api_context, load_channels
#0 51.99   File "/opt/vitis_ai/conda/lib/python3.9/site-packages/mamba/utils.py", line 17, in <module>
#0 51.99     from conda.core.index import _supplement_index_with_system, check_allowlist
#0 51.99 ImportError: cannot import name 'check_allowlist' from 'conda.core.index' (/opt/vitis_ai/conda/lib/python3.9/site-packages/conda/core/index.py)

janifer112x commented 1 year ago

Hi @lecramdev, @HelderJustino, I've pushed the fix to the master branch. Could you please try again and let me know whether it fixes the problem?
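
The thread does not say what the fix changed. The ImportError itself shows that the installed mamba imports check_allowlist from conda.core.index and that the installed conda does not provide it, which suggests (as an assumption) a mamba/conda version mismatch in the image. A version comparison like the sketch below can help when checking such a mismatch. The helper name version_ge is hypothetical, the minimum conda version has to be looked up for the mamba release in use, and sort -V requires a sort that supports version ordering (for example GNU coreutils).

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to
# version $2, using version-aware sorting.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage inside the container (not run here). MIN_CONDA is a placeholder for
# the minimum conda version required by the installed mamba:
#   conda_version=$(conda --version | awk '{print $2}')
#   version_ge "$conda_version" "$MIN_CONDA" || echo "conda is older than mamba expects"
```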

MohamedRaizz commented 1 year ago

I'm setting up Docker for Vitis AI 3.0 (master). The Docker images for Vitis AI 2.0 and 3.0 are quite different. Can you please tell me whether the following is correct?

user@user~/Vitis-AI$ ./docker_run.sh xilinx/vitis-ai-tensorflow-cpu:3.0.0.001
Error response from daemon: manifest for xilinx/vitis-ai-tensorflow-cpu:3.0.0.001 not found: manifest unknown: manifest unknown
Setting up iwave 's environment in the Docker container...
usermod: no changes
Running as vitis-ai-user with ID 0 and group 0

==========================================
[ASCII-art banner]
==========================================

Docker Image Version: 3.0.0.001 (CPU)
Vitis AI Git Hash: 8ea8e972d
Build Date: 2023-02-20
WorkFlow: tf1

vitis-ai-user@user:/workspace$

HelderJustino commented 1 year ago

This seems to fix the problem for me.

janifer112x commented 1 year ago

It is correct, @MohamedRaizz.

MohamedRaizz commented 1 year ago

@janifer112x @HelderJustino @shoayi I tried to run an AI application and also got an error. I checked the /etc/vart.conf file and the DPU path is correct. I hope you can help me.

root@Petalinux:/run/media/sda1/facedetect# ./test_video_facedetect cf_densebox_wider_320_320_0.49G_2.0 test_45.mp4 -t 1
XRT build version: 2.14.0
Build hash: 43926231f7183688add2dccfd391b36a1f000bea
[ 1358.108242] audit: type=1701 audit(1637343717.748:51): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=1544 comm="test_video_face" exe="/run/media/sd1
Build date: 2022-10-07 05:12:02
Git branch: 2022.2
PID: 1544
UID: 0
[Fri Nov 19 17:41:57 2021 GMT]
HOST:
EXE: /run/media/sda1/facedetect21.1/test_video_facedetect
[XRT] ERROR: DPUCZDX8G:DPUCZDX8G_1 not found
[XRT] ERROR: xclRegRW: invalid CU index: -2
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1119 17:41:57.751024 1544 xrt_device_handle_imp.cpp:96] Check failed: read_result == 0 (-22 vs. 0) xclRead has error!
Check failure stack trace:
Aborted

root@Petalinux:/run/media/sda1/facedetect# xdputil query
XRT build version: 2.14.0
Build hash: 43926231f7183688add2dccfd391b36a1f000bea
[ 1430.106976] audit: type=1701 audit(1637343789.748:52): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=1551 comm="python3" exe="/usr/bin/python3.9" s1
Build date: 2022-10-07 05:12:02
Git branch: 2022.2
PID: 1551
UID: 0
[Fri Nov 19 17:43:09 2021 GMT]
HOST:
EXE: /usr/bin/python3.9
[XRT] ERROR: DPUCZDX8G:DPUCZDX8G_1 not found
[XRT] ERROR: xclRegRW: invalid CU index: -2
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1119 17:43:09.749843 1551 xrt_device_handle_imp.cpp:96] Check failed: read_result == 0 (-22 vs. 0) xclRead has error!
Check failure stack trace:
/usr/bin/xdputil: line 20: 1551 Aborted /usr/bin/python3 -m xdputil $*

janifer112x commented 1 year ago

Hi @MohamedRaizz, from the log it seems you are trying to run on an edge board, which is not related to the Vitis AI Docker build process. Could you please file a separate issue so the teams can help you with that topic? If there is no objection, I will close this issue.

marambles commented 1 year ago

Error response from daemon: No such image: xilinx/vitis-ai-opt-pytorch-gpu:3.0.0.001

Does anyone know the solution to this problem? Thank you very much.

janifer112x commented 1 year ago

Hi @marambles, for the GPU Docker image, you need to build it from the Dockerfile. Please refer to this link: https://xilinx.github.io/Vitis-AI/docs/install/install.html#option-2-build-the-docker-container-from-xilinx-recipes
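
The "No such image" message means Docker found no local image with that tag, which is expected until the build from the recipes has completed successfully. A quick local check can be sketched as follows; the helper name image_exists is hypothetical, and the tag in the usage comment is the one from the error message above.

```shell
# Hypothetical helper: succeeds when the given image tag exists locally.
image_exists() {
    docker image inspect "$1" >/dev/null 2>&1
}

# Usage on the Docker host (not run here):
#   if ! image_exists xilinx/vitis-ai-opt-pytorch-gpu:3.0.0.001; then
#       echo "image not found locally; check the ./docker_build.sh output for the first error"
#   fi
```

When the image is missing after a build attempt, the first error earlier in the build log is usually the one to investigate, because the "No such image" line is only a consequence of the failed build.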

marambles commented 1 year ago

Hi @janifer112x, thank you very much for your help.

marambles commented 1 year ago

Hi @janifer112x, I followed this tutorial to install, but the error above is still not resolved.

AdrFebles commented 1 year ago

Hi @janifer112x, I have the same error with the v3.0 Vitis AI branch.

The command '/bin/bash -c if [[ -n "${TARGET_FRAMEWORK}" ]]; then bash ./install_${TARGET_FRAMEWORK}.sh; fi' returned a non-zero code: 1
Error response from daemon: No such image: xilinx/vitis-ai-pytorch-gpu:3.0.0.001

Can you help me solve it, please?