Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0
1.47k stars 630 forks source link

No CUDA runtime is found Vitis AI Docker 1.3.1 #351

Closed DaniilNelubin closed 1 year ago

DaniilNelubin commented 3 years ago

Hello everyone I have been using Vitis-AI 1.3.1 docker with GPU. I'm trying to quantize my own neural network And I got the following warning and error. How can I solve this?

No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'

[NNDCT_NOTE]: Loading NNDCT kernels...

Traceback (most recent call last):
  File "tool/model_quantazition.py", line 124, in <module>
    torch.cuda.set_device(args.local_rank)
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/cuda/__init__.py", line 292, in set_device
    torch._C._cuda_setDevice(device)
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/cuda/__init__.py", line 196, in _lazy_init
    _check_driver()
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/cuda/__init__.py", line 101, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError: 
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
Traceback (most recent call last):
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/vitis_ai/conda/envs/vitis-ai-pytorch/bin/python', '-u', 'tool/model_quantazition.py', '--local_rank=0', '--config=config/test/default_config_360_ss.yaml', '--fast_finetune', '--quant_mode=calib', '--deploy', 'dataset', 'voc2012', 'model_path', '/opt/project/init_360_torch12.pth']' returned non-zero exit status 1.
Eagle223 commented 3 years ago

You may need to run https://github.com/Xilinx/Vitis-AI/blob/master/setup/docker/docker_build_gpu.sh to build the Vitis-AI-gpu docker image

DaniilNelubin commented 3 years ago

Yeah, I built this .sh file and add some changes in DockerfileGPU. I added COPY command and pip install to copy and install mseg library. Also, I changed chmod on 777 in /opt/vitis_ai/conda/envs/vitis-ai-pytorch/ and set 777 to ALL chmod (I tried to fix permission denied error in pip install)

UPD I have next print for torch.cuda.is_avaliable() (Of course I used vitis-ai-pytorch conda env):

>>> import torch
>>> torch.cuda.is_available()
False

DockerfileGPU

FROM nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04
env DEBIAN_FRONTEND=noninteractive
SHELL ["/bin/bash", "-c"]
ENV TZ=America/Denver
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
ENV VAI_ROOT=/opt/vitis_ai
ENV VAI_HOME=/vitis_ai_home
ARG VERSION
ENV VERSION=$VERSION
ARG DATE
ENV DATE=$DATE
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

RUN chmod 777 /tmp \
    && mkdir /scratch \
    && chmod 777 /scratch \
    && apt-get update -y \
    && apt-get install -y --no-install-recommends \
        apt-transport-https \
        autoconf \
        automake \
        bc \
        build-essential \
        bzip2 \
        ca-certificates \
        curl \
        g++ \
        git \
        gnupg \
        libboost-all-dev \
        libgflags-dev \
        libgoogle-glog-dev \
        libgtest-dev \
        libjson-c-dev \
        libssl-dev \
        libtool \
        libunwind-dev \
        locales \
        make \
        openssh-client \
        openssl \
        python3 \
        python3-dev \
        python3-minimal \
        python3-numpy \
        python3-pip \
        python3-setuptools \
        python3-venv \
        software-properties-common \
        sudo \
        tree \
        unzip \
        vim \
        wget \
        yasm \
        zstd

RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen \
    && echo "LC_ALL=en_US.UTF-8" >> /etc/environment \
    && echo "LANG=en_US.UTF-8" > /etc/locale.conf \
    && locale-gen en_US.UTF-8 \
    && localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8 \
    && dpkg-reconfigure locales

# Tools for building vitis-ai-library in the docker container
RUN apt-get -y install \
        libgtest-dev \
        libeigen3-dev \
        rpm \
        libavcodec-dev \
        libavformat-dev \
        libswscale-dev \
        libgstreamer-plugins-base1.0-dev \
        libgstreamer1.0-dev \
        libgtk-3-dev \
        libpng-dev \
        libjpeg-dev \
        libopenexr-dev \
        libtiff-dev \
        libwebp-dev \
        libgtk2.0-dev \
        opencl-clhpp-headers \
        opencl-headers \
        pocl-opencl-icd \
    && add-apt-repository -y ppa:ubuntu-toolchain-r/test \
    && apt install -y gcc-8 g++-8 gcc-9 g++-9 \
    && wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null \
    && apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main' \
    && apt-get update -y \
    && apt-get install -y \
        cmake=3.16.0-0kitware1 \
        cmake-data=3.16.0-0kitware1 \
        kitware-archive-keyring \
    && apt-get install -y ffmpeg \
    && cd /usr/src/gtest \
    && mkdir -p build \
    && cd build \
    && cmake .. \
    && make \
    && make install

RUN pip3 install \
        Flask \
        setuptools \
        wheel

# Install XRT
RUN wget --progress=dot:mega -O xrt.deb https://www.xilinx.com/bin/public/openDownload?filename=xrt_202020.2.8.726_18.04-amd64-xrt.deb \
    && ls -lhd ./xrt.deb \
    && apt-get update -y \
    && apt-get install -y ./xrt.deb \
    && rm -fr /tmp/*

# glog 0.4.0
RUN cd /tmp \
    && wget --progress=dot:mega -O glog.0.4.0.tar.gz https://codeload.github.com/google/glog/tar.gz/v0.4.0 \
    && tar -xvf glog.0.4.0.tar.gz \
    && cd glog-0.4.0 \
    && ./autogen.sh \
    && mkdir build \
    && cd build \
    && cmake -DBUILD_SHARED_LIBS=ON .. \
    && make -j \
    && make install \
    && rm -fr /tmp/*

# protobuf 3.4.0
RUN cd /tmp; wget --progress=dot:mega https://codeload.github.com/google/protobuf/zip/v3.4.0 \
    && unzip v3.4.0 \
    && cd protobuf-3.4.0 \
    && ./autogen.sh \
    && ./configure \
    && make -j \
    && make install \
    && ldconfig \
    && rm -fr /tmp/*

# opencv 3.4.3
RUN export PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" \
    && cd /tmp; wget --progress=dot:mega https://github.com/opencv/opencv/archive/3.4.3.tar.gz \
    && tar -xvf 3.4.3.tar.gz \
    && cd opencv-3.4.3 \
    && mkdir build \
    && cd build \
    && cmake -DBUILD_SHARED_LIBS=ON .. \
    && make -j \
    && make install \
    && ldconfig \
    && export PATH="${VAI_ROOT}/conda/bin:${VAI_ROOT}/utility:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" \
    && rm -fr /tmp/*

# gflags 2.2.2
RUN cd /tmp; wget --progress=dot:mega https://github.com/gflags/gflags/archive/v2.2.2.tar.gz \
    && tar xvf v2.2.2.tar.gz \
    && cd gflags-2.2.2 \
    && mkdir build \
    && cd build \
    && cmake -DBUILD_SHARED_LIBS=ON .. \
    && make -j \
    && make install \
    && rm -fr /tmp/*

# pybind 2.5.0
RUN cd /tmp; git clone https://github.com/pybind/pybind11.git \
    && cd pybind11 \
    && git checkout v2.5.0 \
    && mkdir build \
    && cd build \
    && cmake -DPYBIND11_TEST=OFF .. \
    && make \
    && make install \
    && rm -fr /tmp/* \
    && chmod 777 /usr/lib/python3/dist-packages

RUN cd /tmp; wget --progress=dot:mega http://launchpadlibrarian.net/436533799/libjson-c4_0.13.1+dfsg-4_amd64.deb \
    && dpkg -i ./libjson-c4_0.13.1+dfsg-4_amd64.deb \
    && rm -fr /tmp/*

# Uncomment the lines below to add Petalinux SDK into Docker
# RUN cd /tmp \
#     && wget --progress=dot:mega -O sdk.sh https://www.xilinx.com/bin/public/openDownload?filename=sdk-2020.2.0.0.sh \
#     && chmod 777 /tmp/sdk.sh \
#     && /tmp/sdk.sh -y \
#     && rm -fr /tmp/* \
#     && chmod 777 /opt/petalinux/2020.2/sysroots/aarch64-xilinx-linux/

RUN echo $VERSION > /etc/VERSION.txt

ENV GOSU_VERSION 1.12

COPY docker/bashrc /etc/bash.bashrc
RUN chmod 777 /etc/bash.bashrc
RUN groupadd vitis-ai-group \
    && useradd --shell /bin/bash -c '' -m -g vitis-ai-group vitis-ai-user \
    && passwd -d vitis-ai-user \
    && usermod -aG sudo vitis-ai-user \
    && echo 'ALL ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers \
    && curl -sSkLo /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$(dpkg --print-architecture)" \
    && chmod 777 /usr/local/bin/gosu \
    && echo ". $VAI_ROOT/conda/etc/profile.d/conda.sh" >> ~vitis-ai-user/.bashrc \
    && echo "conda activate base" >> ~vitis-ai-user/.bashrc \
    && echo "export VERSION=${VERSION}" >> ~vitis-ai-user/.bashrc \
    && echo "export BUILD_DATE=\"${DATE}\"" >> ~vitis-ai-user/.bashrc \
    && cat ~vitis-ai-user/.bashrc >> /root/.bashrc \
    && echo $VERSION > /etc/VERSION.txt \
    && echo $DATE > /etc/BUILD_DATE.txt \
    && echo 'export PS1="\[\e[91m\]Vitis-AI\[\e[m\] \w > "' >> ~vitis-ai-user/.bashrc \
    && mkdir -p ${VAI_ROOT} \
    && chown -R vitis-ai-user:vitis-ai-group ${VAI_ROOT}

# Set up Anaconda
USER vitis-ai-user
ENV MY_CONDA_CHANNEL="file:///scratch/conda-channel"

RUN cd /tmp \
    && wget --progress=dot:mega https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh \
    && /bin/bash ./miniconda.sh -b -p $VAI_ROOT/conda \
    && sudo ln -s $VAI_ROOT/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh

ADD --chown=vitis-ai-user:vitis-ai-group docker/gpu_conda/*.yml /scratch/
ADD --chown=vitis-ai-user:vitis-ai-group docker/pip_requirements.txt /scratch/
ADD --chown=vitis-ai-user:vitis-ai-group docker/pip_requirements_neptune.txt /scratch/

# Rebuild this layer every time
ARG CACHEBUST=1

# My comands
COPY docker/mseg-api/ /etc/mseg/mseg-api
COPY docker/mseg-semantic/ /etc/mseg/mseg-semantic

RUN cd /scratch/ \
    && wget -O conda-channel.tar.gz --progress=dot:mega https://www.xilinx.com/bin/public/openDownload?filename=conda-channel_1.3.598-01.tar.gz \
    && tar -xzvf conda-channel.tar.gz \
    && . $VAI_ROOT/conda/etc/profile.d/conda.sh \
    && conda install conda-build \
    && python3 -m pip install --upgrade pip wheel setuptools \
    && conda env create -f /scratch/vitis-ai-optimizer_darknet.yml \
    && conda env create -f /scratch/vitis-ai-optimizer_caffe.yml \
    && conda env create -f /scratch/vitis-ai-optimizer_tensorflow.yml \
    && conda env create -f /scratch/vitis-ai-lstm.yml \
        && conda activate vitis-ai-lstm \
        && pip install -r /scratch/pip_requirements.txt \
    && conda env create -f /scratch/vitis-ai-pytorch.yml \
        && conda activate vitis-ai-pytorch \
        && pip install -r /scratch/pip_requirements.txt \
        && sudo chown -R nobody:nogroup /etc/mseg/ \
        && sudo chmod -R 777 /etc/mseg/ \
        && sudo chmod -R 777 /opt/vitis_ai/conda/envs/vitis-ai-pytorch/ \
        && cd /home/vitis-ai-user/ \
        && sudo mkdir -p mseg_test/data \
        && cd /etc/mseg/mseg-api && pip install -e . \
        && cd /etc/mseg/mseg-semantic && pip install -e . \
        && cd /scratch/ \
    && conda env create -f /scratch/vitis-ai-caffe.yml \
        && conda activate vitis-ai-caffe \
        && pip install -r /scratch/pip_requirements.txt \
    && conda env create -f /scratch/vitis-ai-tensorflow.yml \
        && conda activate vitis-ai-tensorflow \
        && pip install -r /scratch/pip_requirements.txt \
    && conda env create -f /scratch/vitis-ai-tensorflow2.yml \
        && conda activate vitis-ai-tensorflow2 \
        && pip install -r /scratch/pip_requirements.txt \
        && pip install --ignore-installed tensorflow==2.3.0 \
    && mkdir -p $VAI_ROOT/compiler \
        && conda activate vitis-ai-caffe \
        && sudo cp -r $CONDA_PREFIX/lib/python3.6/site-packages/vaic/arch $VAI_ROOT/compiler/arch \
     && rm -fr /scratch/*

# Rebuild this layer every time
USER root
ARG CACHEBUST=1
RUN cd /tmp/ \
    && wget -O libunilog.deb https://www.xilinx.com/bin/public/openDownload?filename=libunilog_1.3.1-r31_amd64.deb \
    && wget -O libtarget-factory.deb https://www.xilinx.com/bin/public/openDownload?filename=libtarget-factory_1.3.1-r29_amd64.deb \
    && wget -O libxir.deb https://www.xilinx.com/bin/public/openDownload?filename=libxir_1.3.1-r42_amd64.deb \
    && wget -O libvart.deb https://www.xilinx.com/bin/public/openDownload?filename=libvart_1.3.1-r67_amd64.deb \
    && wget -O libvitis_ai_library.deb https://www.xilinx.com/bin/public/openDownload?filename=libvitis_ai_library_1.3.1-r69_amd64.deb \
    && wget -O libbutler-base.deb https://www.xilinx.com/bin/public/openDownload?filename=libbutler-base_1.3.0_amd64.deb \
    && wget -O librt-engine.deb https://www.xilinx.com/bin/public/openDownload?filename=librt-engine_1.3.0-r106_amd64.deb \
    && wget -O aks.deb https://www.xilinx.com/bin/public/openDownload?filename=aks_1.3.0-r43_amd64.deb \
    && apt-get install -y --no-install-recommends /tmp/*.deb \
    && rm -rf /tmp/* \
    && ldconfig

RUN apt-get clean -y \
    && rm -rf /var/lib/apt/lists/* \
    && rm -rf /scratch/*

# Set default build toolchain to GCC 9 for better performance
RUN update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 90 \
        --slave /usr/bin/g++ g++ /usr/bin/g++-9 \
        --slave /usr/bin/gcov gcov /usr/bin/gcov-9 \
    && update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 80 \
        --slave /usr/bin/g++ g++ /usr/bin/g++-8 \
        --slave /usr/bin/gcov gcov /usr/bin/gcov-8 \
    && update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 70 \
        --slave /usr/bin/g++ g++ /usr/bin/g++-7 \
        --slave /usr/bin/gcov gcov /usr/bin/gcov-7

ADD docker/banner.sh /etc/
almidi commented 3 years ago

I seem to have the exact problem. Did you find a solution ?

janifer112x commented 2 years ago

Hi @Eagle223, Could you please help try with our latest release(VAI 2.5)? and feel free to let me know if you still have problems.

qianglin-xlnx commented 1 year ago

Closing since no activity for more than half year, please open a new issue if you still have any questions. Thanks.