osai-ai / tensor-stream

A library for real-time video stream decoding to CUDA memory
GNU Lesser General Public License v2.1

TensorStreamConverter reads only the first image and then fails #38

Open AlexanderNevarko opened 7 months ago

AlexanderNevarko commented 7 months ago

TensorStreamConverter initializes successfully and reads the first image correctly (checked by saving and viewing the image). But when I try to read further images, it raises a RuntimeError.

[Screenshot 2024-04-13 at 16 25 54]

AlexanderNevarko commented 7 months ago

Dockerfile

# syntax=docker/dockerfile:1

# Does not work because of RTSP connection
# FROM nvcr.io/nvidia/pytorch:23.09-py3 AS base

# Variables used at build time.
## Base CUDA version. See all supported versions at https://hub.docker.com/r/nvidia/cuda/tags?page=2&name=-devel-ubuntu
ARG CUDA_VERSION=11.8.0
## Base Ubuntu version.
ARG OS_VERSION=22.04

# Define base image.
FROM nvidia/cuda:${CUDA_VERSION}-cudnn8-devel-ubuntu${OS_VERSION} AS base

# Duplicate the args because ARGs declared before FROM are not visible after it
# https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact
ARG CUDA_VERSION
ARG OS_VERSION

## Base TensorRT version.
ARG TRT_VERSION=8.6.1.6
## Base PyTorch version.
ARG TORCH_VERSION=2.2.0
## Base TorchVision version.
ARG TORCHVISION_VERSION=0.17.0
## Base OpenCV version.
ARG OPENCV_VERSION=4.8.0.74
## Base CMake version.
ARG CMAKE_VERSION=3.26.0
## Base Timezone
ARG TZ=Europe/Moscow

# Set environment variables.
## Set non-interactive to prevent asking for user inputs blocking image creation.
ENV DEBIAN_FRONTEND=noninteractive \
    ## Set timezone as it is required by some packages.
    TZ=${TZ} \
    ## CUDA Home, required to find CUDA in some packages.
    CUDA_HOME="/usr/local/cuda" \
    ## Set LD_LIBRARY_PATH for local libs (glog etc.)
    LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/local/lib" \
    ## Accelerate compilation flags (use all cores)
    MAKEFLAGS=-j$(nproc) \
    ## Torch GPU arch list
    TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0 7.5 8.0 8.6 8.9"

RUN apt-get update && \
    apt-get install \
        --no-install-recommends \
        --yes \
            # Requirements from tensor stream
            build-essential \
            yasm \
            nasm \
            unzip \
            git \
            wget \
            sysstat \
            libtcmalloc-minimal4 \
            pkgconf \
            autoconf \
            libtool \
            flex \
            bison \
            libx264-dev \
            python3 \
            python3-pip \
            python3-dev \
            python3-setuptools \
            # Requirements from ultralytics
            libgl1 \
            libglib2.0-0 \
            gnupg \
            libusb-1.0-0 \
            # Linux security updates
            # https://security.snyk.io/vuln/SNYK-UBUNTU1804-OPENSSL-3314796
            openssl \
            tar \
            # Image I/O libs
            libjpeg-dev \
            libpng-dev \
            libtiff-dev \
            # Parallelism library C++ for CPU
            libtbb-dev \
            # Optimization libraries for OpenCV
            libatlas-base-dev \
            gfortran \
            # Video/Audio Libs - FFMPEG, GSTREAMER, x264 and so on.
            ## AV Lib [does not work with tensor_stream]
            # libavcodec-dev \
            # libavformat-dev \
            # libswscale-dev \
            ## Gstreamer
            libgstreamer1.0-dev \
            libgstreamer-plugins-base1.0-dev \
            ## Others
            libxvidcore-dev \
            x264 \
            libx264-dev && \
    ## Clean cached files
    ln -s /usr/bin/python3 /usr/bin/python && \
    apt-get clean --yes && \
    apt-get autoremove --yes && \
    rm -rf /var/lib/apt/lists/* && \
    rm -rf /var/cache/apt/archives/* && \
    ## Set timezone
    ln -snf /usr/share/zoneinfo/${TZ} /etc/localtime && \
    echo ${TZ} > /etc/timezone

SHELL ["/bin/bash", "-c"]

# Install CMake
RUN wget https://github.com/Kitware/CMake/releases/download/v${CMAKE_VERSION}/cmake-${CMAKE_VERSION}-Linux-x86_64.sh \
    -q -O /tmp/cmake-install.sh \
    && chmod u+x /tmp/cmake-install.sh \
    && mkdir /usr/bin/cmake \
    && /tmp/cmake-install.sh --skip-license --prefix=/usr/bin/cmake \
    && rm /tmp/cmake-install.sh

ENV PATH="/usr/bin/cmake/bin:${PATH}"

# Install TensorRT
## Currently only supported for Ubuntu 22.04
## Cannot install via pip because of CUDA-related errors
RUN v="${TRT_VERSION}-1+cuda${CUDA_VERSION%.*}" distro="ubuntu${OS_VERSION//./}" arch=$(uname -m) && \
    wget https://developer.download.nvidia.com/compute/cuda/repos/${distro}/${arch}/cuda-archive-keyring.gpg && \
    mv cuda-archive-keyring.gpg /usr/share/keyrings/cuda-archive-keyring.gpg && \
    echo "deb [signed-by=/usr/share/keyrings/cuda-archive-keyring.gpg] https://developer.download.nvidia.com/compute/cuda/repos/${distro}/${arch}/ /" | \
    tee /etc/apt/sources.list.d/cuda-${distro}-${arch}.list && \
    apt-get update && \
    apt-get install \
        libnvinfer-headers-dev=${v} \
        libnvinfer-dispatch8=${v} \
        libnvinfer-lean8=${v} \
        libnvinfer-dev=${v} \
        libnvinfer-headers-plugin-dev=${v} \
        libnvinfer-lean-dev=${v} \
        libnvinfer-dispatch-dev=${v} \
        libnvinfer-plugin-dev=${v} \
        libnvinfer-vc-plugin-dev=${v} \
        libnvparsers-dev=${v} \
        libnvonnxparsers-dev=${v} \
        libnvinfer8=${v} \
        libnvinfer-plugin8=${v} \
        libnvinfer-vc-plugin8=${v} \
        libnvparsers8=${v} \
        libnvonnxparsers8=${v} && \
    apt-get install \
        python3-libnvinfer=${v} \
        tensorrt-dev=${v} && \
    apt-mark hold tensorrt-dev

# Build nvidia codec headers
RUN git clone -b sdk/11.1 --single-branch https://git.videolan.org/git/ffmpeg/nv-codec-headers.git && \
    cd nv-codec-headers && make install && \
    cd .. && rm -rf nv-codec-headers

# Build ffmpeg with nvenc support
RUN git clone --depth 1 -b release/6.0 --single-branch https://github.com/FFmpeg/FFmpeg.git && \
    cd FFmpeg && \
    mkdir ffmpeg_build && cd ffmpeg_build && \
    ../configure \
    --enable-cuda \
    --enable-cuvid \
    --enable-shared \
    --disable-static \
    --disable-doc \
    --extra-cflags=-I/usr/local/cuda/include \
    --extra-ldflags=-L/usr/local/cuda/lib64 \
    --extra-libs=-lpthread \
    --nvccflags="-arch=sm_60 \
                -gencode=arch=compute_60,code=sm_60 \
                -gencode=arch=compute_61,code=sm_61 \
                -gencode=arch=compute_70,code=sm_70 \
                -gencode=arch=compute_75,code=sm_75 \
                -gencode=arch=compute_80,code=sm_80 \
                -gencode=arch=compute_86,code=sm_86 \
                -gencode=arch=compute_89,code=sm_89 \
                -gencode=arch=compute_89,code=compute_89" && \
    make -j$(nproc) && make install && ldconfig && \
    cd ../.. && rm -rf FFmpeg

# Install torch
RUN python3 -m pip install \
    --upgrade \
    --no-cache \
        pip \
        wheel \
        setuptools \
        twine \
        awscli \
        packaging \
        ninja && \
    ## Install pytorch and submodules
    CUDA_VER=${CUDA_VERSION%.*} && CUDA_VER=${CUDA_VER//./} && \
    python3 -m pip install \
        --no-cache \
            torch==${TORCH_VERSION} \
            torchvision==${TORCHVISION_VERSION} \
                --index-url https://download.pytorch.org/whl/cu${CUDA_VER}

# Create working directory
WORKDIR /usr/src/

# Install TensorStream
RUN git clone -b master --single-branch https://github.com/osai-ai/tensor-stream.git /usr/src/tensor-stream && \
    cd /usr/src/tensor-stream && \
    python3 setup.py install

# Install nkb-tech detection pipeline
RUN git clone -b main https://github.com/nkb-tech/ultralytics.git /usr/src/ultralytics && \
    cd ultralytics && \
    python3 -m pip install \
        --no-cache \
        --editable \
            ".[export]" \
            albumentations

# Install nkb-tech cv pipeline
# Use COPY because it is a private project
COPY . /usr/src/app
RUN cd /usr/src/app && \
    python3 -m pip install \
        --no-cache \
        --requirement \
            requirements.txt
    # cd cranpose && \
    # python3 -m pip install .

# Install OpenCV with CUDA
RUN python3 -m pip uninstall \
        --yes \
            opencv-contrib-python \
            opencv-python-headless \
            opencv-python && \
    ln -s /usr/include/x86_64-linux-gnu/cudnn_version_v8.h /usr/include/x86_64-linux-gnu/cudnn_version.h && \
    git clone --depth 1 -b ${OPENCV_VERSION%.*} https://github.com/opencv/opencv.git /usr/src/opencv && \
    git clone --depth 1 -b ${OPENCV_VERSION%.*} https://github.com/opencv/opencv_contrib.git /usr/src/opencv_contrib && \
    cd /usr/src/opencv && \
    mkdir build && \
    cd build && \
    cmake \
        -D CPACK_BINARY_DEB=ON \
        -D BUILD_EXAMPLES=OFF \
        -D INSTALL_C_EXAMPLES=OFF \
        -D BUILD_opencv_cudacodec=ON \
        -D BUILD_opencv_python2=OFF \
        -D BUILD_opencv_python3=ON \
        -D BUILD_opencv_java=OFF \
        -D CMAKE_BUILD_TYPE=RELEASE \
        -D CMAKE_INSTALL_PREFIX=/usr/local \
        -D CUDA_ARCH_BIN=6.0,6.1,7.0,7.5,8.0,8.6,8.9 \
        -D CUDA_ARCH_PTX= \
        -D ENABLE_FAST_MATH=ON \
        -D CUDA_FAST_MATH=ON \
        -D CUDNN_INCLUDE_DIR=/usr/include/x86_64-linux-gnu \
        -D EIGEN_INCLUDE_PATH=/usr/include/eigen3 \
        -D WITH_EIGEN=ON \
        -D ENABLE_NEON=OFF \
        -D OPENCV_DNN_CUDA=ON \
        -D OPENCV_ENABLE_NONFREE=ON \
        -D OPENCV_EXTRA_MODULES_PATH=/usr/src/opencv_contrib/modules \
        -D OPENCV_GENERATE_PKGCONFIG=ON \
        -D WITH_CUBLAS=ON \
        -D WITH_CUDA=ON \
        -D WITH_CUDNN=ON \
        -D WITH_GSTREAMER=ON \
        -D WITH_LIBV4L=ON \
        -D WITH_OPENGL=ON \
        -D WITH_OPENCL=OFF \
        -D WITH_IPP=OFF \
        -D WITH_TBB=ON \
        -D WITH_TIFF=ON \
        -D WITH_JPEG=ON \
        -D WITH_PNG=ON \
        -D BUILD_PERF_TESTS=OFF \
        -D BUILD_TESTS=OFF \
        -D WITH_QT=OFF \
        -D BUILD_DOCS=OFF \
       .. && \
    make -j$(nproc) && \
    make install && \
    ldconfig
BykadorovR commented 7 months ago

Have you tried running the samples? Do they work correctly? Runtime error 700 means something went wrong in CUDA/GPU processing. Do you have enough free memory on the GPU? Can you enable logs and share them? https://github.com/osai-ai/tensor-stream/blob/ba0df3cdf278d9174ee93a97023bd92887a0a7e9/python_examples/simple.py#L37
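
For reference, if the reported status is a raw `cudaError_t` value from the CUDA Runtime API (an assumption; the thread does not confirm how TensorStream produces the number), then 700 corresponds to `cudaErrorIllegalAddress`, while an out-of-memory condition would normally be code 2. The sketch below is a small lookup covering only a handful of codes; the table and the `describe_status` helper are illustrative and not part of TensorStream.

```python
# A few cudaError_t values from the CUDA Runtime API (not an exhaustive list).
# Assumption: the status printed by TensorStream is a raw cudaError_t value.
CUDA_ERRORS = {
    0: ("cudaSuccess", "no error"),
    2: ("cudaErrorMemoryAllocation", "out of memory"),
    700: ("cudaErrorIllegalAddress", "an illegal memory access was encountered"),
    719: ("cudaErrorLaunchFailure", "unspecified launch failure"),
    999: ("cudaErrorUnknown", "unknown error"),
}


def describe_status(status: int) -> str:
    """Return a readable description of a status code (hypothetical helper)."""
    name, text = CUDA_ERRORS.get(status, ("unlisted", "not in this table"))
    return f"{status}: {name} ({text})"


print(describe_status(700))
print(describe_status(2))
```

Because CUDA reports kernel errors asynchronously, the call that prints the status is not necessarily the call that caused it. Setting the environment variable `CUDA_LAUNCH_BLOCKING=1` or running the sample under NVIDIA's `compute-sanitizer` may help narrow down the offending operation.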

AlexanderNevarko commented 7 months ago
root@e39c53d39e2d:/usr/src/tensor-stream/python_examples# python3 simple.py -i "rtsp://Stream1:...(it's a valid stream)" -o "/dl/out.mp4" -v "HIGH"
TID: 139705051231360 Initializing()  +
TID: 139705051231360 Chosen GPU: 0
TID: 139705051231360 Tensor CUDA init +
TID: 139705051231360 Tensor CUDA init -
time: 1116 ms
TID: 139705051231360 parser->Init +
TID: 139705051231360 parser->Init -
time: 1676 ms
TID: 139705051231360 decoder->Init +
TID: 139705051231360 decoder->Init -
time: 0 ms
TID: 139705051231360 VPP->Init +
TID: 139705051231360 Max consumers allowed: 5
TID: 139705051231360 VPP->Init -
time: 2 ms
TID: 139705051231360 Frame rate in bitstream hasn't been found, using guessed value
TID: 139705051231360 Frame rate: 20
TID: 139705051231360 Initializing()  -
Function time: 2796ms

TID: 139699029014080 Processing() 1 frame +
TID: 139699029014080 parser->Read +
TID: 139699029014080 parser->Read -
time: 0 ms
TID: 139699029014080 parser->Get +
TID: 139699029014080 parser->Get -
time: 0 ms
TID: 139699029014080 parser->Analyze +
TID: 139699029014080 [PARSING] Bitstream doesn't conform to the Main profile 77
TID: 139699029014080 [PARSING] Field gaps_in_frame_num_value_allowed_flag is unexpected != 0
TID: 139699029014080 parser->Analyze -
time: 0 ms
TID: 139699029014080 decoder->Decode +
Normalize (0, 0, 0, 0)
TID: 139705051231360 GetFrame() +
TID: 139705051231360 findFree decoded frame +
TID: 139705051231360 findFree decoded frame -
time: 0 ms
TID: 139705051231360 findFree converted frame +
TID: 139705051231360 findFree converted frame -
time: 0 ms
TID: 139705051231360 decoder->GetFrame +
TID: 139699029014080 decoder->Decode -
time: 108 ms
TID: 139699029014080 check tensor to free +
TID: 139699029014080 check tensor to free -
time: 0 ms
TID: 139699029014080 sleep +
TID: 139699029014080 Dts: 0 now: 0
TID: 139699029014080 Should sleep for: 0
TID: 139699029014080 sleep -
time: 0 ms
TID: 139699029014080 Processing() 1 frame -
Function time: 109ms

TID: 139699029014080 Processing() 2 frame +
TID: 139699029014080 parser->Read +
TID: 139699029014080 parser->Read -
time: 0 ms
TID: 139699029014080 parser->Get +
TID: 139699029014080 parser->Get -
time: 0 ms
TID: 139699029014080 parser->Analyze +
TID: 139699029014080 parser->Analyze -
time: 0 ms
TID: 139699029014080 decoder->Decode +
TID: 139705051231360 decoder->GetFrame -
time: 108 ms
TID: 139705051231360 vpp->Convert +
TID: 139699029014080 decoder->Decode -
time: 15 ms
TID: 139699029014080 check tensor to free +
TID: 139699029014080 check tensor to free -
time: 0 ms
TID: 139699029014080 sleep +
TID: 139699029014080 Dts: 50 now: 15
TID: 139699029014080 Should sleep for: 35
TID: 139699029014080 sleep -
time: 35 ms
TID: 139699029014080 Processing() 2 frame -
Function time: 50ms

TID: 139699029014080 Processing() 3 frame +
TID: 139699029014080 parser->Read +
TID: 139699029014080 parser->Read -
time: 0 ms
TID: 139699029014080 parser->Get +
TID: 139699029014080 parser->Get -
time: 0 ms
TID: 139699029014080 parser->Analyze +
TID: 139699029014080 parser->Analyze -
time: 0 ms
TID: 139699029014080 decoder->Decode +
TID: 139699029014080 decoder->Decode -
time: 17 ms
TID: 139699029014080 check tensor to free +
TID: 139699029014080 check tensor to free -
time: 0 ms
TID: 139699029014080 sleep +
TID: 139699029014080 Dts: 100 now: 68
TID: 139699029014080 Should sleep for: 32
TID: 139699029014080 sleep -
time: 32 ms
TID: 139699029014080 Processing() 3 frame -
Function time: 49ms

TID: 139699029014080 Processing() 4 frame +
TID: 139699029014080 parser->Read +
TID: 139699029014080 parser->Read -
time: 0 ms
TID: 139699029014080 parser->Get +
TID: 139699029014080 parser->Get -
time: 0 ms
TID: 139699029014080 parser->Analyze +
TID: 139699029014080 parser->Analyze -
time: 0 ms
TID: 139699029014080 decoder->Decode +
TID: 139699029014080 decoder->Decode -
time: 18 ms
TID: 139699029014080 check tensor to free +
TID: 139699029014080 check tensor to free -
time: 0 ms
TID: 139699029014080 sleep +
TID: 139699029014080 Dts: 150 now: 118
TID: 139699029014080 Should sleep for: 32
TID: 139699029014080 sleep -
time: 32 ms
TID: 139699029014080 Processing() 4 frame -
Function time: 50ms

TID: 139699029014080 Processing() 5 frame +
TID: 139699029014080 parser->Read +
TID: 139699029014080 parser->Read -
time: 0 ms
TID: 139699029014080 parser->Get +
TID: 139699029014080 parser->Get -
time: 0 ms
TID: 139699029014080 parser->Analyze +
TID: 139699029014080 parser->Analyze -
time: 0 ms
TID: 139699029014080 decoder->Decode +
TID: 139699029014080 decoder->Decode -
time: 17 ms
TID: 139699029014080 check tensor to free +
TID: 139699029014080 check tensor to free -
time: 0 ms
TID: 139699029014080 sleep +
TID: 139699029014080 Dts: 199 now: 168
TID: 139699029014080 Should sleep for: 31
TID: 139699029014080 sleep -
time: 31 ms
TID: 139699029014080 Processing() 5 frame -
Function time: 48ms

TID: 139699029014080 Processing() 6 frame +
TID: 139699029014080 parser->Read +
TID: 139699029014080 parser->Read -
time: 0 ms
TID: 139699029014080 parser->Get +
TID: 139699029014080 parser->Get -
time: 0 ms
TID: 139699029014080 parser->Analyze +
TID: 139699029014080 parser->Analyze -
time: 0 ms
TID: 139699029014080 decoder->Decode +
TID: 139699029014080 decoder->Decode -
time: 18 ms
TID: 139699029014080 check tensor to free +
TID: 139699029014080 check tensor to free -
time: 0 ms
TID: 139699029014080 sleep +
TID: 139699029014080 Dts: 249 now: 217
TID: 139699029014080 Should sleep for: 32
TID: 139705051231360 vpp->Convert -
time: 239 ms
TID: 139705051231360 tensor->ConvertFromBlob +
TID: 139705051231360 tensor->ConvertFromBlob -
time: 0 ms
TID: 139705051231360 add tensor +
TID: 139705051231360 add tensor -
time: 0 ms
TID: 139705051231360 GetFrame() 1 frame -
Function time: 347ms

TID: 139705051231360 dumpFrame() +
TID: 139705051231360 Error status != 0, status: 700
TID: 139705051231360 src/VideoProcessor.cpp DumpFrame 69
TID: 139705051231360 dumpFrame() -
Function time: 0ms

TID: 139705051231360 GetFrame() +
TID: 139705051231360 findFree decoded frame +
TID: 139705051231360 findFree decoded frame -
time: 0 ms
TID: 139705051231360 findFree converted frame +
TID: 139705051231360 findFree converted frame -
time: 0 ms
TID: 139705051231360 decoder->GetFrame +
TID: 139705051231360 decoder->GetFrame -
time: 0 ms
TID: 139705051231360 vpp->Convert +
TID: 139705051231360 Error status != 0, status: 700
TID: 139705051231360 src/Wrappers/WrapperPython.cpp getFrame 313
Bad things happened: 700
Frame size:  (3840, 2160)
FPS:  20.0
Tensor shape: torch.Size([2160, 3840, 3])
Tensor dtype: torch.uint8
Tensor device: cuda:0
TID: 139705051231360 End processing async part
TID: 139699029014080 sleep -
time: 32 ms
TID: 139699029014080 Processing() 6 frame -
Function time: 50ms

TID: 139699029014080 Processing was interrupted or stream has ended
TID: 139699029014080 All consumers were notified about processing end
TID: 139705051231360 End processing sync part start
TID: 139705051231360 End processing sync part end
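
In a log of this size the failing calls are easy to miss. The following sketch pulls out every `Error status != 0` line together with the source location on the line after it. It assumes the line layout seen in the log above (`TID: <id> <file> <function> <line>` directly after the error line); the regular expressions and the `find_errors` helper are illustrative, not a TensorStream utility.

```python
import re

# Patterns matching the two-line error records seen in the verbose log.
ERROR_RE = re.compile(r"^TID: (\d+) Error status != 0, status: (\d+)$")
LOCATION_RE = re.compile(r"^TID: (\d+) (\S+) (\w+) (\d+)$")


def find_errors(log_text):
    """Return a list of (tid, status, file, function, line) tuples."""
    lines = [line.strip() for line in log_text.splitlines()]
    errors = []
    for i, line in enumerate(lines):
        match = ERROR_RE.match(line)
        if not match:
            continue
        tid, status = match.group(1), int(match.group(2))
        location = (None, None, None)
        # The source location, when present, is on the next line from the same thread.
        if i + 1 < len(lines):
            loc = LOCATION_RE.match(lines[i + 1])
            if loc and loc.group(1) == tid:
                location = (loc.group(2), loc.group(3), int(loc.group(4)))
        errors.append((tid, status) + location)
    return errors


# Lines copied from the log above.
SAMPLE = """\
TID: 139705051231360 dumpFrame() +
TID: 139705051231360 Error status != 0, status: 700
TID: 139705051231360 src/VideoProcessor.cpp DumpFrame 69
TID: 139705051231360 dumpFrame() -
TID: 139705051231360 vpp->Convert +
TID: 139705051231360 Error status != 0, status: 700
TID: 139705051231360 src/Wrappers/WrapperPython.cpp getFrame 313
"""

for error in find_errors(SAMPLE):
    print(error)
```

Applied to this log, the first reported failure is in `DumpFrame` (src/VideoProcessor.cpp, line 69), right after the first `GetFrame()` completed, and the second is in `getFrame` (src/Wrappers/WrapperPython.cpp, line 313) during the next `vpp->Convert`.
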
AlexanderNevarko commented 7 months ago

Any help here please?

BykadorovR commented 7 months ago

It seems like something is wrong with your CUDA setup or installation. Unfortunately, there are no additional details in your logs apart from this 700 error, so there is not much I can suggest.
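
On the earlier question about free GPU memory, a rough estimate of the per-frame footprint can be made from the values the sample printed (tensor shape `[2160, 3840, 3]`, dtype `torch.uint8`). The sketch below is plain arithmetic; the `n_buffers` parameter is a hypothetical illustration, since the thread does not say how many frames TensorStream keeps on the GPU at once.

```python
def frame_bytes(height: int, width: int, channels: int, bytes_per_element: int = 1) -> int:
    """Size of one dense frame tensor in bytes (uint8 uses 1 byte per element)."""
    return height * width * channels * bytes_per_element


# Shape reported in the log: torch.Size([2160, 3840, 3]), dtype torch.uint8.
one_frame = frame_bytes(2160, 3840, 3)
print(f"one frame: {one_frame} bytes = {one_frame / 2**20:.2f} MiB")

# Hypothetical buffer count, for illustration only.
n_buffers = 10
print(f"{n_buffers} frames: {n_buffers * one_frame / 2**20:.1f} MiB")
```

A single 4K RGB frame is under 24 MiB, so the converted tensors alone are unlikely to exhaust a modern GPU unless other processes already occupy most of its memory.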