snowzach / doods

DOODS - Dedicated Open Object Detection Service
MIT License

Jetson Nano #25

Closed by sakalauskas 3 years ago

sakalauskas commented 4 years ago

It took me quite some time to get DOODS running on a Jetson Nano, so I thought I would share my progress. This isn't an ideal or complete solution, but maybe someone can build upon it or reuse it. With this Docker image, detection time with the faster_rcnn_inception_v2_coco_2018_01_28 model dropped from ~4 seconds to ~1 second because the processing is offloaded to the GPU.

  1. I was not able to build Bazel on the nvcr.io/nvidia/l4t-base image, so I used pre-built binaries wherever Bazel would be needed.
  2. It uses Nvidia's pre-compiled binaries for TensorFlow and the TensorFlow C library built by photoprism.org.
  3. It does not include the TensorFlow Lite C binary.
     3.1. For DOODS to compile, doods/detector/detector.go needs to be modified: references to TensorFlow Lite should be removed or commented out before building the image. TensorFlow Lite is not really needed on the Jetson Nano since we can just use TensorFlow, so I did not bother adding TensorFlow Lite support.
  4. It took about 5 hours to build the image.
  5. I had to install the CUDA 10.0 libraries even though the latest JetPack ships with CUDA 10.2. The TensorFlow C library was compiled against CUDA 10.0, so we need to downgrade.
  6. I have mounted the host's system libraries as volumes so the container can see them. I am not sure if this is the right way to do it, e.g.:
     volumes:
      - /usr/local/cuda-10.0/targets/aarch64-linux/lib:/usr/local/cuda-10.0/lib64
      - /usr/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu
  7. Nvidia runtime should be set as default
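Note 3.1 means editing detector.go by hand for every build without TensorFlow Lite. One possible alternative, sketched below and not how DOODS itself is structured, is to let each backend register a constructor so that a backend left out of the build is simply absent from the lookup table. Every name here (`Register`, `New`, `registry`, `tfDetector`) is hypothetical and the `Detector` interface is cut down to one method for illustration.

```go
package main

import "fmt"

// Detector is a reduced stand-in for the interface in detector.go.
type Detector interface {
	Name() string
}

// factory builds a Detector for a configured detector name.
type factory func(name string) (Detector, error)

// registry maps a detector type such as "tensorflow" to its factory.
var registry = map[string]factory{}

// Register is called once by every backend that is compiled in.
func Register(typ string, f factory) { registry[typ] = f }

// New looks the type up in the registry instead of switching over
// hard-coded cases, so a missing backend needs no source edits here.
func New(typ, name string) (Detector, error) {
	f, ok := registry[typ]
	if !ok {
		return nil, fmt.Errorf("detector type %q is not compiled in", typ)
	}
	return f(name)
}

// tfDetector is a dummy backend used only for this sketch.
type tfDetector struct{ name string }

func (d tfDetector) Name() string { return d.name }

func init() {
	// In a real split this call would live in the backend's own package.
	Register("tensorflow", func(name string) (Detector, error) {
		return tfDetector{name: name}, nil
	})
}

func main() {
	if d, err := New("tensorflow", "default"); err == nil {
		fmt.Println("configured:", d.Name())
	}
	if _, err := New("tflite", "default"); err != nil {
		fmt.Println("skipped:", err)
	}
}
```

With this layout, dropping TensorFlow Lite would come down to not compiling the file that registers it (for example behind a Go build constraint), at the cost of a larger change to the upstream code than commenting out three lines.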

Here is the Dockerfile to build the image.

# 32.3.1 is the last version that includes Cuda 10.0
FROM nvcr.io/nvidia/l4t-base:r32.3.1

RUN apt-get update -y
RUN DEBIAN_FRONTEND=noninteractive apt-get install libhdf5-serial-dev hdf5-tools libhdf5-dev zlib1g-dev zip libjpeg8-dev liblapack-dev libblas-dev gfortran -y
RUN apt-get install python3-pip -y
RUN pip3 install -U pip testresources setuptools
RUN DEBIAN_FRONTEND=noninteractive apt-get install python3 python-dev python3-dev build-essential libssl-dev libffi-dev libxml2-dev libxslt1-dev zlib1g-dev -yq

RUN pip3 install -U numpy==1.16.1 future==0.17.1 mock==3.0.5 h5py==2.9.0 keras_preprocessing==1.0.5 keras_applications==1.0.8 gast==0.2.2 futures protobuf pybind11

RUN pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 'tensorflow<2'

# Install reqs with cross compile support
RUN dpkg --add-architecture arm64 && \
    apt-get update && apt-get install -y --no-install-recommends \
    pkg-config zip zlib1g-dev unzip wget bash-completion git curl \
    build-essential patch g++ python python-future python-numpy python-six python3 \
    cmake ca-certificates \
    libc6-dev:arm64 libstdc++6:arm64 libusb-1.0-0:arm64

# Install protoc
RUN wget https://github.com/protocolbuffers/protobuf/releases/download/v3.9.1/protoc-3.9.1-linux-x86_64.zip && \
    unzip protoc-3.9.1-linux-x86_64.zip -d /usr/local && \
    rm /usr/local/readme.txt && \
    rm protoc-3.9.1-linux-x86_64.zip

RUN apt-get update && apt-get install -y --no-install-recommends \
    pkg-config zip zlib1g-dev unzip wget bash-completion git curl \
    build-essential patch g++ python python-future python3 ca-certificates \
    libc6-dev libstdc++6 libusb-1.0-0 xz-utils

# Download and configure the build environment for gcc 6 which is needed to compile everything else
RUN mkdir -p /tmp/sysroot/lib && mkdir -p /tmp/sysroot/usr/lib && \
    cd /tmp && \
    wget --no-check-certificate https://releases.linaro.org/components/toolchain/binaries/6.3-2017.05/aarch64-linux-gnu/gcc-linaro-6.3.1-2017.05-x86_64_aarch64-linux-gnu.tar.xz -O /tmp/toolchain.tar.xz && \
    tar xf /tmp/toolchain.tar.xz && \
    rm toolchain.tar.xz && \
    cp -r /tmp/gcc-linaro-6.3.1-2017.05-x86_64_aarch64-linux-gnu/aarch64-linux-gnu/libc/* /tmp/sysroot/
RUN mkdir -p /tmp/debs && cd /tmp/debs && apt-get download libc6:arm64 libc6-dev:arm64 && \
    ar x libc6_*.deb && tar xvf data.tar.xz && \
    ar x libc6-dev*.deb && tar xvf data.tar.xz && \
    cp -R usr /tmp/sysroot && cp -R lib /tmp/sysroot && rm -Rf /tmp/debs && \
    mkdir -p /tmp/debs && cd /tmp/debs && \
    apt-get download libusb-1.0-0:arm64 libudev1:arm64 zlib1g-dev:arm64 zlib1g:arm64 && \
    ar x libusb-1.0*.deb && tar xvf data.tar.xz && \
    ar x libudev1*.deb && tar xvf data.tar.xz && \
    ar x zlib1g_*.deb && tar xvf data.tar.xz && \
    ar x zlib1g-dev*.deb && tar xvf data.tar.xz && rm usr/lib/aarch64-linux-gnu/libz.so && \
    cp -r lib/aarch64-linux-gnu/* /tmp/sysroot/lib && \
    cp -r usr/lib/aarch64-linux-gnu/* /tmp/sysroot/usr/lib && \
    cp -r usr/include/* /tmp/sysroot/usr/include && \
    ln -rs /tmp/sysroot/lib/libusb-1.0.so.0.1.0 /tmp/sysroot/lib/libusb-1.0.so && \
    ln -rs /tmp/sysroot/lib/libudev.so.1.6.13 /tmp/sysroot/lib/libudev.so && \
    ln -rs /tmp/sysroot/lib/libz.so.1.2.11 /tmp/sysroot/lib/libz.so && \
    ln -s /usr/local /tmp/sysroot/usr/local && \
    cd /tmp && rm -Rf /tmp/debs

ENV CC="/tmp/gcc-linaro-6.3.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc"
ENV CXX="/tmp/gcc-linaro-6.3.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-g++"
ENV LDFLAGS="-v -L /lib -L /usr/lib --sysroot /tmp/sysroot"
ENV CFLAGS="-L /lib -L /usr/lib --sysroot /tmp/sysroot"
ENV CXXFLAGS="-L /lib -L /usr/lib --sysroot /tmp/sysroot"

# Install GOCV
ARG OPENCV_VERSION="4.1.2"
ENV OPENCV_VERSION $OPENCV_VERSION
RUN cd /tmp && \
    curl -Lo opencv.zip https://github.com/opencv/opencv/archive/${OPENCV_VERSION}.zip && \
    unzip -q opencv.zip && \
    curl -Lo opencv_contrib.zip https://github.com/opencv/opencv_contrib/archive/${OPENCV_VERSION}.zip && \
    unzip -q opencv_contrib.zip && \
    rm opencv.zip opencv_contrib.zip && \
    cd opencv-${OPENCV_VERSION} && \
    mkdir build && cd build && \
    cmake -D CMAKE_BUILD_TYPE=RELEASE \
    -D CMAKE_INSTALL_PREFIX=/usr/local \
    -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-${OPENCV_VERSION}/modules \
    -D CMAKE_TOOLCHAIN_FILE=/tmp/opencv-${OPENCV_VERSION}/platforms/linux/aarch64-gnu.toolchain.cmake \
    -D WITH_CUDA=ON \
    -D ENABLE_FAST_MATH=1 \
    -D CUDA_FAST_MATH=1 \
    -D WITH_CUBLAS=1 \
    -D WITH_JASPER=OFF \
    -D WITH_QT=OFF \
    -D WITH_GTK=OFF \
    -D WITH_IPP=OFF \
    -D BUILD_DOCS=OFF \
    -D BUILD_EXAMPLES=OFF \
    -D BUILD_IPP_IW=OFF \
    -D BUILD_TESTS=OFF \
    -D BUILD_PERF_TESTS=OFF \
    -D BUILD_opencv_java=NO \
    -D BUILD_opencv_python=NO \
    -D BUILD_opencv_python2=NO \
    -D BUILD_opencv_python3=NO \
    -D OPENCV_GENERATE_PKGCONFIG=ON .. && \
    make -j $(nproc --all) && \
    make preinstall && make install && \
    cd /tmp && rm -rf opencv*

# Configure the Go version to be used
ENV GO_ARCH "arm64"
ENV GOARCH=arm64

# Install Go
ENV GO_VERSION "1.14.2"
RUN curl -kLo go${GO_VERSION}.linux-${GO_ARCH}.tar.gz https://dl.google.com/go/go${GO_VERSION}.linux-${GO_ARCH}.tar.gz && \
    tar -C /usr/local -xzf go${GO_VERSION}.linux-${GO_ARCH}.tar.gz && \
    rm go${GO_VERSION}.linux-${GO_ARCH}.tar.gz

RUN apt-get update && apt-get install -y --no-install-recommends \
    pkg-config zip zlib1g-dev unzip wget bash-completion git curl \
    build-essential patch g++ python python-future python3 ca-certificates \
    libc6-dev libstdc++6 libusb-1.0-0

ENV GOOS=linux
ENV CGO_ENABLED=1
ENV PATH /usr/local/go/bin:/go/bin:${PATH}
ENV GOPATH /go

# Create the build directory
RUN mkdir /build
WORKDIR /build

ENV CC=aarch64-linux-gnu-gcc
ENV CXX=aarch64-linux-gnu-g++

ENV LD_LIBRARY_PATH "/usr/local/cuda-10.2/lib64:/usr/local/lib:${LD_LIBRARY_PATH}"
ENV PATH="/usr/local/cuda-10.2/bin:/usr/local/cuda/bin:${PATH}"

# Install pre-compiled Tensorflow Go C bindings
RUN mkdir /tmp/libtensorflow && cd /tmp/libtensorflow && \
    wget https://dl.photoprism.org/tensorflow/nvidia-jetson/libtensorflow-jetson-nano-1.15.2.tar.gz && \
    tar xvzf libtensorflow-jetson-nano-1.15.2.tar.gz && \
    cd lib && \
    cp libtensorflow_framework.so /usr/local/lib/libtensorflow_framework.so.1 && \
    cp libtensorflow_framework.so /usr/local/lib/libtensorflow_framework.so && \
    cp libtensorflow.so /usr/local/lib/libtensorflow.so && \
    rm -rf /tmp/libtensorflow

RUN ldconfig
ADD . .
RUN make
RUN ls -la  /usr/local/lib

RUN apt-get update && \
    apt-get install -y --no-install-recommends libusb-1.0 libc++-7-dev wget unzip ca-certificates libdc1394-22 libavcodec57 libavformat57 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
RUN mkdir -p /opt/doods
WORKDIR /opt/doods
#COPY --from=builder /usr/local/lib/. /usr/local/lib/.
#COPY --from=builder /build/doods /opt/doods/doods
RUN cp -R /build/doods /opt/doods/doods
ADD config.yaml /opt/doods/config.yaml
RUN ldconfig

RUN mkdir models
RUN wget https://storage.googleapis.com/download.tensorflow.org/models/tflite/coco_ssd_mobilenet_v1_1.0_quant_2018_06_29.zip && unzip coco_ssd_mobilenet_v1_1.0_quant_2018_06_29.zip && rm coco_ssd_mobilenet_v1_1.0_quant_2018_06_29.zip && mv detect.tflite models/coco_ssd_mobilenet_v1_1.0_quant.tflite && rm labelmap.txt
RUN wget https://dl.google.com/coral/canned_models/coco_labels.txt && mv coco_labels.txt models/coco_labels0.txt

RUN ls -la  /usr/lib/aarch64-linux-gnu
ENV LD_LIBRARY_PATH "/usr/lib/aarch64-linux-gnu:/usr/local/cuda-10.0/lib64:/usr/local/cuda/lib64:/usr/local/lib:${LD_LIBRARY_PATH}"

CMD ["/opt/doods/doods", "-c", "/opt/doods/config.yaml", "api"]

# run with docker run -it --runtime=nvidia -v /opt/doods/models:/opt/doods/models -v /opt/doods/config.yaml:/opt/doods/config.yaml -v /usr/local/cuda-10.0/targets/aarch64-linux/lib:/usr/local/cuda-10.0/lib64 -v /usr/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu  -p 8080:8080 helix3/doods:jetsonnano
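Note 7 in the list above asks for the Nvidia runtime to be Docker's default, which matters for `docker build` and for any launch that does not pass `--runtime=nvidia`. A minimal sketch of a check follows. It assumes the daemon configuration lives in /etc/docker/daemon.json and uses Docker's `default-runtime` and `runtimes` keys; the helper name `nvidiaIsDefault` and the embedded example file are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// daemonConfig holds the two daemon.json keys this check looks at.
type daemonConfig struct {
	DefaultRuntime string                     `json:"default-runtime"`
	Runtimes       map[string]json.RawMessage `json:"runtimes"`
}

// nvidiaIsDefault reports whether the daemon.json content registers an
// "nvidia" runtime and also selects it as the default runtime.
func nvidiaIsDefault(raw []byte) (bool, error) {
	var cfg daemonConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return false, err
	}
	_, registered := cfg.Runtimes["nvidia"]
	return registered && cfg.DefaultRuntime == "nvidia", nil
}

// example shows the shape such a file commonly has (illustrative content).
const example = `{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {"path": "nvidia-container-runtime", "runtimeArgs": []}
  }
}`

func main() {
	raw := []byte(example)
	// Pass a path, e.g. /etc/docker/daemon.json, to check a real file.
	if len(os.Args) > 1 {
		b, err := os.ReadFile(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		raw = b
	}
	ok, err := nvidiaIsDefault(raw)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid daemon.json:", err)
		os.Exit(1)
	}
	fmt.Println("nvidia is the default runtime:", ok)
}
```

After editing daemon.json, the Docker daemon has to be restarted before the new default takes effect.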

To build it:

RichardPar commented 4 years ago

@sakalauskas Awesome - I don't know how it gets down to 30 frames per second on normal CUDA using the Jetson reference apps... but I really like this :D

Regards, Richard

snowzach commented 3 years ago

Good stuff! Thanks!

eloebl commented 3 years ago

Has anyone tried this recently, or am I missing a step? I tested on an NVIDIA Jetson Nano 2GB Developer Kit and got the following:

Step 40/53 : RUN make
 ---> Running in 5e7e4c0843a2
GO111MODULE=off go get github.com/gogo/protobuf/proto
go: downloading google.golang.org/grpc v1.30.0
go: downloading github.com/spf13/cobra v1.0.0
go: downloading golang.org/x/image v0.0.0-20200618115811-c13761719519
go: downloading github.com/lmittmann/ppm v1.0.0
go: downloading github.com/spf13/viper v1.7.0
go: downloading golang.org/x/net v0.0.0-20200602114024-627f9648deb9
go: downloading github.com/grpc-ecosystem/grpc-gateway v1.14.6
go: downloading github.com/pelletier/go-toml v1.8.0
go: downloading github.com/grpc-ecosystem/go-grpc-middleware v1.2.0
go: downloading github.com/hashicorp/hcl v1.0.0
go: downloading github.com/snowzach/certtools v1.0.2
go: downloading github.com/fsnotify/fsnotify v1.4.9
go: downloading github.com/gogo/protobuf v1.3.1
go: downloading github.com/go-chi/cors v1.1.1
go: downloading gopkg.in/yaml.v2 v2.3.0
go: downloading github.com/tensorflow/tensorflow v2.0.0+incompatible
go: downloading github.com/golang/protobuf v1.4.2
go: downloading github.com/subosito/gotenv v1.2.0
go: downloading gopkg.in/ini.v1 v1.57.0
go: downloading github.com/mitchellh/mapstructure v1.3.2
go: downloading github.com/spf13/pflag v1.0.5
go: downloading github.com/go-chi/chi v4.1.2+incompatible
go: downloading github.com/spf13/jwalterweatherman v1.1.0
go: downloading google.golang.org/genproto v0.0.0-20200623002339-fbb79eadd5eb
go: downloading github.com/go-chi/render v1.0.1
go: downloading google.golang.org/protobuf v1.24.0
go: downloading go.uber.org/zap v1.15.0
go: downloading github.com/spf13/cast v1.3.1
go: downloading golang.org/x/sys v0.0.0-20200622214017-ed371f2e16b4
go: downloading golang.org/x/text v0.3.3
go: downloading github.com/blendle/zapdriver v1.3.1
go: downloading go.uber.org/atomic v1.6.0
go: downloading github.com/magiconair/properties v1.8.1
go: downloading github.com/spf13/afero v1.3.0
go: downloading go.uber.org/multierr v1.5.0
go get github.com/gogo/protobuf/protoc-gen-gogoslick
go: found github.com/gogo/protobuf/protoc-gen-gogoslick in github.com/gogo/protobuf v1.3.1
go get github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway
go: downloading github.com/grpc-ecosystem/grpc-gateway v1.16.0
go: found github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway in github.com/grpc-ecosystem/grpc-gateway v1.16.0
go: downloading github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b
go: downloading github.com/ghodss/yaml v1.0.0
go: github.com/grpc-ecosystem/grpc-gateway upgrade => v1.16.0
go get github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger
go: downloading google.golang.org/grpc v1.33.1
go: downloading golang.org/x/net v0.0.0-20200822124328-c89045814202
go: found github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger in github.com/grpc-ecosystem/grpc-gateway v1.16.0
# Compiling...
go build -ldflags "-X github.com/snowzach/doods/conf.Executable=doods -X github.com/snowzach/doods/conf.GitVersion=v0.2.5-2-gedd6f5d-dirty" -o doods
# github.com/snowzach/doods/detector
detector/regions.go:50:13: region.Covers undefined (type *odrpc.DetectRegion has no field or method Covers)
Makefile:42: recipe for target 'doods' failed
make: *** [doods] Error 2
The command '/bin/sh -c make' returned a non-zero code: 2

Thanks!

sakalauskas commented 3 years ago

@mloebl I think you might have deleted too many things in detector.go

Only three lines need to be commented out (see below). Keep in mind that this file might have changed in the latest DOODS version; at the time of writing I was on HEAD 2a850c99740c30b19ba7d653751d5174f0a0253c.

package detector

import (
    "context"
    "sync"

    // We will support these formats
    _ "image/gif"
    _ "image/jpeg"
    _ "image/png"

    _ "github.com/lmittmann/ppm"
    _ "golang.org/x/image/bmp"

    emptypb "github.com/golang/protobuf/ptypes/empty"
    config "github.com/spf13/viper"
    "go.uber.org/zap"
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"

    "github.com/snowzach/doods/conf"
    "github.com/snowzach/doods/detector/dconfig"
    "github.com/snowzach/doods/detector/tensorflow"
//  "github.com/snowzach/doods/detector/tflite"
    "github.com/snowzach/doods/odrpc"
)

// Detector is the interface to object detectors
type Detector interface {
    Config() *odrpc.Detector
    Detect(ctx context.Context, request *odrpc.DetectRequest) (*odrpc.DetectResponse, error)
    Shutdown()
}

// Mux handles and routes requests to the configured detectors
type Mux struct {
    detectors map[string]Detector
    authKey   string
    logger    *zap.SugaredLogger
}

// Create a new mux
func New() *Mux {

    m := &Mux{
        detectors: make(map[string]Detector),
        authKey:   config.GetString("doods.auth_key"),
        logger:    zap.S().With("package", "detector"),
    }

    // Get the detectors config
    var detectorConfig []*dconfig.DetectorConfig
    config.UnmarshalKey("doods.detectors", &detectorConfig)

    // Create the detectors
    for _, c := range detectorConfig {
        var d Detector
        var err error

        m.logger.Debugw("Configuring detector", "config", c)

        switch c.Type {
//      case "tflite":
//          d, err = tflite.New(c)
        case "tensorflow":
            d, err = tensorflow.New(c)
        default:
            m.logger.Errorw("Could not initialize detector", "name", c.Name, "type", c.Type)
            continue
        }

        if err != nil {
            m.logger.Errorf("Could not initialize detector %s: %v", c.Name, err)
            continue
        }

        dc := d.Config()
        m.logger.Infow("Configured Detector", "name", dc.Name, "type", dc.Type, "model", dc.Model, "labels", len(dc.Labels), "width", dc.Width, "height", dc.Height)
        m.detectors[c.Name] = d
    }

    if len(m.detectors) == 0 {
        m.logger.Fatalf("No detectors configured")
    }

    return m

}

// GetDetectors returns the configured detectors
func (m *Mux) GetDetectors(ctx context.Context, _ *emptypb.Empty) (*odrpc.GetDetectorsResponse, error) {
    detectors := make([]*odrpc.Detector, 0)
    for _, d := range m.detectors {
        detectors = append(detectors, d.Config())
    }
    return &odrpc.GetDetectorsResponse{
        Detectors: detectors,
    }, nil
}

// Shutdown deallocates/shuts down any detectors
func (m *Mux) Shutdown() {
    for _, d := range m.detectors {
        d.Shutdown()
    }
}

// Run a detection
func (m *Mux) Detect(ctx context.Context, request *odrpc.DetectRequest) (*odrpc.DetectResponse, error) {

    if request.DetectorName == "" {
        request.DetectorName = "default"
    }

    detector, ok := m.detectors[request.DetectorName]
    if !ok {
        return nil, status.Errorf(codes.NotFound, "not found")
    }

    return detector.Detect(ctx, request)

}

// Handle a stream of detections
func (m *Mux) DetectStream(stream odrpc.Odrpc_DetectStreamServer) error {

    // Handle cancel
    ctx, cancel := context.WithCancel(stream.Context())
    go func() {
        select {
        case <-ctx.Done():
        case <-conf.Stop.Chan():
            cancel()
        }
    }()

    var send sync.Mutex
    var ret error
    for ctx.Err() == nil {

        request, err := stream.Recv()
        if err != nil {
            return nil
        }

        m.logger.Info("Stream Request")

        go func(request *odrpc.DetectRequest) {

            response, err := m.Detect(ctx, request)
            if err != nil {
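                // An Internal error is fatal: record it and cancel the stream.
                // Any other error is sent back to the client in the response.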
                if status.Code(err) == codes.Internal {
                    send.Lock()
                    ret = err
                    cancel()
                    send.Unlock()
                    return
                } else {
                    response = &odrpc.DetectResponse{
                        Id:    request.Id,
                        Error: err.Error(),
                    }
                }
            }

            send.Lock()
            stream.Send(response)
            send.Unlock()

        }(request)

    }

    return ret

}

eloebl commented 3 years ago

@sakalauskas Thank you for the quick reply! I get the same error with your file, so it may very well be a version issue. I'll try checking out that version specifically and see what happens.

eloebl commented 3 years ago

That was it! 2a850c9 built fine for me, thank you! Now to start playing with it :)

eloebl commented 3 years ago

@sakalauskas Hopefully the last question: I got it up and running using this config.yaml:

  detectors:
    - name: default
      type: tensorflow
      modelFile: models/faster_rcnn_inception_v2_coco_2018_01_28.pb
      labelFile: models/coco_labels1.txt
      numThreads: 1
      numConcurrent: 1

Looks like I may be running out of memory on the GPU?

2020-11-17 18:11:23.953762: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 48 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
2020-11-17 18:11:48.557262: W tensorflow/core/common_runtime/bfc_allocator.cc:419] Allocator (GPU_0_bfc) ran out of memory trying to allocate 10.12MiB (rounded to 10616832).  Current allocation summary follows.

It runs for about 25 seconds and then:

2020-11-17 18:11:48.575983: I tensorflow/core/common_runtime/bfc_allocator.cc:921] Sum Total of in-use chunks: 40.64MiB
2020-11-17 18:11:48.576005: I tensorflow/core/common_runtime/bfc_allocator.cc:923] total_region_allocated_bytes_: 50651136 memory_limit_: 50651136 available bytes: 0 curr_region_allocation_bytes_: 101302272
2020-11-17 18:11:48.576042: I tensorflow/core/common_runtime/bfc_allocator.cc:929] Stats: 
Limit:                    50651136
InUse:                    42619648
MaxInUse:                 42619648
NumAllocs:                     157
MaxAllocSize:             15925248

2020-11-17 18:11:48.576107: W tensorflow/core/common_runtime/bfc_allocator.cc:424] *************************************************************************************_______________
2020-11-17 18:11:48.576192: W tensorflow/core/framework/op_kernel.cc:1628] OP_REQUIRES failed at constant_op.cc:77 : Resource exhausted: OOM when allocating tensor of shape [3,3,576,512] and type float
2020-11-17 18:11:48.576274: E tensorflow/core/common_runtime/executor.cc:642] Executor failed to create kernel. Resource exhausted: OOM when allocating tensor of shape [3,3,576,512] and type float

Is there anything in the config.yaml I can adjust for this? Thank you!
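The numbers in the log above can be checked by hand. The sketch below is only arithmetic on values copied from that log (the tensor shape, `Limit`, and `InUse`); the helper name `tensorBytes` is made up for illustration, and it assumes the logged `float` type is 4 bytes wide.

```go
package main

import "fmt"

// tensorBytes returns the size in bytes of a dense tensor with the given
// element size and shape.
func tensorBytes(elemSize int64, shape ...int64) int64 {
	n := elemSize
	for _, d := range shape {
		n *= d
	}
	return n
}

func main() {
	const (
		limit int64 = 50651136 // "Limit" from the allocator stats
		inUse int64 = 42619648 // "InUse" from the allocator stats
	)
	// Shape [3,3,576,512] with 4 bytes per element.
	need := tensorBytes(4, 3, 3, 576, 512)
	free := limit - inUse

	fmt.Printf("need: %d bytes (%.2f MiB)\n", need, float64(need)/(1<<20))
	fmt.Printf("free: %d bytes (%.2f MiB)\n", free, float64(free)/(1<<20))
	fmt.Println("fits:", need <= free)
}
```

The tensor needs 10,616,832 bytes, which is the figure in the "rounded to 10616832" message, while the allocator has only 8,031,488 bytes left under its 50,651,136-byte limit. So the request cannot fit however the pool is fragmented, and the real constraint is the ~48 MB that TensorFlow was given when the device was created.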

sakalauskas commented 3 years ago

@mloebl Actually I get this error too (and I have the 4GB Jetson).

If the machine is already loaded with other tasks when you start the Docker container, it can't obtain the GPU memory. For some reason, the TensorFlow GPU device is created correctly right after boot. So as long as the container is started at boot, the GPU device should be created successfully. I have been running it for months with no issues:

doods    | 2020-11-17 18:27:00.453029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1046 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)

eloebl commented 3 years ago

@sakalauskas Got it. This is the 2GB board, and even with gdm disabled I'm still getting only about 178MB free, which sounds like the issue. Checking RAM usage, doods alone is about 1GB, not counting the rest of the services running, so that makes sense. I'm guessing even 178MB is not enough, as it's failing to process anything. Thanks again for the help. Maybe I'll look into the Coral units, as I can just plug one into my HA box.

sakalauskas commented 3 years ago

@mloebl I think you should try adding swap. It should help a bit since doods eats a lot of RAM (currently the Ubuntu install running doods uses 2.8G/3.87G of RAM and 1.9G/5.96G of swap).
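Before and after adding swap it helps to see what the kernel reports, since on the Nano the GPU shares system RAM with everything else. The following is a small sketch that reads /proc/meminfo, assuming the usual `Key:   value kB` line layout; `parseMeminfo` is a hypothetical helper and not part of DOODS.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseMeminfo reads "Key:   value kB" lines and returns the values in kB.
// Lines that do not match that layout are ignored.
func parseMeminfo(text string) map[string]int64 {
	out := map[string]int64{}
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 || !strings.HasSuffix(fields[0], ":") {
			continue
		}
		if v, err := strconv.ParseInt(fields[1], 10, 64); err == nil {
			out[strings.TrimSuffix(fields[0], ":")] = v
		}
	}
	return out
}

func main() {
	raw, err := os.ReadFile("/proc/meminfo")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read /proc/meminfo:", err)
		os.Exit(1)
	}
	m := parseMeminfo(string(raw))
	fmt.Printf("MemAvailable: %d MB\n", m["MemAvailable"]/1024)
	fmt.Printf("SwapTotal: %d MB, SwapFree: %d MB\n", m["SwapTotal"]/1024, m["SwapFree"]/1024)
	if m["SwapTotal"] == 0 {
		fmt.Println("no swap is configured")
	}
}
```

Running it before the container starts shows how much memory is actually available at the moment TensorFlow creates its GPU device.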

RichardPar commented 3 years ago

I have taken inspiration and written a C++ version of DOODS that runs natively on a Jetson Nano. It's a proof of concept currently, and based on this fantastic project.

Running it on a 4GB Nano

It should run on a 2GB Nano

              total        used        free      shared  buff/cache   available
Mem:        4057992     1665448     1748220       44128      644324     2184256
Swap:       2028992       27676     2001316

https://github.com/RichardPar/JetsonCUDA_DOODS

Regards, Richard

llego commented 3 years ago

This is great! I managed to get DOODS up and running on my Jetson Nano using TensorFlow. I used @sakalauskas's Dockerfile from the first post and the detector.go file a few posts down.

I'm running JetPack 4.3 (which comes with CUDA 10.0). I pulled master at b2a1c53. Docker Compose gives me a production-like object detection service for my Home Assistant server:

version: "3"
services:
  doods:
    image: llego/doods:jetsonnano
    container_name: doods
    volumes:
      - /opt/doods/models:/opt/doods/models
      - /opt/doods/config.yaml:/opt/doods/config.yaml
      - /usr/local/cuda-10.0/targets/aarch64-linux/lib:/usr/local/cuda-10.0/lib64
      - /usr/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu
    ports:
      - 8080:8080
    restart: unless-stopped

bgulla commented 3 years ago

@llego Just to save us all the steps, would you mind pushing your Docker image to Docker Hub?

llego commented 3 years ago

Actually I stopped using DOODS as I moved on to running Shinobi on my Jetson Nano. So I don't even have the docker image left. And probably there have been some changes in DOODS since I compiled it.

sakalauskas commented 3 years ago

@bgulla I moved away from DOODS as well and reinstalled Ubuntu on Jetson. I found https://github.com/blakeblackshear/frigate to be quite great. Sadly there is no hardware acceleration with Jetson yet.