edusense / edusense

EduSense: Practical Classroom Sensing at Scale
https://www.edusense.io
BSD 3-Clause "New" or "Revised" License

Support OpenPose v1.6.0 release #10

Open dohyunkim-dev opened 4 years ago

dohyunkim-dev commented 4 years ago

Our codebase includes OpenPose v1.5.1 as a submodule, but the recent release of OpenPose v1.6.0 breaks our Docker build.

Error log:

Step 16/28 : RUN cmake /openpose && make -j 8 && make install
**OMITTED**
-- Found glog    (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libglog.so)
-- Found Protobuf: /usr/lib/x86_64-linux-gnu/libprotobuf.so;-lpthread (found version "3.0.0")
-- Found OpenCV: /usr (found version "3.2.0")
-- Caffe will be downloaded from source now. NOTE: This process might take several minutes depending
        on your internet connection.
-- Caffe has already been downloaded.
fatal: not a git repository: /openpose/3rdparty/caffe/../../../../../../.git/modules/compute/openpose/third_party/openpose/modules/3rdparty/caffe
-- Caffe will be built from source now.
-- Download the models.
-- Downloading BODY_25 model...
-- NOTE: This process might take several minutes depending on your internet connection.
CMake Error at cmake/Utils.cmake:8 (file):
  file DOWNLOAD HASH mismatch
    for file: [/openpose/models/pose/body_25/pose_iter_584000.caffemodel]
      expected hash: [78287b57cf85fa89c03f1393d368e5b7]
        actual hash: [d41d8cd98f00b204e9800998ecf8427e]
             status: [22;"HTTP response code said error"]
Call Stack (most recent call first):
  CMakeLists.txt:976 (download_model)
-- Not downloading body (COCO) model
-- Not downloading body (MPI) model
-- Downloading face model...
-- NOTE: This process might take several minutes depending on your internet connection.
CMake Error at cmake/Utils.cmake:8 (file):
  file DOWNLOAD HASH mismatch
    for file: [/openpose/models/face/pose_iter_116000.caffemodel]
      expected hash: [e747180d728fa4e4418c465828384333]
        actual hash: [d41d8cd98f00b204e9800998ecf8427e]
             status: [22;"HTTP response code said error"]
Call Stack (most recent call first):
  CMakeLists.txt:982 (download_model)
-- Downloading hand model...
-- NOTE: This process might take several minutes depending on your internet connection.
-- Models Downloaded.
CMake Error at cmake/Utils.cmake:8 (file):
  file DOWNLOAD HASH mismatch
    for file: [/openpose/models/hand/pose_iter_102000.caffemodel]
      expected hash: [a82cfc3fea7c62f159e11bd3674c1531]
        actual hash: [d41d8cd98f00b204e9800998ecf8427e]
             status: [22;"HTTP response code said error"]
Call Stack (most recent call first):
  CMakeLists.txt:984 (download_model)
-- Configuring incomplete, errors occurred!
See also "/app/build/CMakeFiles/CMakeOutput.log".
See also "/app/build/CMakeFiles/CMakeError.log".
The command '/bin/sh -c cmake /openpose && make -j 8 && make install' returned a non-zero code: 1
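
One detail in the log above may help with diagnosis: the "actual hash" reported for all three models, d41d8cd98f00b204e9800998ecf8427e, is the MD5 digest of empty input. Together with the "HTTP response code said error" status, this suggests the downloads returned no data at all, rather than corrupted model files. A minimal sketch to confirm this, assuming md5sum from GNU coreutils is available (the helper name empty_md5 is made up for illustration):

```shell
#!/bin/sh
# Hash reported as "actual hash" in the CMake errors above.
reported="d41d8cd98f00b204e9800998ecf8427e"

# Print the MD5 digest of zero bytes of input.
empty_md5() {
    printf '' | md5sum | cut -d' ' -f1
}

if [ "$(empty_md5)" = "$reported" ]; then
    echo "reported hash is the MD5 of an empty file"
fi
```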

Potential Solution

There are two potential solutions:

  1. Update the submodule to follow OpenPose 1.6.0 and update our patch files at https://github.com/edusense/edusense/tree/master/compute/openpose/edusense. The README at https://github.com/edusense/edusense/blob/master/compute/openpose/README.md has some guidance on how to create a patch file, etc.
  2. Find out whether the OpenPose team allows us to download the parameter files for a specific version (namely, v1.5.0).

JamKelley22 commented 4 years ago

Is there any work being done on this? I admit I may be confused, but the "<new edusense file>" mentioned in the OpenPose README is not in the edusense repo, so it can't be used to create the diff file. Am I understanding this problem correctly? Alternatively, is there a workaround to get the Docker build running again?

pranavdheer commented 4 years ago

@JamKelley22, I apologise for the late response. You are right, the "new Edusense file" is not in the repo. But given the patch and the original file, it is quite straightforward to retrieve the "new Edusense file". Simply run the command:

patch <original file> -i edusense.cpp.patch -o new_edusense_file.cpp

Note: the original file is /openpose/examples/tutorial_api_cpp/16_synchronous_custom_output.cpp and edusense.cpp.patch is at /edusense/compute/openpose/edusense/edusense.cpp.patch.

I am working on the OpenPose v1.6.0 integration. May I suggest that you stick to the current OpenPose 1.5.1 release while we fix this?
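
To illustrate how the patch command above recovers a modified file, here is a small round-trip sketch on throwaway files. It assumes the diff, patch, and cmp utilities are installed; the file names and the function name patch_roundtrip_demo are placeholders for illustration, not the real OpenPose or EduSense sources.

```shell
#!/bin/sh
# Round-trip demo: make a patch from two files, then rebuild the second
# file from the first one plus the patch.
patch_roundtrip_demo() {
    workdir=$(mktemp -d) || return 1
    printf 'int main() {\n    return 0;\n}\n' > "$workdir/original.cpp"
    printf 'int main() {\n    return 1;\n}\n' > "$workdir/modified.cpp"

    # diff exits with status 1 when the files differ, which is expected here;
    # any other non-zero status is treated as a failure.
    diff -u "$workdir/original.cpp" "$workdir/modified.cpp" > "$workdir/demo.patch" \
        || [ $? -eq 1 ] || return 1

    # Rebuild the modified file from the original plus the patch.
    patch -s -i "$workdir/demo.patch" -o "$workdir/recovered.cpp" \
        "$workdir/original.cpp" || return 1

    # cmp -s succeeds only if the recovered file matches the modified one.
    if cmp -s "$workdir/modified.cpp" "$workdir/recovered.cpp"; then
        status=0
    else
        status=1
    fi
    rm -rf "$workdir"
    return "$status"
}

patch_roundtrip_demo && echo "recovered file matches"
```

The same diff -u step, run against the real original file and an edited copy, is one way a new patch file could be produced for a newer OpenPose release.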

JamKelley22 commented 4 years ago

Thanks for the response @pranavdheer. I suppose I'm mostly confused since I'm still receiving this error when running the docker build command on the openpose folder. It seems that the master branch is already targeting the OpenPose 1.5.1 release, correct? Am I still receiving this error because I'm doing something incorrectly? I believe I've followed the other getting started directions.

pranavdheer commented 4 years ago

Yes, you are correct. The master branch is targeting the OpenPose 1.5.1 release. Could you please share your error log and walk me through the steps you followed? I am assuming you are using Linux and have already installed nvidia-docker as mentioned in the dependencies. Note that OpenPose is a submodule, so after cloning the Edusense repo you need to run the following command to fetch the OpenPose submodule:

git submodule update --init --recursive
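
A quick way to check whether the submodule was actually fetched is to look at the output of git submodule status, where a leading "-" marks a submodule that has not been initialized. The sketch below filters that output; the function name uninitialized_submodules and the sample input (including the submodule path, which is only guessed from the error log) are illustrative assumptions.

```shell
#!/bin/sh
# Read `git submodule status` output on stdin and print the paths of
# submodules that are not initialized (lines starting with "-").
uninitialized_submodules() {
    awk '/^-/ { print $2 }'
}

# Inside a clone: git submodule status --recursive | uninitialized_submodules
# The sample input below is made up for illustration.
printf '%s\n' \
    '-1111111111111111111111111111111111111111 compute/openpose/third_party/openpose' \
    ' 2222222222222222222222222222222222222222 some/other/module (v1.0)' \
    | uninitialized_submodules
```

If this prints any path for a real clone, running the submodule update command again should fetch it.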

JamKelley22 commented 4 years ago

Using Ubuntu 18.04.

  1. Cloned the repo using the --recursive flag.
  2. Installed nvidia-docker and docker-compose as the requirements said.
  3. Created the input and output directories under /compose.
  4. Placed the video.avi file under ./input.

Running:

sudo LOCAL_USER_ID=$(id -u) docker-compose -f docker-compose.compute.file.yml build

... Previous steps use cache ...

Step 16/28 : RUN cmake /openpose && make -j 8 && make install
 ---> Running in 31b307940764
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- GCC detected, adding compile flags
-- GCC detected, adding compile flags
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found version "10.0")
-- Building with CUDA.
-- CUDA detected: 10.0
-- Found cuDNN: ver. 7.6.5 found (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
-- Automatic GPU detection failed. Building for all known architectures.
-- Added CUDA NVCC flags for: sm_30 sm_35 sm_37 sm_50 sm_52 sm_53 sm_60 sm_61 sm_62 sm_70 sm_75
-- Found cuDNN: ver. 7.6.5 found (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
-- Found GFlags: /usr/include
-- Found gflags (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libgflags.so)
-- Found Glog: /usr/include
-- Found glog (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libglog.so)
-- Found Protobuf: /usr/lib/x86_64-linux-gnu/libprotobuf.so;-lpthread (found version "3.0.0")
-- Found OpenCV: /usr (found version "3.2.0")
-- Caffe will be downloaded from source now. NOTE: This process might take several minutes depending on your internet connection.
-- Caffe has already been downloaded.
fatal: not a git repository: /openpose/3rdparty/caffe/../../../../../../.git/modules/compute/openpose/third_party/openpose/modules/3rdparty/caffe
-- Caffe will be built from source now.
-- Download the models.
-- Downloading BODY_25 model...
-- NOTE: This process might take several minutes depending on your internet connection.

...

Step 28/28 : ENTRYPOINT ["./entrypoint.sh"]
 ---> Running in 3985c6013f7d
Removing intermediate container 3985c6013f7d
 ---> 6dc4b0916684
Successfully built 6dc4b0916684
Successfully tagged compose_openpose:latest

Then running:

sudo LOCAL_USER_ID=$(id -u) docker-compose -f docker-compose.compute.file.yml up

produces the following error:

Creating network "compose_edusense_backend" with driver "bridge"
Creating compose_video_1 ... done
Creating compose_openpose_1 ... error

ERROR: for compose_openpose_1 Cannot start service openpose: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v1.linux/moby/4f063e3b9d50af0f063776b21fd5fbd8d023c9423bf2f12c5536feaf133479d9/log.json: no such file or directory): fork/exec /usr/bin/nvidia-container-runtime: no such file or directory: unknown

ERROR: for openpose Cannot start service openpose: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v1.linux/moby/4f063e3b9d50af0f063776b21fd5fbd8d023c9423bf2f12c5536feaf133479d9/log.json: no such file or directory): fork/exec /usr/bin/nvidia-container-runtime: no such file or directory: unknown
ERROR: Encountered errors while bringing up the project.

JamKelley22 commented 4 years ago

Is this error related to this issue, or have I misread the error log? If so, could you point me to where I may be going wrong, if you can glean that from the log provided?

pranavdheer commented 4 years ago

It seems we can build the individual audio and video pipelines to try to isolate the error you are facing. Meanwhile, we are working on the docker-compose issue.

Step 1: Go to edusense/compute/openpose/ and run docker build . -t edusense/openpose:Developer

Step 2: Go to edusense/compute/audio/ and run docker build . -t edusense/audio:Developer

Step 3: Go to edusense/compute/video/ and run docker build . -t edusense/video:Developer

Could you please let me know if this runs successfully?

pranavdheer commented 4 years ago

Hi @JamKelley22, it seems I am able to replicate the issue. The error suggests that nvidia-container-runtime, the runtime environment used by the OpenPose pipeline, is missing (it may have been uninstalled). You may want to install it again; I recommend having a look at https://github.com/NVIDIA/nvidia-docker/issues/686. If you run the command dpkg -l '*nvidia*', you may notice that no NVIDIA runtime environment is listed. You can post a screenshot of the command output if you would like me to have a look.

Please let me know if this works for you. I also noticed another problem with the compose setup (not related to what you are facing; expect a commit solving that in the next few days).
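
Since the error message earlier in the thread ends with "fork/exec /usr/bin/nvidia-container-runtime: no such file or directory", a direct check for that binary may be a faster first test than reading the full dpkg listing. A minimal sketch, where the function name runtime_binary_present is made up for illustration and the default path is the one from the error message:

```shell
#!/bin/sh
# Succeeds if the given path (default: the path from the error message)
# exists and is executable.
runtime_binary_present() {
    [ -x "${1:-/usr/bin/nvidia-container-runtime}" ]
}

if runtime_binary_present; then
    echo "nvidia-container-runtime found"
else
    echo "nvidia-container-runtime missing; reinstalling the NVIDIA runtime may help"
fi
```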

JamKelley22 commented 4 years ago

Hi @pranavdheer, here is a complete log for the requested commands. I don't want to clog up the issue, so I made it a gist.
In dkpg_log.txt, nvidia-docker lists none as its version, but nvidia-docker2 lists 2.5.0-1. This should mean that it is still installed, correct?

pranavdheer commented 4 years ago

Could you give me the log of dpkg -l '*nvidia*'? I suspect that nvidia-docker may have been installed correctly but the runtime environment could be missing. While I go through the log, could you also do a fresh install of nvidia-docker?

JamKelley22 commented 4 years ago

That is included in my previous gist. If you search for "dkpg_log.txt" it's about halfway down the page.

mudar-memari-cmu commented 3 years ago

@pranavdheer @JamKelley22 Didn't we fix this a long time ago? Or is this still an issue to address?

JamKelley22 commented 3 years ago

I was able to fix the issue I was facing before. However, the OpenPose module being used in the master branch is still v1.5.1. Although parts of this error still appear, the container builds successfully.

mudar-memari-cmu commented 3 years ago

OpenPose v1.7.0: gineshidalgo99 released this on Nov 17, 2020.