OpenCV error loading images

andreasmarxer commented 4 years ago

I compiled the Makefile.txt with make successfully.

The loaded modules on the HPC cluster are:

But when I run darknet afterwards, I get an error loading the images, which only occurs when OpenCV is enabled (OpenCV=1). The output looks as follows when using the command:

./darknet detector test data/obj.data cfg/yolov3-obj.cfg backup/yolov3-obj_4000.weights -ext_output < data/validation_window1000.txt > result_test.txt

[Screenshot: opencv_error_when_training]

Without OpenCV it works. What could be the issue here? @AlexeyAB @cenit

AlexeyAB commented 4 years ago

Check that you have these images.

Also try with -ext_output -dont_show instead of -ext_output

andreasmarxer commented 4 years ago

Then only the "Cannot load image" error appears: [screenshot: dontshow]

AlexeyAB commented 4 years ago

So you don't have these images or don't have permissions.

andreasmarxer commented 4 years ago

I do have them, because when I compile with OpenCV=0 it works perfectly fine.

The thing is that I want to predict and save all images in the validation.txt file. And actually, without OpenCV it predicts all, but always saves the images as predictions.jpg and therefore overwrites the previous image.

I assumed that with OpenCV enabled, I could see all these validation images while testing. Additional features, e.g. the loss curve during training, would also be available with OpenCV.

AlexeyAB commented 4 years ago

Otherwise it seems that OpenCV can't load your images.

andreasmarxer commented 4 years ago

Yes, but what could be the reason for that?

And is there another way to predict all validation images and save them out of validation.txt without OpenCV?

andreasmarxer commented 4 years ago

I'm just looking for a way to predict a list of images and save them all with a different name, e.g. prediction_img1.jpg if the image name was img1, and all of this without OpenCV. @AlexeyAB With the current implementation, it always saves the image as predictions.jpg and overwrites the previous one.

cenit commented 4 years ago

@andreasmarxer your opencv was built without jpg support. Strange but definitely possible.
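One way to check this from the shell is to look at the OpenCV build summary, which lists the image codecs that were compiled in. The sketch below assumes the summary contains a line starting with `JPEG:` followed by either a library path/version or `NO`, which is how recent OpenCV releases print it. `has_jpeg_support` is a hypothetical helper name, not part of OpenCV or darknet.

```shell
# Hypothetical helper: reads an OpenCV build summary on stdin and
# succeeds (exit 0) only if a "JPEG:" line exists and its value is not "NO".
# The "JPEG 2000:" line is ignored because its first field is "JPEG", not "JPEG:".
has_jpeg_support() {
    awk '$1 == "JPEG:" && $2 != "NO" { ok = 1 } END { exit (ok ? 0 : 1) }'
}
```

If the `opencv_version` tool is installed alongside your OpenCV (older cluster builds may lack it or its `-v` flag), you could try `opencv_version -v | has_jpeg_support && echo "JPEG enabled"`. With the Python bindings available, `python -c "import cv2; print(cv2.getBuildInformation())"` prints the same summary.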

It might be interesting for you to rebuild dependencies using vcpkg. Install it and then launch ./vcpkg install darknet[full]. It will install darknet with all of its dependencies, including CUDA and CUDNN. Please let me know if it doesn't work

andreasmarxer commented 4 years ago

Thank you for your answer! @cenit Okay, this might be the issue, I will check. The thing is that I'm working on an HPC cluster and I didn't install OpenCV myself. I will check the permissions I have and try to install this package to build from. I'll keep you posted :)

cenit commented 4 years ago

You can always build packages in your personal folder, and vcpkg is made exactly for that. I remembered your HPC setup, which is why I suggested this route :) (It's also possible that the HPC admin made a really bad mistake by providing an OpenCV package without libjpeg support, so it might interest them too.)

andreasmarxer commented 4 years ago

Okay, this sounds good! I did the following:

1) clone the git repo and go into this folder
2) ./bootstrap-vcpkg.sh
3) ./vcpkg integrate install
4) run ./vcpkg install darknet[full], still in the vcpkg folder

Then the output leads to an error:

[Screenshot: vcpkg_1]

cenit commented 4 years ago

The problem is that vcpkg is very bleeding edge.

The portfile for cuda (luckily it is very simple) expects cuda 10.1, while you have only 9.x

If you open it (ports/cuda/portfile.cmake) you should be able to tailor it to your version, allowing you to cheat. The next problem, unfortunately, is CUDNN, which goes hand in hand with CUDA. You should modify that portfile accordingly as well. Then it should work. Please let me know if you encounter any other problem. If you succeed, please share the portfiles for CUDA 9 + CUDNN, for future reference and for people who might encounter the same problem.

andreasmarxer commented 4 years ago

I also have CUDA 10.1 preinstalled, so I think it makes sense to load it first and run all the steps again. But I'm not sure whether I need to change the loaded cuDNN version; currently it's still 7.0.0.

Edit: Still the same error occurs after loading cuda version 10.1.0.

cenit commented 4 years ago

Interesting. Let’s debug that, it might be easier.

Can you tell me what's the output of

echo $CUDA_PATH

of

echo $CUDA_TOOLKIT_ROOT_DIR

and of

echo $CUDA_BIN_PATH

?

andreasmarxer commented 4 years ago

I don't know why, but none of these commands produce any output. What am I doing wrong? [screenshot: echos]

cenit commented 4 years ago

Nothing wrong, but unfortunately that's the reason why the CUDA portfile fails. It seems the modules on your HPC machine have not been built correctly... Please manually set CUDA_PATH to point to the root folder of CUDA:

export CUDA_PATH=/path/to/cuda/folder

if setup is ok, running this command

$CUDA_PATH/bin/nvcc --version

should work and should not give you a "command not found" error. After that, retry installing CUDA and all other ports. To permanently fix the missing variable, add that export line to your .bashrc file.
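As a rough sketch of that check, the helpers below print the three variables asked about earlier and try to guess the CUDA root from wherever `nvcc` is found. Both function names are hypothetical, and the guess assumes that `nvcc` is already on PATH (for example after loading the cluster's CUDA module) and sits at `<root>/bin/nvcc`.

```shell
# Print each CUDA-related variable from this thread, or "<unset>" if it is empty.
report_cuda_env() {
    local name value
    for name in CUDA_PATH CUDA_TOOLKIT_ROOT_DIR CUDA_BIN_PATH; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        printf '%s=%s\n' "$name" "${value:-<unset>}"
    done
}

# Guess the CUDA root folder from the nvcc found on PATH.
# Assumes the usual <root>/bin/nvcc layout; fails if nvcc is not on PATH.
guess_cuda_root() {
    local nvcc_path
    nvcc_path="$(command -v nvcc)" || return 1
    dirname "$(dirname "$nvcc_path")"
}
```

A possible use is `export CUDA_PATH="$(guess_cuda_root)"` followed by `report_cuda_env` and the `$CUDA_PATH/bin/nvcc --version` check above. If the guess fails, set the path by hand as described.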

andreasmarxer commented 4 years ago

But how can it be that this CUDA installation issue only pops up when using OpenCV? Otherwise, when OpenCV is disabled, the training works fine with GPU.

What do you mean by "retry installing CUDA & all other ports"? The command ./vcpkg install darknet[full]?

cenit commented 4 years ago

I was discussing vcpkg only in my last comment. It requires a proper cuda installation to work, and a proper cuda installation provides those symbols :)

andreasmarxer commented 4 years ago

What do you mean by "those symbols"? The variables $CUDA_PATH, $CUDA_TOOLKIT... etc.?

cenit commented 4 years ago

Yes, please define them appropriately, using export like I was saying before, and then retry

andreasmarxer commented 4 years ago

I investigated this issue again, together with the cluster support. Indeed, it turned out that the OpenCV library was built without JPEG support. Now a new OpenCV version has been built with JPEG support.

Unfortunately, the issue persists with the new version as well, and the images cannot be loaded when using OpenCV. The error for ./darknet detector train data/obj-120.data cfg/yolov3-obj.cfg backup/darknet53.conv.74 -ext_output -dont_show is as follows: [screenshot: Comp_Node_OpenCV]

I also looked again at the variables mentioned by @cenit. I defined $CUDA_PATH as the root path of CUDA, but it didn't change anything about the error.

Do I also need to define further variables? The CUDA paths I know of on the cluster are the following:

Or do I also need to define the same for the OpenCV variables? These are the paths where OpenCV is located on the cluster, as far as I know:

AlexeyAB commented 4 years ago

Do you get this "Cannot load image" issue if you compile Darknet without OpenCV?

andreasmarxer commented 4 years ago

No, I only get it with OpenCV.

cenit commented 4 years ago

Can you please share the logs from the initial CMake configuration? Please delete the build_ folder and then re-run ./build.sh (if that is the way you are working), and copy all the messages that appear on the console here. It seems that CMake is still finding an OpenCV that was built without JPEG support.

barbara1990 commented 4 years ago

I'm just looking for a way to predict a list of images and save them all with a different name, e.g. prediction_img1.jpg if the image name was img1, and all of this without OpenCV. @AlexeyAB With the current implementation, it always saves the image as predictions.jpg and overwrites the previous one.

@andreasmarxer Have you found a solution? I would need it as well
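One possible workaround that needs no OpenCV is to call darknet once per image and rename predictions.jpg after each run. This is only a sketch: `prediction_name` and `predict_list` are hypothetical helper names, the default data/cfg/weights paths are copied from the command at the top of this thread, and it assumes darknet accepts the image path as the last positional argument of `detector test` and writes predictions.jpg into the current folder, as described above.

```shell
# Hypothetical helper: data/img1.jpg -> prediction_img1.jpg
prediction_name() {
    local base
    base="$(basename "$1")"
    printf 'prediction_%s.jpg\n' "${base%.*}"
}

# Run darknet once per image listed in a text file (one path per line)
# and move predictions.jpg to a per-image name in the output folder.
# DARKNET, DATA, CFG and WEIGHTS can be overridden via the environment.
predict_list() {
    local list="$1" outdir="${2:-.}" img
    mkdir -p "$outdir"
    while IFS= read -r img || [ -n "$img" ]; do
        [ -n "$img" ] || continue   # skip empty lines
        # stdin is redirected so darknet cannot consume the image list.
        "${DARKNET:-./darknet}" detector test \
            "${DATA:-data/obj.data}" \
            "${CFG:-cfg/yolov3-obj.cfg}" \
            "${WEIGHTS:-backup/yolov3-obj_4000.weights}" \
            "$img" -dont_show < /dev/null
        if [ -f predictions.jpg ]; then
            mv predictions.jpg "$outdir/$(prediction_name "$img")"
        else
            echo "no predictions.jpg produced for $img" >&2
        fi
    done < "$list"
}
```

For example, `predict_list data/validation_window1000.txt results` would leave one prediction_<name>.jpg per image in results/. The weights are reloaded for every image, so this is slow for long lists.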

AlexeyAB / darknet

OpenCV error loading images #4121