visinf / irr

Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation (CVPR 2019)
Apache License 2.0

ImportError: libcudart.so.9.1: cannot open shared object file: No such file or directory #32

Closed RokiaAbdeen closed 3 years ago

RokiaAbdeen commented 3 years ago

Running import correlation_cuda fails with:
ImportError: libcudart.so.9.1: cannot open shared object file: No such file or directory
Versions: torch=1.1.0, torchvision=0.3.0

My system is Ubuntu 16.04. The command cat /usr/local/cuda/version.txt reports CUDA Version 9.0.176, but nvidia-smi reports CUDA Version: 11.2. How can I solve this problem, please?

The correlation CUDA package runs correctly without errors, but this error happens when I start the training.

hurjunhwa commented 3 years ago

Hi, when googling "libcudart.so.9.1: cannot open shared object file: No such file or directory", there are many relevant threads. Have you checked all of them? For example, this one: https://github.com/pytorch/pytorch/issues/10910
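As a first check, it may help to compare the CUDA runtime version that the error message asks for with the toolkit version that is actually installed. The sketch below is only an illustration: the helper names required_cudart_version and toolkit_version are hypothetical, and the two input strings are copied from the report above. If the two versions differ, one plausible explanation is that the extension was compiled against a different CUDA toolkit than the one currently installed.

```python
import re


def required_cudart_version(error_message):
    """Return the CUDA runtime version named in an ImportError message, or None."""
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", error_message)
    return match.group(1) if match else None


def toolkit_version(version_txt):
    """Return 'major.minor' from the text of /usr/local/cuda/version.txt, or None."""
    match = re.search(r"CUDA Version (\d+)\.(\d+)", version_txt)
    return f"{match.group(1)}.{match.group(2)}" if match else None


if __name__ == "__main__":
    # Both strings are taken from the report in this thread.
    message = "libcudart.so.9.1: cannot open shared object file: No such file or directory"
    installed = "CUDA Version 9.0.176"

    wanted = required_cudart_version(message)
    present = toolkit_version(installed)
    print("extension wants CUDA runtime:", wanted)
    print("installed toolkit:", present)
    if wanted != present:
        print("Mismatch: the extension may have been built against another toolkit.")
```

Note that nvidia-smi reports the highest CUDA version the driver supports, not the installed toolkit, so its value can legitimately differ from version.txt.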

RokiaAbdeen commented 3 years ago

Yes, I have checked them all, and that is what makes me so frustrated. I've removed the NVIDIA driver many times and changed the CUDA version many times, but I still don't know why this error happens :(

I am now working on Ubuntu 16.04, CUDA 9.0, torch 1.1.0, torchvision 0.3.0, and Python 3.7, and I am still getting this error. The correlation package runs without any errors; I only get this error when I try to train the model.

hurjunhwa commented 3 years ago

I see. That sounds so frustrating :(

If you think that the correlation layer could still be one of the causes, you could use the pure Python implementation that can be found here: https://github.com/google-research/google-research/blob/ec13eb6661a7b9500016cc6d7e3ab940c2dbf184/uflow/uflow_model.py#L88

I am sorry, but I don't know what could be the cause. After making sure that LD_LIBRARY_PATH and PATH properly point to the CUDA directories, can you maybe track down which line of source code causes the error?
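A rough way to do both checks from Python is sketched below. It lists every libcudart.so* file found in the directories on LD_LIBRARY_PATH and then retries the import with a full traceback, so the failing line is visible. The helper name find_cudart is hypothetical, and the script assumes it is run in the same environment (shell, conda env) that is used for training.

```python
import glob
import os
import traceback


def find_cudart(library_path):
    """List libcudart.so* files in each directory of a path-separated string."""
    found = []
    for directory in library_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        pattern = os.path.join(directory, "libcudart.so*")
        found.extend(sorted(glob.glob(pattern)))
    return found


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    print("LD_LIBRARY_PATH =", path or "(empty)")
    for library in find_cudart(path):
        print("  found:", library)

    try:
        import torch  # load PyTorch first so its shared libraries are available
        import correlation_cuda  # the extension named in the error above
        print("correlation_cuda imported successfully")
    except ImportError:
        # The traceback shows which import statement triggers the failure.
        traceback.print_exc()
```

If no libcudart.so.9.1 shows up in the listing while the error still asks for it, rebuilding the correlation package against the currently installed toolkit is probably worth trying.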