stark-t / PAI

Pollination_Artificial_Intelligence

detectron2 - installation & setting the environment, ImportError: libtorch_cuda_cu.so: cannot open shared object file #40

Closed valentinitnelav closed 1 year ago

valentinitnelav commented 2 years ago

FYI: I will also ask the cluster team at Uni Leipzig for help because, as with ScaledYOLOv4, there are dependency issues and certain modules need to be loaded from the software tree.

module purge
module load PyTorch/1.10.0-foss-2021a-CUDA-11.3.1
module load TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1
module load torchvision/0.11.1-foss-2021a-CUDA-11.3.1

python -m venv ~/venv/detectron2
source ~/venv/detectron2/bin/activate

pip install --upgrade pip
pip install opencv-python

pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html

# ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. 
# This behaviour is the source of the following dependency conflicts.
# pandas 1.2.4 requires pytz>=2017.3, which is not installed.

# Test Inference Demo with Pre-trained Models
cd PAI/detectors/detectron2/demo
wget https://farm9.staticflickr.com/8267/8918904805_727d988709_z.jpg -q -O input1.jpg
wget https://farm1.staticflickr.com/215/492060815_ec07c64c09_z.jpg -q -O input2.jpg

python demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
  --input input1.jpg input2.jpg \
  --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl

# Traceback (most recent call last):
#   File "/PAI/detectors/detectron2/demo/demo.py", line 14, in <module>
#     from detectron2.data.detection_utils import read_image
#   File "/venv/detectron2/lib/python3.9/site-packages/detectron2/data/__init__.py", line 4, in <module>
#     from .build import (
#   File "/venv/detectron2/lib/python3.9/site-packages/detectron2/data/build.py", line 13, in <module>
#     from detectron2.structures import BoxMode
#   File "/venv/detectron2/lib/python3.9/site-packages/detectron2/structures/__init__.py", line 3, in <module>
#     from .image_list import ImageList
#   File "/venv/detectron2/lib/python3.9/site-packages/detectron2/structures/image_list.py", line 8, in <module>
#     from detectron2.layers.wrappers import shapes_to_tensor
#   File "/venv/detectron2/lib/python3.9/site-packages/detectron2/layers/__init__.py", line 3, in <module>
#     from .deform_conv import DeformConv, ModulatedDeformConv
#   File "/venv/detectron2/lib/python3.9/site-packages/detectron2/layers/deform_conv.py", line 11, in <module>
#     from detectron2 import _C
# ImportError: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory

deactivate
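The `ImportError` above means the dynamic loader could not find `libtorch_cuda_cu.so` when importing `detectron2._C`. A first diagnostic step is to check whether the PyTorch installation provided by the loaded module ships a file with that name at all. The following is a minimal sketch, not something run in this thread: `find_shared_lib` is a hypothetical helper, and the assumption that PyTorch keeps its shared libraries in a `lib` directory next to `torch/__init__.py` should be verified on the cluster. One possible explanation, not confirmed here, is that the prebuilt detectron2 wheel was linked against a PyTorch build that ships this library while the cluster's module build does not.

```shell
# Sketch: look for a shared library in a list of directories.
# `find_shared_lib` is a hypothetical helper, not part of detectron2 or PyTorch.
find_shared_lib() {
  wanted_lib="$1"
  shift
  for search_dir in "$@"; do
    if [ -e "$search_dir/$wanted_lib" ]; then
      # Print the first match and report success.
      printf '%s\n' "$search_dir/$wanted_lib"
      return 0
    fi
  done
  # No directory contained the library.
  return 1
}

# Usage on the cluster (assumes the modules are loaded and the venv is active):
#   torch_lib="$(python -c 'import os, torch; print(os.path.join(os.path.dirname(torch.__file__), "lib"))')"
#   find_shared_lib libtorch_cuda_cu.so "$torch_lib" || echo "not found in $torch_lib"
```

If the library is not found there, reinstalling the same prebuilt wheel is unlikely to change the outcome, because the file is expected to come from PyTorch rather than from detectron2.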

The environment looks like this for now:

pip list

Package                    Version
-------------------------- ------------------
absl-py                    0.13.0
antlr4-python3-runtime     4.9.3
appdirs                    1.4.4
astor                      0.8.1
astunparse                 1.6.3
black                      21.4b2
Bottleneck                 1.3.2
cachetools                 4.2.2
certifi                    2022.6.15
charset-normalizer         2.1.0
clang                      5.0
click                      8.1.3
cloudpickle                2.1.0
cycler                     0.11.0
deap                       1.3.1
detectron2                 0.6+cu113
dill                       0.3.3
expecttest                 0.1.3
flatbuffers                2.0
fonttools                  4.34.4
future                     0.18.2
fvcore                     0.1.5.post20220512
gast                       0.4.0
google-auth                1.35.0
google-auth-oauthlib       0.4.5
google-pasta               0.2.0
grpcio                     1.39.0
gviz-api                   1.9.0
h5py                       3.2.1
hydra-core                 1.2.0
idna                       3.3
iopath                     0.1.9
keras                      2.6.0
Keras-Preprocessing        1.1.2
kiwisolver                 1.4.3
Markdown                   3.3.4
matplotlib                 3.5.2
mpi4py                     3.0.3
mpmath                     1.2.1
mypy-extensions            0.4.3
numexpr                    2.7.3
numpy                      1.20.3
oauthlib                   3.1.1
omegaconf                  2.2.2
opencv-python              4.6.0.66
opt-einsum                 3.3.0
packaging                  21.3
pandas                     1.2.4
pathspec                   0.9.0
Pillow                     8.2.0
pip                        22.1.2
portalocker                2.5.1
portpicker                 1.4.0
protobuf                   3.17.3
pyasn1                     0.4.8
pyasn1-modules             0.2.8
pybind11                   2.6.2
pycocotools                2.0.4
pydot                      1.4.2
pyparsing                  3.0.9
python-dateutil            2.8.2
PyYAML                     5.4.1
regex                      2022.7.9
requests                   2.28.1
requests-oauthlib          1.3.0
rsa                        4.7.2
scipy                      1.6.3
setuptools                 56.0.0
six                        1.16.0
tabulate                   0.8.10
tblib                      1.7.0
tensorboard                2.6.0
tensorboard-data-server    0.6.1
tensorboard-plugin-profile 2.5.0
tensorboard-plugin-wit     1.8.0
tensorflow                 2.6.0
tensorflow-estimator       2.6.0
termcolor                  1.1.0
toml                       0.10.2
torch                      1.10.0
torchvision                0.11.1
tqdm                       4.64.0
typing-extensions          3.10.0.0
urllib3                    1.26.10
Werkzeug                   2.0.1
wheel                      0.37.1
wrapt                      1.12.1
yacs                       0.1.8

valentinitnelav commented 2 years ago

I tried `pip install pytz` in the environment, but this didn't solve the issue: the demo still fails with the same error message.

# Update the git repo
cd ~/PAI/detectors/
git pull

module purge
module load PyTorch/1.10.0-foss-2021a-CUDA-11.3.1
module load TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1
module load torchvision/0.11.1-foss-2021a-CUDA-11.3.1

source ~/venv/detectron2/bin/activate

pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html

# Didn't get any error message; only a series of `Requirement already satisfied` messages for each dependency.

pip install pytz
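The index URL passed to `pip install detectron2 -f` encodes both the CUDA version (`cu113`) and the PyTorch version (`torch1.10`), so it has to match the modules that are loaded. As a sketch of how that URL relates to the module versions, the hypothetical helper `detectron2_wheel_index` below derives it from version strings such as those in the module names. The URL pattern is copied from the command used in this thread; whether an index exists for any other version combination is an assumption that must be checked.

```shell
# Sketch: build the detectron2 wheel index URL from a CUDA version and a
# PyTorch version. `detectron2_wheel_index` is a hypothetical helper name.
detectron2_wheel_index() {
  cuda_version="$1"    # e.g. 11.3.1
  torch_version="$2"   # e.g. 1.10.0

  # Keep only the major and minor components: 11.3.1 -> 11 and 3.
  cuda_major="${cuda_version%%.*}"
  cuda_rest="${cuda_version#*.}"
  cuda_minor="${cuda_rest%%.*}"

  torch_major="${torch_version%%.*}"
  torch_rest="${torch_version#*.}"
  torch_minor="${torch_rest%%.*}"

  printf 'https://dl.fbaipublicfiles.com/detectron2/wheels/cu%s%s/torch%s.%s/index.html\n' \
    "$cuda_major" "$cuda_minor" "$torch_major" "$torch_minor"
}

# Versions taken from the module names loaded above.
detectron2_wheel_index 11.3.1 1.10.0
# prints https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
```

Matching version numbers do not guarantee binary compatibility: a prebuilt wheel can still fail to import if the PyTorch build it was compiled against lays out its shared libraries differently from the locally installed one.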


valentinitnelav commented 1 year ago

At this initial stage of the PAI project, we decided not to pursue detectron2 any further. detectron2 looks somewhat heavier than we need: because we want to deploy the trained weights on an edge device, we need to focus on lightweight models.