i4Ds / Karabo-Pipeline

The Karabo Pipeline can be used as Digital Twin for SKA
https://i4ds.github.io/Karabo-Pipeline/
MIT License

fixed OSKAR library not found in Dockerfile #462

Closed Lukas113 closed 1 year ago

Lukas113 commented 1 year ago

All of our Docker images up until version 0.18.0 had an issue: run_simulation failed because the OSKAR Python bindings claimed that the OSKAR library was not found (see image below). This is not actually true. The bindings just failed to import a sub-dependency. The real cause stayed hidden because they catch their own import error in settings_tree.py:

try:
    from . import _settings_lib
    from . import _apps_lib
except ImportError:
    _settings_lib = None
    _apps_lib = None

There they try to import _settings_lib.cpython-39-x86_64-linux-gnu.so and _apps_lib.cpython-39-x86_64-linux-gnu.so, which are their own .so files located in the oskar package directory. The main issue is that these files point to the exact CUDA runtime file libcudart.so.11.0 (an exact match, for whatever reason), which comes with cuda-toolkit 11.*, and it could not be found. This means that the system must provide cuda-toolkit 11.*, or OSKAR will fail when creating a settings tree.
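Because the try/except above replaces the real error with None, the underlying message only shows up when the extension modules are imported directly. The following is a minimal sketch of that check, not part of OSKAR or Karabo. The helper name diagnose_import is hypothetical, and the module names oskar._settings_lib and oskar._apps_lib are an assumption derived from the relative imports shown above.

```python
import importlib


def diagnose_import(module_name: str) -> tuple[bool, str]:
    """Import a module and return (ok, detail) instead of hiding the error."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # Keep the original message, which typically names the missing
        # shared object (for example a CUDA runtime file).
        return False, f"{type(exc).__name__}: {exc}"
    return True, "imported"


if __name__ == "__main__":
    # Assumed module names, derived from `from . import ...` in settings_tree.py.
    for name in ("oskar._settings_lib", "oskar._apps_lib"):
        ok, detail = diagnose_import(name)
        print(f"{name}: {detail}")
```

Run inside the affected container, this should print the original ImportError text rather than the misleading "library not found" message.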

cuda-toolkit 11.* was not available in our Docker environment. I thought it would come in as a dependency, since karabo already has some cuda-toolkit dependencies like cuda-cudart or libcufft. However, it was not available. I also tried to install the CUDA toolkit manually using conda install -c "nvidia/label/cuda-11.7.0" cuda-toolkit, but that didn't work either. I then tried to install the cuda-toolkit from the official developer.nvidia site. There I got an error saying that the glibc version (libc6 2.31-13+deb11u3) of the previously used base image continuumio/miniconda3 was too low, because installing the nvidia-toolkit 11.* requires libc6 >=2.34. Even their latest base image didn't meet the glibc requirement. I assume this is why the cuda-toolkit wasn't installed successfully, but I'm not sure, since I don't have experience with this kind of thing.
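The glibc comparison can be scripted so that a base image is checked before anything is installed on top of it. This is only a sketch: the helper names are hypothetical, the 2.34 threshold is simply the value reported by the installer error above, and platform.libc_ver() may return empty strings on systems where it cannot detect glibc.

```python
import platform


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.31' into (2, 31) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_at_least(required: str, found: str) -> bool:
    """Return True if the found glibc version meets the required minimum."""
    return parse_version(found) >= parse_version(required)


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        # 2.34 is the minimum reported in the error message, not an official figure.
        print(version, glibc_at_least("2.34", version))
    else:
        print("glibc version could not be determined")
```

Parsing into integer tuples matters here: a plain string comparison would rank "2.4" above "2.34".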

So in the end, what worked was switching to the base image nvidia/cuda:11.7.1-cudnn8-devel-ubuntu22.04, which has the nvidia-toolkit installed, so the required file is there. The glibc requirement is also met (libc6 = 2.35). The code shown in the image below succeeded in the new build. So I think it is fine to merge it as it is into the main branch (even though the images are still not fully tested).
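Whether an image provides the required runtime file can be checked roughly by asking the dynamic loader for it by its exact name. This is a sketch with a hypothetical helper can_load. It assumes that ctypes sees a similar library search path to the OSKAR extension modules, which is not guaranteed (for example if they rely on an RPATH).

```python
import ctypes


def can_load(library_name: str) -> tuple[bool, str]:
    """Ask the dynamic loader for a shared library by its exact file name."""
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        # The loader's message usually states that the file was not found.
        return False, str(exc)
    return True, "loaded"


if __name__ == "__main__":
    # File name taken from the issue description above.
    print(can_load("libcudart.so.11.0"))
```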

image

Lukas113 commented 1 year ago

Just to clarify: on my local machine with karabo installed, libcudart.so.11.0 is at ~/miniconda3/envs/karabo_dev_env/lib/libcudart.so.11.0 and is thus available through my virtual environment. I also have glibc version 2.35. Therefore, I think we shouldn't update our direct dependencies. I assume that Karabo just doesn't run on systems with an old glibc library for that reason.
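To see whether the active environment ships the file in the same place, one can look in the lib directory of the current Python prefix. This is a small sketch with a hypothetical helper find_in_env. It assumes a conda-style layout where libraries live directly under <prefix>/lib.

```python
import sys
from pathlib import Path


def find_in_env(file_name: str, prefix: str = sys.prefix) -> list[Path]:
    """Return matches for file_name directly under <prefix>/lib (non-recursive)."""
    lib_dir = Path(prefix) / "lib"
    if not lib_dir.is_dir():
        return []
    return sorted(lib_dir.glob(file_name))


if __name__ == "__main__":
    # Prints an empty list if the environment does not provide the runtime.
    print(find_in_env("libcudart.so.11.0"))
```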