Closed Lukas113 closed 8 months ago
Attention: 20 lines in your changes are missing coverage. Please review.
Comparison is base (e0517ae) 65.76% compared to head (745d018) 65.72%.
There are still issues:

ImportError: libboost_python39.so.1.82.0: cannot open shared object file: No such file or directory

Not sure if bdsf is pointing to an exact libboost file, since it's compiled. If yes, then we first need to find a libboost version that is resolvable together with the Karabo dependencies, and then fix it in the build so that the exact version gets installed in the karabo-env again.

karabo-pipeline-0.21.dev0-py39_1000 requires aratmospy 1.0.dev0.*, but none of the providers can be installed

We probably have to rebuild Karabo-dev with adapted feedstock builds.

The conda-build workflow also fails, probably because the arguments are not passed through correctly or there is an error in an if-statement?

Ok, regarding boost: I found out what the issue is/was. The issue is that boost always needs to be pinned at runtime to the version that was present at build time. E.g. with casacore: when I tried installing another boost version into an environment with an already installed casacore, I got a wheel which had a different build-nr and package-hash.
Fortunately, conda-build provides a solution to this issue via pin_run_as_build, where you do a build for each boost version and pin that version into the corresponding build.
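A minimal sketch of what this looks like in the feedstock's conda_build_config.yaml (the listed boost versions and the pin granularity are assumptions, not the actual file):

```yaml
# conda_build_config.yaml (sketch) -- one build variant per listed boost version
boost:
  - "1.74"
  - "1.82"
# pin_run_as_build turns the build-time boost version into an equally strict
# run requirement, so each variant only runs against the exact version it was
# compiled with.
pin_run_as_build:
  boost:
    max_pin: x.x.x
```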
The according changes were made in the Karabo-Feedstock and the installation succeeded. Now we have a bdsf build for each compatible boost version >= 1.74 on anaconda.org, with the same build-nr but a different package-hash. Other things in the feedstock have also been adapted to enable decent dependency resolving and hopefully prevent build & installation errors.
The only issue that still remains is the issue with the docker-image (see previous post). As soon as this is resolved, I'd be happy about a review so that it's ready to merge.
I'm not really sure about the merging process, since the builds on the feedstock need to be triggered at the same time. Probably at release time? In addition, I'm not sure if it will break older karabo-releases, since there are new builds of the feedstock with the same package-version and therefore a higher build-nr. I think they shouldn't break? However, unfortunately I can't really test that.
There's currently an issue with the GitHub runners (see issue 3025) which prevents me from properly testing the workflows before they live in the default-branch.
What I mean is that it's not impossible; however, testing has the side-effect of building and publishing a docker-container each time I try to test some github-workflows which I changed.
For me this is pretty hard to review. Anything I should focus on? Overall, it seems OK but like I said, it's hard to test or review.
@kenfus Just do a review wherever you feel you can provide some feedback. If you don't feel comfortable reviewing the details, I suggest just focusing on the design-choices.
Please remember not to merge this PR because of the dev-dependencies in environment.yaml & meta.yaml.
Ok, merging should be fine now. I've pinned the feedstock-dependencies of karabo, and the dependencies within the feedstock itself, to fixed build-numbers. This should ensure that future updates of the same package-version with a higher build-number don't affect older installations. I've tested the installation locally, and everything seems to work fine:
I hope that I didn't break older conda-wheels with the updates I did. If they are broken, I don't see a good way to fix that. However, with the new setup of fixed build-numbers, this shouldn't happen again in the future.
The current implementation of fixed build-numbers in the feedstock is a little bit messy, because they're hard-coded there. However, I think the feedstock needs a refactoring anyway. Until we get to that point, we can leave it as it is.
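Pinning a dependency to a fixed build-number in a conda recipe looks roughly like this (a sketch of the idea; the version and build-number shown are hypothetical, not the feedstock's actual values):

```yaml
# meta.yaml (sketch) -- "<name> <version> <build-string-glob>" pins the exact
# build, so a later re-upload of the same version with a higher build-number
# is never picked up by the solver.
requirements:
  run:
    - bdsf 1.10.* *_1000
```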
As mentioned, the current Dockerfiles don't allow a native mpi-hook by sarus, because the mpi-implementation doesn't live in a standard location. @deiruch mentioned that mpich should be installable via yum or apt. Maybe this makes the installation of mpi easier, and it could be done on the github-runners. However, I think this should be done in a separate PR, since this PR is already too big, which is totally my fault.
LGTM, apart from the mpi-hook/mpich-compilation things we talked about last week. (Use binary mpich release, instead of recompiling it)
Ok, I'll have a look if it works locally and on the runners.
@deiruch mpich is installable through apt. However, the only version available for ubuntu-22.04 is 4.0-2. So, to ensure compatibility, I tried to pin mpich to 4.0 in the environment.yaml. However, this is not a resolvable environment; I got the error message posted below. The issue is that as soon as h5py is pinned to any mpich build, the environment is not resolvable. Thus, the only way I see to resolve this is to make a decision:
Either don't allow a native mpi-hook but keep an mpich-wheel of h5py, OR downgrade mpich from 4.1 to 4.0 and disallow an mpich-wheel of h5py.
Since I don't know the consequences of either decision, I'm not sure which one is better or worse.
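The two options roughly correspond to these environment.yaml fragments (a sketch only; the build-string globs follow the conda-forge mpi-variant naming convention, and the rest of the file is omitted):

```yaml
# Option 1: keep conda's mpich 4.1 and the mpi-enabled h5py build,
# giving up the native sarus mpi-hook
dependencies:
  - mpich
  - h5py=*=mpi_mpich_*

# Option 2: pin mpich to the apt-available 4.0 and forbid the mpi
# build of h5py
# dependencies:
#   - mpich=4.0
#   - h5py=*=nompi_*
```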
(karabo) root@9616a2df8d03:/repo# conda env update -f="environment.yaml"
Channels:
- i4ds
- nvidia/label/cuda-11.7.0
- conda-forge
- defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: failed
LibMambaUnsatisfiableError: Encountered problems while solving:
- package h5py-3.4.0-mpi_mpich_py310h83ac6b3_2 requires python >=3.10,<3.11.0a0, but none of the providers can be installed
Could not solve for environment specs
The following packages are incompatible
├─ bluebild 0.1.0 *_0 is installable with the potential options
│ ├─ bluebild 0.1.0 would require
│ │ └─ python >=3.9,<3.10.0a0 with the potential options
│ │ ├─ python [3.9.0|3.9.1|...|3.9.7], which can be installed;
│ │ ├─ python [3.9.0|3.9.1|...|3.9.9] would require
│ │ │ └─ python_abi 3.9.* *_cp39, which can be installed;
│ │ └─ python [3.9.10|3.9.12|3.9.16|3.9.17|3.9.18] would require
│ │ └─ python_abi 3.9 *_pypy39_pp73, which can be installed;
│ └─ bluebild 0.1.0 would require
│ ├─ python >=3.10,<3.11.0a0 , which conflicts with any installable versions previously reported;
│ └─ python_abi 3.10.* *_cp310, which conflicts with any installable versions previously reported;
├─ h5py * mpi_mpich* is installable with the potential options
│ ├─ h5py [2.10.0|2.9.0|...|3.6.0], which can be installed;
│ ├─ h5py [2.10.0|3.1.0] would require
│ │ └─ python >=3.6,<3.7.0a0 , which conflicts with any installable versions previously reported;
│ ├─ h5py [2.10.0|3.1.0|...|3.6.0] would require
│ │ └─ python >=3.7,<3.8.0a0 , which conflicts with any installable versions previously reported;
│ ├─ h5py [2.10.0|3.1.0|...|3.6.0] would require
│ │ └─ python >=3.8,<3.9.0a0 , which conflicts with any installable versions previously reported;
│ ├─ h5py [3.4.0|3.6.0] would require
│ │ └─ python >=3.10,<3.11.0a0 , which conflicts with any installable versions previously reported;
│ ├─ h5py [3.7.0|3.8.0] would require
│ │ ├─ python >=3.10,<3.11.0a0 , which conflicts with any installable versions previously reported;
│ │ └─ python_abi 3.10.* *_cp310, which conflicts with any installable versions previously reported;
│ ├─ h5py [3.7.0|3.8.0] would require
│ │ ├─ python >=3.11,<3.12.0a0 , which conflicts with any installable versions previously reported;
│ │ └─ python_abi 3.11.* *_cp311, which conflicts with any installable versions previously reported;
│ ├─ h5py 3.7.0 would require
│ │ ├─ python >=3.7,<3.8.0a0 , which conflicts with any installable versions previously reported;
│ │ └─ python_abi 3.7.* *_cp37m, which conflicts with any installable versions previously reported;
│ ├─ h5py [3.7.0|3.8.0] would require
│ │ ├─ python >=3.8,<3.9.0a0 , which conflicts with any installable versions previously reported;
│ │ └─ python_abi 3.8.* *_cp38, which conflicts with any installable versions previously reported;
│ ├─ h5py [3.7.0|3.8.0] would require
│ │ └─ hdf5 >=1.12.2,<1.12.3.0a0 mpi_mpich_* with the potential options
│ │ ├─ hdf5 1.12.2 would require
│ │ │ └─ openssl >=1.1.1q,<1.1.2a , which can be installed;
│ │ ├─ hdf5 1.12.2, which can be installed;
│ │ └─ hdf5 [1.12.2|1.14.0] would require
│ │ └─ openssl >=1.1.1s,<1.1.2a , which can be installed;
│ ├─ h5py 3.7.0 would require
│ │ └─ hdf5 >=1.12.1,<1.12.2.0a0 mpi_mpich_* with the potential options
│ │ ├─ hdf5 1.12.1 would require
│ │ │ └─ openssl >=1.1.1l,<1.1.2a , which can be installed;
│ │ ├─ hdf5 1.12.1, which can be installed;
│ │ └─ hdf5 1.12.1 would require
│ │ └─ openssl >=1.1.1k,<1.1.2a , which can be installed;
│ ├─ h5py [3.10.0|3.9.0] would require
│ │ └─ mpich >=4.1.2,<5.0a0 , which can be installed;
│ ├─ h5py [3.10.0|3.9.0] would require
│ │ └─ python_abi 3.12.* *_cp312, which conflicts with any installable versions previously reported;
│ ├─ h5py 3.8.0 would require
│ │ └─ hdf5 >=1.14.0,<1.14.1.0a0 mpi_mpich_* with the potential options
│ │ ├─ hdf5 [1.12.2|1.14.0], which can be installed (as previously explained);
│ │ ├─ hdf5 1.14.0 would require
│ │ │ └─ openssl >=1.1.1t,<1.1.2a , which can be installed;
│ │ └─ hdf5 1.14.0, which can be installed;
│ ├─ h5py 3.9.0 would require
│ │ └─ mpich >=4.1.1,<5.0a0 , which can be installed;
│ └─ h5py 3.9.0 would require
│ └─ hdf5 >=1.14.1,<1.14.2.0a0 mpi_mpich_*, which can be installed;
├─ montagepy 6.0.0 *_0 is uninstallable because there are no viable options
│ ├─ montagepy 6.0.0 would require
│ │ ├─ python >=3.10,<3.11.0a0 , which conflicts with any installable versions previously reported;
│ │ └─ python_abi 3.10.* *_cp310, which conflicts with any installable versions previously reported;
│ └─ montagepy 6.0.0 would require
│ ├─ python >=3.9,<3.10.0a0 with the potential options
│ │ ├─ python [3.9.0|3.9.1|...|3.9.7], which can be installed;
│ │ ├─ python [3.9.0|3.9.1|...|3.9.9], which cannot be installed (as previously explained);
│ │ └─ python [3.9.10|3.9.12|3.9.16|3.9.17|3.9.18], which can be installed (as previously explained);
│ └─ python_abi 3.9.* *_cp39 but there are no viable options
│ ├─ python_abi 3.9 conflicts with any installable versions previously reported;
│ └─ python_abi 3.9 would require
│ └─ python 3.9.* *_cpython, which conflicts with any installable versions previously reported;
├─ mpich 4.0** is uninstallable because it conflicts with any installable versions previously reported;
├─ oskarpy 2.8.3 *_0 is uninstallable because there are no viable options
│ ├─ oskarpy 2.8.3 would require
│ │ ├─ oskar 2.8.3 *_0, which requires
│ │ │ └─ hdf5 >=1.14.2,<1.14.3.0a0 but there are no viable options
│ │ │ ├─ hdf5 1.14.2 conflicts with any installable versions previously reported;
│ │ │ └─ hdf5 1.14.2 would require
│ │ │ └─ openssl >=3.1.2,<4.0a0 , which conflicts with any installable versions previously reported;
│ │ └─ python_abi 3.10.* *_cp310, which conflicts with any installable versions previously reported;
│ └─ oskarpy 2.8.3 would require
│ ├─ oskar 2.8.3 *_0, which cannot be installed (as previously explained);
│ └─ python >=3.9,<3.10.0a0 with the potential options
│ ├─ python [3.9.0|3.9.1|...|3.9.7], which can be installed;
│ ├─ python [3.9.0|3.9.1|...|3.9.9], which cannot be installed (as previously explained);
│ └─ python [3.9.10|3.9.12|3.9.16|3.9.17|3.9.18], which can be installed (as previously explained);
├─ python 3.9** is installable with the potential options
│ ├─ python [3.9.0|3.9.1|...|3.9.7], which can be installed;
│ ├─ python [3.9.0|3.9.1|...|3.9.9], which cannot be installed (as previously explained);
│ └─ python [3.9.10|3.9.12|3.9.16|3.9.17|3.9.18], which can be installed (as previously explained);
├─ ska-gridder-nifty-cuda 0.3.0 *_0 is uninstallable because there are no viable options
│ ├─ ska-gridder-nifty-cuda 0.3.0 would require
│ │ └─ python_abi 3.9.* *_cp39 but there are no viable options
│ │ ├─ python_abi 3.9 conflicts with any installable versions previously reported;
│ │ └─ python_abi 3.9, which cannot be installed (as previously explained);
│ └─ ska-gridder-nifty-cuda 0.3.0 would require
│ └─ python >=3.10,<3.11.0a0 , which conflicts with any installable versions previously reported;
├─ ska-sdp-datamodels 0.1.3 *_0 is installable and it requires
│ └─ h5py >=3.7.0 with the potential options
│ ├─ h5py [3.7.0|3.8.0], which cannot be installed (as previously explained);
│ ├─ h5py [3.7.0|3.8.0], which cannot be installed (as previously explained);
│ ├─ h5py 3.7.0, which cannot be installed (as previously explained);
│ ├─ h5py [3.7.0|3.8.0], which cannot be installed (as previously explained);
│ ├─ h5py [3.7.0|3.8.0], which can be installed (as previously explained);
│ ├─ h5py 3.7.0, which can be installed (as previously explained);
│ ├─ h5py [3.10.0|3.9.0], which can be installed (as previously explained);
│ ├─ h5py [3.10.0|3.9.0], which cannot be installed (as previously explained);
│ ├─ h5py 3.8.0, which can be installed (as previously explained);
│ ├─ h5py 3.9.0, which can be installed (as previously explained);
│ ├─ h5py 3.9.0, which can be installed (as previously explained);
│ └─ h5py [3.10.0|3.7.0|3.8.0|3.9.0] conflicts with any installable versions previously reported;
└─ tools21cm 2.0.2 *_0 is installable and it requires
└─ python 3.9.* with the potential options
├─ python [3.9.0|3.9.1|...|3.9.7], which can be installed;
├─ python [3.9.0|3.9.1|...|3.9.9], which cannot be installed (as previously explained);
└─ python [3.9.10|3.9.12|3.9.16|3.9.17|3.9.18], which can be installed (as previously explained).
For now we don't need h5py with mpi. It's only useful to allow parallel writing of HDF5 files, which we don't use AFAIK (see https://docs.h5py.org/en/stable/mpi.html).
I've discovered a bug which I have to fix. Currently, for docker-containers everything works as intended. However, on Singularity and Sarus containers the conda-setup doesn't work yet. This means that there, the wrong interpreter is selected by default. Also, switching venvs is not straightforward, because conda activate doesn't work out of the box.
For some reason, CMD and ENTRYPOINT don't seem to work for non-docker containers (maybe just the way I used them, or entirely? not sure though). However, I need them, because otherwise the .bashrc won't get executed, which is essential to switch a conda venv. Note: we need a venv-setup so that we don't accidentally damage the base-environment.
Currently, I'm not sure how to solve this issue. Therefore, it's work in progress.
There's also a bug with the mpich-hook. The installation in the container (conda dependency resolving & apt-installation) and the conda-link to the standard location worked. However, the hook didn't. This is also work in progress.
The issue of not being able to switch venvs in singularity containers is not straightforward. It lies in how docker and singularity handle user-ids, and in which directories and files get mounted by singularity, which makes the initialization of conda very difficult. The user-id issue is that one enters a singularity container with the same user-id as the user who launched the process, while the files & directories of docker images are created as root. A detailed description of the issue can be seen here. Currently I'm exploring solutions. I want to avoid creating a Singularity file, because that would imply that one has to get this file before creating a singularity container, which is not optimal.
Ok, the singularity issue seems to be fixed. The following docker-commands did the job:
# set bash-env accordingly for interactive and non-interactive shells for docker & singularity
RUN mkdir -p /opt/etc && \
    echo "conda activate karabo" >> ~/.bashrc && \
    sed -n '/conda initialize/,/conda activate/p' ~/.bashrc > /opt/etc/conda_init_script
ENV BASH_ENV=/opt/etc/conda_init_script
RUN echo "source $BASH_ENV" >> /etc/bash.bashrc && \
echo "source $BASH_ENV" >> /etc/profile
It configures the conda init & activate process for each use-case: interactive and non-interactive bash-shells for docker & singularity.
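The mechanism relies on bash sourcing the file named by BASH_ENV when it starts non-interactively, which is how singularity typically enters a container. A minimal standalone demo of that behaviour (temp file and message are made up for illustration):

```shell
# bash reads $BASH_ENV on non-interactive startup, so whatever the file
# contains (e.g. conda init + activate) runs before the actual command.
init=$(mktemp)
echo 'echo "conda env would be activated here"' > "$init"
# the command itself is a no-op; the output comes from the BASH_ENV file
BASH_ENV="$init" bash -c ':'
rm -f "$init"
```

This is why writing the init script to a root-owned location like /opt/etc and exporting BASH_ENV via ENV works even when singularity mounts the user's home over ~/.bashrc.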
Still not working are Singularity containers (due to permission issues) and the native mpi-hook of Sarus containers. As discussed in the meeting on 21-12-23, these will be fixed in another PR.
I've fixed the mpi-issue and have tested a simple example on CSCS:

srun -N2 -n2 -C gpu -A sk05 sarus run --mpi ghcr.io/i4ds/karabo-pipeline:0.21.dev1 python -c 'print("ABCDEFG")'

It worked with and without the mpi-hook (output was 2x ABCDEFG). The only thing left to do is to make karabo also work for singularity containers, which is work in progress.
Also, all mpi-tests passed with the following command:

srun -N2 -n2 -C gpu -A sk05 sarus run --mpi ghcr.io/i4ds/karabo-pipeline:0.21.dev1 pytest --only-mpi /opt/conda/envs/karabo/lib/python3.9/site-packages/karabo/test
Closes #449, closes #189.
For testing purposes, this PR needs its dev-dependencies, referenced in environment.yaml and meta.yaml. However, BEFORE merging into the default branch, they need to be adapted, and the according builds of the Karabo-Feedstock must be built at more or less the same time.

This PR was initially intended to be much smaller. However, because I wasn't able to test my changes with the old setup, I also had to define a dev & test-setup for Karabo to not interfere with the live conda-wheel and docker-images.

This PR also includes changes in the Karabo-Feedstock, where a lot of dependencies were adjusted. The original reason was that we had some open-mpi builds which needed to be adapted. However, while I was working on this PR, we decided to remove the mpi-builds from Karabo with PR #521, which may make it seem like that work was unnecessary. However, I also needed to adapt the build-string setup to enable best-practice conda-builds, where the build string usually depends on the python-version, package-hash and build-nr. Only then is the conda installer able to select the dependency wheels properly and decide whether to download & install a new wheel or take it from the cache. The same goes for dev-builds. In addition, I loosened well-known and stable dependency-constraints. This had the positive side-effect that we are no longer as constrained as we used to be regarding upgrading dependencies, which solves issue #449.
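The build-string convention mentioned above (python-version + package-hash + build-nr) can be sketched in a recipe roughly like this (an assumption about the concrete template, not the feedstock's actual file; conda-build's default string already follows this pattern):

```yaml
# meta.yaml (sketch) -- yields build strings like py39h1a2b3c4_0, so the
# solver can distinguish variants by python version, input hash and
# build-number, and cache lookups stay unambiguous.
build:
  number: 0
  string: "py{{ py }}h{{ PKG_HASH }}_{{ PKG_BUILDNUM }}"
```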
This PR has the following changes in this repository:
- Loosened dependency-constraints of e.g. numpy, dask or xarray.
- versioneer config in pyproject.toml instead of setup.cfg -> got rid of requirements.txt.
- xarray>2023.2 because of numpy or xarray.