MPI-Dortmund / cryolo

cryolo documentation

Instructions to build from source / Update dependencies #14

Open seb45tian opened 1 year ago

seb45tian commented 1 year ago

Hi there,

I would like to suggest creating instructions for properly building/installing cryolo without Anaconda/Miniconda and, while doing so, updating its dependencies to more recent versions where possible.

Some background

We are frequently asked to install cryolo on our HPC cluster, which, from an admin's perspective, is quite a nightmare. For HPC workloads we always try to provide software that runs as fast as possible, as this can save weeks or even months of computational time. However, this requires a software stack tightly integrated with the underlying hardware, which usually means that compilers, system libraries, MPI implementations, CUDA etc. are all compiled from source. I see two main problems here:

On software versions

While trying to install cryolo from source (i.e. installing the package provided on pypi.org), I had a look at the dependencies listed in setup.py. My biggest concern is the outdated, unofficial TensorFlow version (1.15.5) provided by NVIDIA to support CUDA 11. If at all possible, I think a switch to TF 2.x is necessary.

For the remaining dependencies, I made the tables below, listing the required version and the date when it was released. I know some packages no longer receive updates and are just "as is", but this usually concerns only minor packages. Especially for packages like keras, pandas, numpy, scipy, scikit-learn and tensorflow, it would be nice if the required versions were released in roughly the same time frame, as these packages often depend on each other.

Packages with strict requirements

| package | required version | date of release |
|---|---|---|
| protobuf | == 3.20.0 | Apr 2022 |
| imageio | >= 2.3.0, <= 2.15.0 | Mar 2018, Feb 2022 |
| pyStarDB | == 0.3.1 | Nov 2021 |
| lineenhancer | == 1.0.9.dev2 | Jan 2021 |
| pandas | == 1.1.4 | Oct 2020 |
| h5py | >= 2.5.0, < 3.0.0 | Apr 2015, Oct 2020 |
| tifffile | == 2020.9.3 | Sep 2020 |
| scikit-learn | == 0.23.2 | Aug 2020 |
| numpy | >= 1.16.0, < 1.19.0 | Jan 2019, Jun 2020 |
| Keras | == 2.3.1 | Oct 2019 |

Packages which allow more up-to-date versions

| package | required version | date of release |
|---|---|---|
| mrcfile | >= 1.3.0 | Feb 2021 |
| GooeyDev | >= 1.0.8b5 | Nov 2020 |
| wxPython | >= 4.1.0 | Apr 2020 |
| scipy | >= 1.3.0 | May 2019 |
| Pillow | >= 6.0.0 | Apr 2019 |
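
To see how far an existing environment deviates from the exact pins, the installed versions can be compared with the required ones. The following is a minimal sketch using only the Python standard library. The `PINS` dictionary copies just the `==` pins from the first table; `check_pins`, `_installed_version` and the `lookup` parameter are hypothetical names chosen for illustration and are not part of cryolo. The comparison is a plain string match and does not normalise version strings.

```python
from importlib import metadata

# Exact (==) pins copied from the "strict requirements" table.
PINS = {
    "protobuf": "3.20.0",
    "pyStarDB": "0.3.1",
    "lineenhancer": "1.0.9.dev2",
    "pandas": "1.1.4",
    "tifffile": "2020.9.3",
    "scikit-learn": "0.23.2",
    "Keras": "2.3.1",
}


def _installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=_installed_version):
    """Map each package name to (required, installed, matches).

    `lookup` is injectable so the check can be exercised without
    touching the real environment (an assumption made for this sketch).
    """
    report = {}
    for name, required in pins.items():
        installed = lookup(name)
        # Naive string comparison: no version normalisation is attempted.
        report[name] = (required, installed, installed == required)
    return report


if __name__ == "__main__":
    for name, (required, installed, ok) in check_pins(PINS).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:15s} required {required:12s} installed {installed} [{status}]")
```

Run inside the environment in question, this prints one line per pinned package and flags anything that is missing or differs from the pin.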

Thanks for reading, and I hope this does not come across as hostile - it is just meant to give another perspective and improve things in the long run. Happy to discuss things further or help testing in any way.

cheers

thorstenwagner commented 1 year ago

Thanks for this issue!

While I understand that libraries compiled from source are faster, the setup on our HPC cluster, which uses SLURM and environment modules, is quite straightforward:

https://cryolo.readthedocs.io/en/stable/other/other.html#cryolo-integration-as-environment-module

Why is this not an option?

I had a hard time trying to convert the current code base to TensorFlow 2.0; unfortunately, every attempt failed. That's why I'm using the NVIDIA build for now.

I will consider your request regarding strict environments and discuss it with my colleagues.

Thanks again for your input!

Best, Thorsten

seb45tian commented 1 year ago

Hi Thorsten,

Thanks for your reply!

My issue is not about whether cryolo can be installed using Miniconda as per the instructions - this works as intended, and we also provide modules to load it into the users' environment. I would just prefer not to use Miniconda :-)

If you take a look at all the packages Miniconda installs (i.e. using conda list):

https://gist.github.com/seb45tian/61d721197c816723311d8f8bdb052018

you will find a lot of non-Python system packages which are quite vital for good performance, e.g.

libopenblas
libblas
libgomp
libclang
...

My point is that such libraries should be optimised for the underlying system and hence should not be installed via Miniconda. In addition, when using Miniconda you have no control over the versions it picks for those dependencies: this week Miniconda might use libgomp 11.2.0, next week some packages change and it resolves to a newer version 12.x, and so on. This might or might not have implications for your code or its results, but it is certainly not a reproducible build.
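
One way to make this drift visible is to keep the text output of `conda list` from each install and compare the snapshots. Below is a minimal sketch, assuming the default `conda list` text format (comment lines starting with `#`, then whitespace-separated name, version and build columns). The function names `parse_conda_list` and `diff_environments` are hypothetical, and the two snapshots are made-up illustrations (including the build strings and the concrete 12.2.0 version), not real measurements.

```python
def parse_conda_list(text):
    """Parse the text output of `conda list` into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' header lines that conda prints.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 2:
            packages[fields[0]] = fields[1]
    return packages


def diff_environments(old, new):
    """Return {name: (old_version, new_version)} for every package that differs.

    A version of None means the package is absent from that snapshot.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


# Hypothetical snapshots for illustration only.
last_week = """\
# Name                    Version                   Build  Channel
libgomp                   11.2.0                    build_a
libopenblas               0.3.21                    build_b
"""
this_week = """\
# Name                    Version                   Build  Channel
libgomp                   12.2.0                    build_c
libopenblas               0.3.21                    build_b
"""

print(diff_environments(parse_conda_list(last_week), parse_conda_list(this_week)))
# prints {'libgomp': ('11.2.0', '12.2.0')}
```

An empty result means both installs resolved to the same package versions; any entry shows where the two builds diverged.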

I completely understand the struggle of converting the code to TF 2.x and know it must be quite time-consuming - it would just make combining cryolo with other packages more straightforward.

Thanks again for considering this input and good luck with the further development!