fedbiomed / fedbiomed

A collaborative learning framework for empowering biomedical research
https://fedbiomed.org

[merge Flamby] specify Flamby package in fedbiomed-node/researcher.yaml #337

Closed srcansiz closed 1 year ago

srcansiz commented 2 years ago

In GitLab by @marcolorenzi on Aug 22, 2022, 11:21

Using Flamby in Fed-BioMed requires specifying the Flamby installation directory in the .yaml file (node + researcher). Currently this is done by hand. How can we enable and automate this option at installation time?

srcansiz commented 2 years ago

In GitLab by @sharkovsky on Sep 12, 2022, 14:58

Even before configuring the yaml file, there is the problem that the Flamby installation itself is very manual. The problems are:

  1. installation from git directory: there is no notion of a release. We may wish to pin to a specific commit.
  2. installation from git directory: PyTorch and several other packages were reinstalled during make install. This could potentially create a conflict.
  3. make install insists on creating a new conda environment.

However, I see now that there is a setup.py and indeed it seems that we are only relying on that (and assuming we could git clone the folder). So maybe issues 2 and 3 above do not apply.

srcansiz commented 2 years ago

In GitLab by @sharkovsky on Sep 12, 2022, 15:03

We may consider something like this instead:

pip install git+https://github.com/owkin/FLamby@main

see here
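A minimal sketch of how the "freeze to a certain commit" idea could combine with the pip approach above: pip accepts a Git ref (branch, tag, or commit hash) after the `@` in a VCS URL, so the requirement line written into the node/researcher yaml could be generated rather than typed by hand. The helper name `pinned_git_requirement`, the distribution name `flamby`, and the ref values used here are illustrative assumptions, not something confirmed in this thread.

```python
from typing import Optional


def pinned_git_requirement(repo_url: str, ref: str, name: Optional[str] = None) -> str:
    """Build a pip VCS requirement string pinned to a Git ref.

    `ref` may be a branch, a tag, or a commit hash. If `name` is given,
    the PEP 508 direct-reference form "name @ url" is returned instead.
    """
    if not ref:
        raise ValueError("a non-empty ref (commit hash, tag, or branch) is required")
    spec = f"git+{repo_url}@{ref}"
    return f"{name} @ {spec}" if name else spec


if __name__ == "__main__":
    # "main" matches the command above; replacing it with a commit hash
    # would freeze the install to that commit.
    print(pinned_git_requirement("https://github.com/owkin/FLamby", "main"))
    # "flamby" as the distribution name is an assumption.
    print(pinned_git_requirement("https://github.com/owkin/FLamby", "main", name="flamby"))
```

The resulting string could be passed to `pip install` or placed in the `pip:` section of a conda environment file; whether that is enough for Fed-BioMed's needs is left open here.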

srcansiz commented 2 years ago

In GitLab by @sharkovsky on Sep 12, 2022, 17:29

As explained in #324, the current workflow requires manually executing a script to download the datasets. I am not sure how we can automate this further.

srcansiz commented 2 years ago

In GitLab by @sharkovsky on Sep 12, 2022, 17:51

It seems that this approach does not fully replace the manual approach of git clone + manually setting the directory name.

I'll investigate whether this can be fixed by improving the setup.py and other Python packaging details (and maybe submitting a pull request), or whether it would require too much effort.
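One possible way to avoid hardcoding the directory name, sketched under the assumption that the package is importable once installed: ask Python where the package lives instead of writing the path by hand. The helper `package_directory` and the module name `flamby` are illustrative assumptions; this only locates an installed package and does not address the dataset download step.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_directory(module_name: str) -> Optional[Path]:
    """Return the directory containing an importable module, or None.

    Uses importlib's finder machinery, so the module is located
    without being imported itself.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    # Built-in and namespace packages have no single file location.
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    return Path(spec.origin).parent


if __name__ == "__main__":
    # "flamby" as the importable module name is an assumption.
    location = package_directory("flamby")
    print(location if location is not None else "flamby is not installed")
```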

srcansiz commented 2 years ago

In GitLab by @sharkovsky on Sep 23, 2022, 15:07

After an upstream fix, we have the following situation:

Hence, we can install the library and perform some basic testing, but in order to actually use the datasets we still require manual input.

srcansiz commented 2 years ago

In GitLab by @sharkovsky on Sep 26, 2022, 15:51

The Flamby datasets have many hidden dependencies. Some have already been added to the conda envs, somewhat arbitrarily, but not all. We should include either all of the dependencies or none of them.
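To make the "hidden dependencies" visible, a small check along the following lines could report which modules a given environment is missing. This is only a sketch: the helper `missing_modules` is hypothetical, and the module names in the usage example are placeholders, since the actual per-dataset dependency list is not given in this thread.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(module_names: Iterable[str]) -> List[str]:
    """Return the names from `module_names` that cannot be found
    in the current environment, preserving input order."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised e.g. when the parent package of "a.b" is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Placeholder names; the real list would come from the Flamby datasets.
    candidates = ["json", "some_dataset_dependency"]
    print("missing:", missing_modules(candidates))
```

Running such a check inside each conda env would show at a glance whether it carries all of the dataset dependencies or only some.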