brainglobe / BrainGlobe

General information, resources and dicussions for the BrainGlobe project.
https://brainglobe.info/
10 stars 1 forks source link

[Feature] Tensorflow will be an optional dependency due to conda-forge issues #27

Closed willGraham01 closed 10 months ago

willGraham01 commented 1 year ago

See issue on conda-forge/cellfinder-core for context.

Is your feature request related to a problem? Please describe. Any package with a hard dependency on tensorflow cannot be installed via conda-forge. This primarily affects the cellfinder-core package, but also it's dependents (cellfinder-napari, cellfinder, and thus the meta-package).

These packages will thus be updated to make tensorflow an optional dependency. pip will be able to install the optional tensorflow features because tensorflow is available from pip, and conda will have to make do with the "base" packages.

Describe the solution you'd like

  1. cellfinder-core needs to be updated with an optional tensorflow dependency. https://github.com/brainglobe/cellfinder-core/pull/186
    • [x] __init__.py will need to check whether the tensorflow dependency is available, and set an internal flag.
    • [x] tensorflow-dependent code will need to check this internal flag before running. From this, we will be able to create a complete list of all cellfinder-core features that depend on the tensorflow install.
    • [x] The package will need to be re-uploaded as a new version to PyPI. Whilst not a breaking change, the install instructions will need to be clearly updated to indicate that tensorflow needs to be explicitly requested.
  2. Packages that are dependent on cellfinder-core will also need to be updated to be compatible with or without tensorflow.
  3. conda-forge recipes for all concerned packages will need to be updated.
    • [ ] cellfinder-core
    • [ ] cellfinder-napari
    • [ ] cellfinder (not on conda-forge yet, and proposed to be removed so maybe un-necessary).
    • [ ] brainglobe-meta (in staging)

Describe alternatives you've considered See issue on conda-forge/cellfinder-core.

Alternative is we keep things as they are on PyPI, and put "broken" packages on conda-forge that require Windows users to manually install tensorflow first.

@adamltyson just to double-check this all looks good before I commit changes to multiple packages across the whole organisation.

adamltyson commented 1 year ago

Yeah sounds good. Presumably the pip metapackage could install cellfinder-core[tensorflow], it would just be the conda metapackage that would require the extra step?

willGraham01 commented 1 year ago

Presumably the pip metapackage could install cellfinder-core[tensorflow], it would just be the conda metapackage that would require the extra step?

Yeah, the plan will be

pip install brainglobe

to fetch cellfinder-core[tensorflow] (& the others with tensorflow included).

Meanwhile, the conda instructions will be

Manually download and install tensorflow in your environment, then run

conda install brainglobe

which will install the packages without fetching tensorflow.

But the __init__.py's will pickup on the manual install provided the correct environment is activated, so users will still be able to get full tool functionality.

adamltyson commented 1 year ago

Sounds good. Thanks @willGraham01!

willGraham01 commented 1 year ago

Having gone through cellfinder-core and drafting step 1 (see this PR); without tensorflow present, cellfinder-core can't perform any of it's primary features and is essentially an empty package / broken tool. If tensorflow isn't there, cellfinder-core (and thus cellfinder-napari and cellfinder) can't actually do anything except throw a "function not implemented due to missing dependency" error or something to that effect.

With this in mind, a potential alternative approach: pip install brainglobe fetches everything just fine, so let's leave the cellfinder source-code alone and work around the conda-forge issue for the time being.

Comparison

(Original) Plan in the issue body (New) Plan described here
pip install brainglobe fetches all brainglobe tools without issue. pip install brainglobe fetches all brainglobe tools without issue.
Source code & unit-test overhaul across 3/4 packages. Note this would all have to be undone if tensorflow did become available on conda in the future. Include guard in meta-package source code. conda recipe dependencies differ from PyPI package dependencies. Only conda recipe would need updating if tensorflow did become available on conda in the future.
conda install ships broken or not-implemented cellfinder tools. A conda install does not fetch cellfinder tools at all.
If tensorflow is manually installed before/after a conda install, cellfinder tools will start working. If tensorflow is manually installed before/after a conda install, cellfinder tools will still be absent - these will also have to be fetched manually or via pip.

Happy to continue with the original plan, however thought I'd air the alternative now that it comes to mind. The WIP changes based on the original plan are here if you want to evaluate for yourself.

adamltyson commented 1 year ago

I'm not sure what the best way to proceed is. I think two things are fairly clear:

Is it possible to do the following:

I think this is the best solution. If a user wants to use a cellfinder tool specifically, they can follow the pip instructions and it will all be fine. For newcomers to brainglobe, they can follow the conda one-liner, and this only gets complicated if, and only if, they want to use cellfinder.

As an aside, the tensorflow dependency is becoming a pain for multiple reasons, I wonder if it's worth moving to torch/jax?

willGraham01 commented 1 year ago

Include these packages (but no tensorflow) in the brainglobe conda package. This way, all the other tools will work, but users will get a warning if they try to use cellfinder-based tools?

The question is whether it's worth providing cellfinder in the conda package at all if we're not including tensorflow. We can provide a warning in for example brainglobe-meta's __init__.py:

try:
    TF_VERSION = version("tensorflow")
    TF_PRESENT = True
except PackageNotFoundError as e:
    TF_PRESENT = False
    # helpful message about tensorflow not being there, so cellfinder is unavailable

if TF_PRESENT:
   # expose cellfinder tools to the user
else:
   # do nothing? Or maybe import some functions that we can alias to play the role of "cellfinder" which just throw warnings and exit when invoked

and this way, we don't need to touch any code in the cellfinder, cellfinder-core, etc packages. And then if users really want cellfinder, they can pip install brainglobe as you say.

Only difference is in the niche case where someone conda installs brainglobe, then manually installs tensorflow. In which case, the new plan still wouldn't let them use cellfinder since conda wouldn't ship it at all. But the old plan of adding guards around all the cellfinder source functions would.

My preference is slightly for the newer plan since all our changes to fix (what I perceive as) a conda problem live either on conda, or in the meta-package. As opposed to all our cellfinder packages needing large changes to accomodate conda, and even then not actually providing the functions they say they would. But I'm persuad-able either way, maybe @alessandrofelder or anyone else who've worked on cellfinder's development has a preference?

adamltyson commented 1 year ago

Only difference is in the niche case where someone conda installs brainglobe, then manually installs tensorflow. In which case, the new plan still wouldn't let them use cellfinder since conda wouldn't ship it at all. But the old plan of adding guards around all the cellfinder source functions would.

Ah I didn't know that. Can we get the conda metapackage to ship a tensorflow-less cellfinder?

I just like the idea of the metapackage shipping as much as possible and reducing the number of steps (and time) for the user to use these tools.

Considering brainrender isn't on conda-forge yet, the conda metapackage is a bit dead in the water currently if it only ships 1/2 of our user facing tools.

willGraham01 commented 1 year ago

Can we get the conda metapackage to ship a tensorflow-less cellfinder?

We can; then the next question is what do we want to allow in the event of no tensorflow.

The latter means we wouldn't have to guard each individual function, but it would mean our conda-forge recipe wouldn't be able to check the import works (since tensorflow would be unavailable so this would always throw an error).

The former would mean the conda-forge recipe can check the import, but requires a lot of code changes across 3 packages (cellfinder, cellfinder-core, cellfinder-napari) that we'd have to undo if we ever moved away from tensorflow or it became available on conda.

From the user's perspective, both approaches are the same: they get an error when they try to use cellfinder without tensorflow (just either at import or at function-call). Both cases also mean that pip install cellfinder would not fetch tensorflow, it would need to be pip install cellfinder[tensorflow] (although pip install brainglobe would hide this wrinkle).

adamltyson commented 1 year ago

I'd vote for the latter.

willGraham01 commented 1 year ago

Will do - will update the PR to be the latter.

Considering brainrender isn't on conda-forge yet, the conda metapackage is a bit dead in the water currently if it only ships 1/2 of our user facing tools.

Also I didn't realise this - I've broken that out into https://github.com/brainglobe/brainglobe-meta/issues/14

willGraham01 commented 10 months ago

Closing as solved in https://github.com/conda-forge/cellfinder-core-feedstock/pull/14