bioimage-io / collection-bioimage-io

(deprecated in favor of bioimage-io/collection) RDF collection for BioImage.IO
5 stars 9 forks source link

need to pin python for pip environments #325

Open FynnBe opened 2 years ago

FynnBe commented 2 years ago

I just tried locally and it works perfectly fine... Which python version are we trying this with? I think it only works for python 3.7 And follow-up question: can I pin the python version in requirements.txt and will this be picked up by our CI?

that is the problem. when I try to install the exact same test environment due to the two step process of installing pip and bioimageio.core and then using pip to install all pip dependencies, python get's resolved to version 3.10...

Originally posted by @FynnBe in https://github.com/bioimage-io/collection-bioimage-io/issues/324#issuecomment-1060731230

cc @constantinpape

uschmidt83 commented 2 years ago

The conda environment needs a Python version, or one should be able to pin it, right?

FynnBe commented 2 years ago

The conda environment needs a Python version, or one should be able to pin it, right?

yes, that is possible. The question is how we infer the correct pin

Is it legal in a pip requirements.txt to specify python (with version)?

uschmidt83 commented 2 years ago

No, I suggest like this: https://stackoverflow.com/a/59047379

channels:
  - conda-forge
dependencies:
  - python=3.7
  - ...
uschmidt83 commented 2 years ago

This is not only for pip, but for the whole conda env, or not?

uschmidt83 commented 2 years ago

This also implies that bioimageio.core has to work with potentially older Python version, such as 3.7. It currently is Python 3.7+ if I understand correctly.

FynnBe commented 2 years ago

This is not only for pip, but for the whole conda env, or not?

Yes, we need to pin the python of the conda environment. However we allow to specify dependencies with pip, as you do in the stardist model: dependencies: pip:requirements.txt From this we then create a minimal conda environment that includes these pip dependencies, such that internally we only have to deal with conda environments and not pip. Therefore I am looking for a way to pin python in the context of a requirements.txt (which does not seem possible).

trying to install python (in a conda env) with pip gives me:

ERROR: Could not find a version that satisfies the requirement python (from versions: none)
ERROR: No matching distribution found for python
FynnBe commented 2 years ago

It currently is Python 3.7+ if I understand correctly.

correct

FynnBe commented 2 years ago

I would like to avoid an additional field: python_version that one needs to specify when using pip dependencies... This field would be confusing for conda enviroments where python should be pinned in the env file

uschmidt83 commented 2 years ago

Therefore I am looking for a way to pin python in the context of a requirements.txt (which does not seem possible).

I forgot that we only write the requirements.txt, and not the entire conda env file.

FynnBe commented 2 years ago

I forgot that we only write the requirements.txt, and not the entire conda env file.

for now and for stardist feel free to switch to conda to avoid this, if that is not a big hassle for you and you'd prefer not to wait for this issue's resolution...

constantinpape commented 2 years ago

I forgot that we only write the requirements.txt, and not the entire conda env file.

Is there a chance to switch to conda in stardist? I don't see a reliable way to specify an environment for an old tf version since pip does not allow pinning the python version, see https://stackoverflow.com/questions/56713744/how-to-create-conda-environment-with-specific-python-version/59047379#59047379

Btw, if you do this you still need to put in tf as a pip requirement since 1.15 was never properly released via conda-forge. (It's such a pain to work with this outdated stuff....) I will create a manual example.

FynnBe commented 2 years ago

https://stackoverflow.com/questions/56713744/how-to-create-conda-environment-with-specific-python-version/59047379#59047379

did you link the correct issue? maybe something like https://stackoverflow.com/questions/11889932/specify-python-version-for-virtualenv-in-requirements-txt ?

If we don't want to completely drop pip support, we could check for a specific comment (comments are allowed: https://pip.pypa.io/en/latest/reference/requirements-file-format/#comments) that tags the python version

FynnBe commented 2 years ago

alternatively we might be able to check the pin for pip in the requirements.txt. At least it does not throw an error to install pip with pip, if there is no version clash:

$ pip install pip
Requirement already satisfied: pip in /home/fynn/conda/envs/6334698/lib/python3.10/site-packages (22.0.4)
$ pip install pip 21
Requirement already satisfied: pip in /home/fynn/conda/envs/6334698/lib/python3.10/site-packages (22.0.4)
ERROR: Could not find a version that satisfies
constantinpape commented 2 years ago

did you link the correct issue? maybe something like https://stackoverflow.com/questions/11889932/specify-python-version-for-virtualenv-in-requirements-txt ?

Yes, this is the exact issue I linked to ;). (Just at some fragment that didn't make much sense.)

If we don't want to completely drop pip support, we could check for a specific comment (comments are allowed: https://pip.pypa.io/en/latest/reference/requirements-file-format/#comments) that tags the python version

Yea, we can think about this. My initial thought is that we should drop pip support fully, but we should discuss this with others before moving ahead with any of this. For now it would be easiest to move ahead with conda for stardist models. I have created this manually for our test upload now.

FynnBe commented 2 years ago

@oeway @esgomezm @tomburke-rse @carlosuc3m quick question: Can we drop support for pip as a dependency manager as it does not allow to pin the python version (in a straight-forward fashion)? Note that it is always possible to define a conda environment with a pip section (which is how we tried to deal with pip's requirement.txt files, but this should better be done manually, see above discussion).

uschmidt83 commented 2 years ago

for now and for stardist feel free to switch to conda to avoid this, if that is not a big hassle for you and you'd prefer not to wait for this issue's resolution...

I first interpreted this that we should put stardist on conda-forge, which is not easy and I don't really want to do it. However, napari wants all plugins to be on conda-forge and they'll try to do it (see PR).

Upon reading the rest of the posts here, I think you meant that we write a conda environment file (instead of pip's requirements.txt) during the bioimage.io export in StarDist, which seems easy enough to do. Did you suggest this?

constantinpape commented 2 years ago

Upon reading the rest of the posts here, I think you meant that we write a conda environment file (instead of pip's requirements.txt) during the bioimage.io export in StarDist, which seems easy enough to do. Did you suggest this?

Yes, exactly that's what I meant. Sorry I was not quite precise here; of course putting StarDist on conda-forge is not necessary for our purposes since we can just specify pip dependencies in the conda env file. I have prepared an example for this already, and will send you the link as soon as we have made sure that it actually works.

Btw having StarDist on conda-forge would be great in general :).

uschmidt83 commented 2 years ago

I have prepared an example for this already, and will send you the link as soon as we have made sure that it actually works.

I'd also be happy with pointing me to the code/docs where I can read about this.

uschmidt83 commented 2 years ago

Is a conda environment.yaml file like this sufficient?

dependencies:
- python>=3.7,<3.8
- pip
- pip:
  - tensorflow>=1.15,<1.16
  - csbdeep>=0.6.3
  - scikit-image
  - numba
  - imageio
  - bioimageio.core>=0.5.0
  - importlib-metadata
constantinpape commented 2 years ago

Is a conda environment.yaml file like this sufficient?

Yes, that's sufficient in terms of dependencies; should we also add stardist as a dependency though in order to run the postprocessing? (It's not strictly needed to pass the tests, but I would still add it)

Also, you need name and channels (should only contain conda-forge) as well, but I assume you just left that out for brevity.

uschmidt83 commented 2 years ago

Yes, that's sufficient in terms of dependencies; should we also add stardist as a dependency though in order to run the postprocessing? (It's not strictly needed to pass the tests, but I would still add it)

I'm currently confused what this is environment is meant to be used for?

Also, you need name and channels (should only contain conda-forge) as well, but I assume you just left that out for brevity.

Which name should I use? Something fixed or random?

It doesn't need conda-forge as channel, I was thinking of just using channels: [defaults]. Opinions?

FynnBe commented 2 years ago

I'm currently confused what this is environment is meant to be used for?

for running the test (test_inputs produce test_outputs) and for documentation (this is why @constantinpape suggests to add the stardist dependency I assume...

Which name should I use? Something fixed or random?

completely up to you, for testing purposes we overwrite this field anyway.

It doesn't need conda-forge as channel, I was thinking of just using channels: [defaults]. Opinions?

as bioimageio.core is on conda-forge, we add conda-forge if not present to the channels. So maybe add it already so it changes less? But essentially this should not make a significant difference either way

constantinpape commented 2 years ago

It doesn't need conda-forge as channel, I was thinking of just using channels: [defaults]. Opinions?

In general conda-forge is preferable to defaults since it is usually more up-to-date and does not suffer from the commercialization conda issues. (But in this case it does not matter, so only default would be fine.)

uschmidt83 commented 2 years ago

Yes, that's sufficient in terms of dependencies; should we also add stardist as a dependency though in order to run the postprocessing? (It's not strictly needed to pass the tests, but I would still add it)

for running the test (test_inputs produce test_outputs) and for documentation (this is why @constantinpape suggests to add the stardist dependency I assume...

In order to run the model on your end, it should only need the tensorflow dependency (I think). All the other dependencies are of StarDist itself and not really relevant. I'm a bit confused what the semantics of that environment are. What do you mean by "documentation"?

It doesn't need conda-forge as channel, I was thinking of just using channels: [defaults]. Opinions?

as bioimageio.core is on conda-forge, we add conda-forge if not present to the channels. So maybe add it already so it changes less? But essentially this should not make a significant difference either way

But how do you know which version of bioimageio.core is compatible for the model spec? I guess you assume that the CI always runs the latest version and is thus always able to understand the given model spec?

FynnBe commented 2 years ago

well, if a user wants to use this model beyond the current bioimageio model zoo test, i.e. with the stardist postprocessing these dependencies would be needed. But in the context of bioimagieo we do not (yet) support this postprocessing. Maybe in the documentation or description content the postprocessing is referenced...? Or just for us in case we'll work on custom postproessing for this model... Keeping it minimal is a very justified approach as well, though. Currently only the tests in this repository and advanced users will read this file. None of the consumer softwares does afaik for now.

constantinpape commented 2 years ago

In order to run the model on your end, it should only need the tensorflow dependency (I think). All the other dependencies are of StarDist itself and not really relevant. I'm a bit confused what the semantics of that environment are. What do you mean by "documentation"?

We can also keep it minimal and only include tensorflow.

I suggested to add StarDist since you had CSBDeep in there already and adding a requirement file that sets up the full stardist compatible env would be a legit use of it. There is no 100% defined semantic to the environment file, only that it SHOULD list all requirements for the model, which MUST include the strict requirements for running the model and MAY include other requirements for post-processing operations that are currently not expressed in the model specification yet (like applying the stardist post-processing to obtain the instance segmentation).

But how do you know which version of bioimageio.core is compatible for the model spec? I guess you assume that the CI always runs the latest version and is thus always able to understand the given model spec?

The latest version should be able to load all spec versions, so for almost all use cases this does not need to be restricted and the CI will use the latest one. Anyway, I don't think you need to worry about this; you can use the defaults channel or conda-forge, it does not make any practical difference here. (I personally just never use defaults because it's pretty useless for more complex environments, so I just put conda-forge everywhere by default.)

uschmidt83 commented 2 years ago

I suggested to add StarDist since you had CSBDeep in there already and adding a requirement file that sets up the full stardist compatible env would be a legit use of it.

That goes back to code that Martin originally wrote, before we even used bioimageio.core.

There is no 100% defined semantic to the environment file, only that it SHOULD list all requirements for the model, which MUST include the strict requirements for running the model and MAY include other requirements...

Ok, I'll simply list stardist>=X.X.X as a dependency, which should be sufficient then.

The latest version should be able to load all spec versions, so for almost all use cases this does not need to be restricted and the CI will use the latest one.

Ok, but it can lead to bugs if a user were to use the environment file. I can do this automatically as long as the versions of bioimageio.core on pip and conda-forge are the same (which I assume they are).

constantinpape commented 2 years ago

bioimageio.core on pip and conda-forge are the same (which I assume they are).

Yes, they are the same since conda-forge is watching the pip release and updated based on this.

uschmidt83 commented 2 years ago

Ok, does this look good?

channels: [defaults, conda-forge]
dependencies:
- python>=3.7,<3.8
- bioimageio.core>=0.5.0
- pip
- pip: [stardist>=0.7.3, 'tensorflow>=1.15,<2']
name: stardist
uschmidt83 commented 2 years ago

https://github.com/stardist/stardist/commit/825c41b0c191bd728deca8cabca7ab3884b02100

constantinpape commented 2 years ago

Yes, this looks good!

esgomezm commented 2 years ago

@oeway @esgomezm @tomburke-rse @carlosuc3m quick question: Can we drop support for pip as a dependency manager as it does not allow to pin the python version (in a straight-forward fashion)? Note that it is always possible to define a conda environment with a pip section (which is how we tried to deal with pip's requirement.txt files, but this should better be done manually, see above discussion).

I'd rather keep supporting it for the collab notebooks as it makes everything easier. Installing conda environments in google collab is not impossible but I'm afraid of not getting a smooth integration in the ZCDL4M notebooks.

FynnBe commented 2 years ago

I'd rather keep supporting it for the collab notebooks as it makes everything easier. Installing conda environments in google collab is not impossible but I'm afraid of not getting a smooth integration in the ZCDL4M notebooks.

ok. How do you solve the python version problem there?

esgomezm commented 2 years ago

It goes with the python version already installed in the collab session...

FynnBe commented 2 years ago

right. of course! well, we'll still need to find a practical way to determine the python version in the BMZ context...