bioimage-io / spec-bioimage-io

Specification for the bioimage.io model description file.
https://bioimage-io.github.io/spec-bioimage-io/
MIT License

Notebook dependencies #535

Open FynnBe opened 1 year ago

FynnBe commented 1 year ago

As discussed in today's WP4 meeting, we see a need to include a dependencies field in the notebook spec.

Additional fields referencing a Dockerfile and an image might also be needed or useful.

cc @bioimage-io/spec-dev

esgomezm commented 1 year ago

Hi there,

We just launched a platform to containerise notebooks such as the ones from ZeroCostDL4Mic: https://github.com/HenriquesLab/DL4MicEverywhere

We decided to use a YAML format to make any future conversion with the zoo easier. As with the models, we have a description of the minimum dependencies needed to build a working Docker image. These are the fields we have, which would be useful, at least for us, to have in the specs.

# Already in the spec:
notebook_url: https://raw.githubusercontent.com/HenriquesLab/ZeroCostDL4Mic/master/Colab_notebooks/U-Net_2D_Multilabel_ZeroCostDL4Mic.ipynb
description: "U-Net_2D_Multilabel_DL4Mic is the conversion of the 2D U-Net Multilabel from ZeroCostDL4Mic."

# New
version: 0.0.1 #version of the container image for this specific notebook to work with these particular requirements
requirements_url: https://raw.githubusercontent.com/HenriquesLab/ZeroCostDL4Mic/master/requirements_files/2D_UNet_multilabel_requirements_simple.txt
cuda_version: 11.8.0
ubuntu_version: 22.04
python_version: 3.10

# Specific to DL4MicEverywhere
sections_to_remove: 1.1. 1.2. 2. 6.2. 

How easy or difficult do you feel it is to work on this? @FynnBe @oeway @IvanHCenalmor @mariana-gferreira

oeway commented 1 year ago

> How easy or difficult do you feel it is to work on this? @FynnBe @oeway @IvanHCenalmor @mariana-gferreira

I think you can just go ahead and add these fields; for now, adding any field is valid for a type: application in the bioimageio.spec. Our main focus for the bioimageio.spec is currently on the models. ZeroCost notebooks are just one specific type of consumer.

It might be better if you create separate spec validators for zerocost notebooks, and run them for all the notebooks in the zerocost repo.
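A minimal sketch of what such a separate validator could look like, assuming the four dependency fields proposed above. The field names come from this thread; the class name, the function name and every validation rule are hypothetical and not part of bioimageio.spec or DL4MicEverywhere. One rule is worth noting: an unquoted YAML value such as python_version: 3.10 is loaded as the float 3.1, so the sketch only accepts version strings.

```python
import re
from dataclasses import dataclass

# Hypothetical validator for the dependency fields proposed in this thread.
# The rules below are assumptions, not an existing bioimageio.spec schema.

# Accepts dotted numeric versions such as "3.10", "22.04" or "11.8.0".
VERSION_RE = re.compile(r"^\d+(\.\d+){1,2}$")

FIELD_NAMES = ("requirements_url", "cuda_version", "ubuntu_version", "python_version")


@dataclass(frozen=True)
class NotebookDependencies:
    requirements_url: str
    cuda_version: str
    ubuntu_version: str
    python_version: str


def validate_dependencies(config: dict) -> NotebookDependencies:
    """Return the validated fields or raise ValueError listing every problem."""
    errors = []
    values = {}
    for name in FIELD_NAMES:
        value = config.get(name)
        if value is None:
            errors.append(f"{name}: missing")
        elif not isinstance(value, str):
            # Unquoted YAML numbers arrive as floats (3.10 becomes 3.1),
            # which silently changes the version, so reject them.
            errors.append(f"{name}: expected a quoted string, got {type(value).__name__}")
        elif name == "requirements_url":
            if not value.startswith(("http://", "https://")):
                errors.append(f"{name}: expected an http(s) URL")
        elif not VERSION_RE.match(value):
            errors.append(f"{name}: {value!r} is not a dotted numeric version")
        values[name] = value
    if errors:
        raise ValueError("; ".join(errors))
    return NotebookDependencies(**values)
```

Run over every notebook config in a repository, a check like this would flag missing or mistyped fields before any image is built.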

esgomezm commented 1 year ago

Hi Wei,

Yes, yes I know I can add it, this was an effort to start defining something that we can all use. There is no rush if you think it won't happen in the near future.

> It might be better if you create separate spec validators for zerocost notebooks, and run them for all the notebooks in the zerocost repo.

We have it already in the new repository ;)

oeway commented 1 year ago

> Yes, yes I know I can add it, this was an effort to start defining something that we can all use. There is no rush if you think it won't happen in the near future.

Ok, got it!

Maybe I missed the discussion, do you mean to provide some extra annotation for users who would like to run notebooks (which are not zerocost notebooks)?

Do you have a specific use case besides zerocost notebooks where these fields could be useful?

The closest thing I can think of is Binder, which uses repo2docker, which supports a set of configuration files (https://repo2docker.readthedocs.io/en/latest/config_files.html) that are similar to the fields you provided.

P.S. The sections_to_remove sounds a bit too specific for zerocost notebooks, why not just remove them ;)

esgomezm commented 1 year ago

Ok, sorry. We discussed this in the WP4 meeting and I forgot to add some context.

So, one of the DL4MicEverywhere features is to containerise ZeroCostDL4Mic or any notebook given those specs. It's meant to provide what we promise in AI4Life WP4. You can find the FRUNet with this functionality here. I'm not sure if there are other notebooks in the zoo but happy to test.

> The closest thing I can think of is Binder, which uses repo2docker, which supports a set of configuration files (https://repo2docker.readthedocs.io/en/latest/config_files.html) that are similar to the fields you provided.

Yes, but these would still need to be integrated into the bioimageio specs. That's why I suggested trying to find a way to converge. In this case, rather than repos, the idea is to containerise notebooks with their respective models.

> P.S. The sections_to_remove sounds a bit too specific for zerocost notebooks, why not just remove them ;)

Yes exactly, so this is what we need to expand ZeroCost. It's a specific feature that could be in the very specific fields of the tool.

FynnBe commented 1 year ago

> # New
>
> version: 0.0.1 #version of the container image for this specific notebook to work with these particular requirements

side note: semver 2.0 encourages starting with 0.1.0: https://semver.org/#how-should-i-deal-with-revisions-in-the-0yz-initial-development-phase
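To illustrate the side note, here is a small sketch that parses the MAJOR.MINOR.PATCH core of a version and compares 0.0.1 with 0.1.0. The function names are hypothetical, and the pattern is a simplification that ignores the pre-release and build metadata parts of SemVer 2.0.0.

```python
import re

# Simplified SemVer core pattern: three numeric parts without leading zeros.
# Pre-release ("-alpha") and build ("+001") suffixes are deliberately not handled.
SEMVER_CORE = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")


def parse_core_version(text: str) -> tuple:
    """Return (major, minor, patch) as integers or raise ValueError."""
    match = SEMVER_CORE.match(text)
    if match is None:
        raise ValueError(f"{text!r} is not a MAJOR.MINOR.PATCH version")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch))


def is_initial_development(text: str) -> bool:
    """Major version zero marks the initial development phase in SemVer."""
    return parse_core_version(text)[0] == 0


# Tuples compare element by element, which matches SemVer precedence
# for versions without pre-release tags.
assert parse_core_version("0.1.0") > parse_core_version("0.0.1")
assert is_initial_development("0.1.0")
```

Both 0.0.1 and 0.1.0 are valid; starting at 0.1.0 simply leaves the minor number free to be incremented for each subsequent release.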

> requirements_url: https://raw.githubusercontent.com/HenriquesLab/ZeroCostDL4Mic/master/requirements_files/2D_UNet_multilabel_requirements_simple.txt
> cuda_version: 11.8.0
> ubuntu_version: 22.04
> python_version: 3.10

have you considered dependency management with conda?
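As a rough sketch of what that could mean for the fields above, the snippet below renders a conda environment file from a Python version and a pip requirements file. The function name, the default environment name and the choice of the conda-forge channel are assumptions for illustration; reusing the existing requirements file through a -r line in the pip section is also an assumption that should be checked against the conda version in use.

```python
def render_conda_environment(
    python_version: str,
    requirements_file: str,
    env_name: str = "notebook-env",  # hypothetical default name
) -> str:
    """Return the text of an environment.yml that pins Python and defers
    the remaining packages to an existing pip requirements file."""
    lines = [
        f"name: {env_name}",
        "channels:",
        "  - conda-forge",
        "dependencies:",
        # Passed as a string, so "3.10" is not shortened to 3.1.
        f"  - python={python_version}",
        "  - pip",
        "  - pip:",
        f"      - -r {requirements_file}",
    ]
    return "\n".join(lines) + "\n"


print(render_conda_environment("3.10", "requirements.txt"))
```

The resulting file could then be given to conda, mamba or micromamba when building the image, so the Python version and the pip packages are resolved in one environment.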

esgomezm commented 1 year ago

> side note: semver 2.0 encourages starting with 0.1.0: https://semver.org/#how-should-i-deal-with-revisions-in-the-0yz-initial-development-phase

Thank you!!

> have you considered dependency management with conda?

We can take a look at it, but I should say that it failed for us many times on Mac. conda-forge is getting better, but sometimes we still do not get all the dependencies needed. Does it make a difference once you have managed to build the Docker image with all the dependencies? Is it more "stable"?

FynnBe commented 1 year ago

> We can take a look at it, but I should say that it failed for us many times on Mac. conda-forge is getting better, but sometimes we still do not get all the dependencies needed. Does it make a difference once you have managed to build the Docker image with all the dependencies? Is it more "stable"?

Just checked with @k-dominik, who reports that conda/mamba and Mac work well with each other. But if you run it in Docker you wouldn't have to care about that anyway, as your image could be Linux-based...

FynnBe commented 1 year ago

I can recommend https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html in general for conda package management

ctr26 commented 1 year ago

I usually do

FROM mambaorg/micromamba

Or repo2docker to blend conda and containers

IvanHCenalmor commented 1 year ago

I created this branch https://github.com/HenriquesLab/DL4MicEverywhere/tree/Pass-to-conda to try using conda/mamba environments inside the Docker image, because it is actually a very good idea. I will take a look at 'FROM mambaorg/micromamba'; the problem there is that we lose control over which OS it is installed on, no? I have also quickly checked 'repo2docker' and it is a super cool tool, but I think that, as we are not working with Git repositories, it will not work, no?

ctr26 commented 1 year ago

You can use repo2docker to just build images locally without needing to use GitHub