JaneliaSciComp / SongExplorer

deep learning for acoustic signals
BSD 3-Clause "New" or "Revised" License
22 stars 5 forks source link

conda recipe discussion #13

Closed stuarteberg closed 8 months ago

stuarteberg commented 2 years ago

These comments are written with the presumption that you'll be contributing your recipe to conda-forge.


post-link.sh

Installing pip packages via a post-link.sh script is a bit strange. I think most users would not expect that to happen. I also doubt it would be approved in a conda-forge feedstock.

As a compromise, perhaps you could merely recommend to your user that they run pip install for the extra speedup? See here.


meta.yaml

source

Just FYI, when it's time to submit a recipe to conda-forge, you'll need to replace git_url with a tarball url.

requirements:run:

At least one of the packages listed in meta.yaml is not available on conda-forge as far as I can tell: tensorflow-deps. Is it really necessary? This comment suggests that you don't need that dependency.

Same goes for cuda-nvcc and perhaps others. Can they be omitted, and possibly recommended to the users as optional dependencies (either in the docs or in post-link.sh as described above)?

test

You'll definitely need a test section before conda-forge will accept the recipe.


build.sh

The install script for this package is rather non-standard. That's owing to the fact that this project is not structured in the standard Python way. From one perspective, that's irrelevant to conda-forge -- they are typically concerned with installing things according to the upstream package's expectations. Still, in the interest of long-term maintainability, it might be worth making SongExplorer look more like most other Python tools.

ngam commented 2 years ago

Specifically for tensorflow, you really only need "tensorflow" and we do everything behind the scene to get you a working tensorflow, including all cuda-relevant deps. Tensorflow-deps is an Apple specific thing (targeting Metal GPUs) and it is already 100% accounted for in tensorflow proper anyway. We just don't have Metal support because it is closed source still.

TLDR regarding cuda: The only one little thing we are missing is ptxas. Everything is available through cudatoolkit. And our cuda-enabled tensorflow builds are really good --- better than the official copies and better than even the NVIDIA NGC optimized builds (surprisingly!)

TLDR regarding test: a test is good, but the bare minimum is that a python-centric package should pass pip check and import x where x are the various things that can be imported from the package (e.g. import tensorflow for the tensorflow package)

ngam commented 2 years ago

I have looked at your recipe here: https://github.com/JaneliaSciComp/SongExplorer/tree/master/install/conda/songexplorer.

Some feedback:

  1. Please do not worry about the differentiation between the tensorflow-macos and all that --- I would highly encourage you to tell your users who want to use tensorflow-macos to just get everything from pypi (including this songgexplorer package --- are you planning to make songexplorer available on pypi?). This will also mean you could delete the post-link files.
  2. I am not sure about the symlinking of executables in build.sh: This looks like a Python package, right? But it not actually a python package yet. It is just a collection of independent files. Why don't you make it a proper python package and make it available on pypi? All you need is a pyproject.toml file to specify the build backend and then organize the files into modules that can then depend on each other. Then you can use entry points and other exotic python solutions instead of calling the files like that. However, you can still do it your current way if you want... see point 3 below
  3. If you really want to just publish executables like you have now, then copying them to $PREFIX/bin is enough.

There are other things, but these are the main obvious point for now. If you could explain to me your objectives here, I can probably help you more. We usually submit recipes to conda-forge/staged-recipes repo.

ngam commented 2 years ago

I am not sure if you know @mkitti --- we collaborate together on the julia feedstock :D

This is about the tensorflow feedstock where I am also actively involved. I will be happy to help get this up running on conda-forge soon --- @mkitti not sure how big/small your org and how many people know each other but looks like it is a party 😜

mkitti commented 2 years ago

What a small world. In normal times, Stuart's office would be within line of sight of my cubicle, but alas we are not quite in normal times.

Coincidentally, @bjarthur, to whom Stuart's comment is directed, is the other Julia one user within our department. Ben has been using my Mac Studio to develop this conda recipe. I've been giving him some advice on developing this, but I decided to wait to push him towards conda-forge until he put together a "working" conda recipe. My gambit seems to have paid off since we've attracted the attention of Stuart, who is our resident conda-forge expert.

ngam commented 2 years ago

Amazing!

For this type of work, I tend to think of a pypi package that we can then repurpose as a conda-forge recipe, but we can definitely jump to a conda-forge recipe directly :)

I will recommend an alternative container implementation for your consideration, but I have no concrete idea about the use cases and how important the GUI aspect is (I am not that knowledgeable on the GUI front)

Try this alternative Dockerfile instead of yours (I didn't test it, but it is an effective strategy, see e.g. https://uwekorn.com/2021/03/03/deploying-conda-environments-in-docker-cheatsheet.html)

FROM condaforge/mambaforge as mamba

COPY . /opt/songexplorer

# to pin your deps, use conda-lock to produce a lockfile
RUN mamba create -p /opt/env --copy --file /locks/conda-linux-64.lock && mamba clean -aqy

# or you can just follow your implementation:
# RUN mamba install boa  # boa is the mamba builder, depends on conda-build anyway
# RUN mamba build /opt/songexplorer/containers/conda/songexplorer -c conda-forge -c apple -c nvidia
# RUN mamba create -p /opt/env --copy -y --use-local songexplorer -c conda-forge -c apple -c nvidia

FROM ubuntu:22.04

COPY --from=mamba /opt/env /opt/env

ENV PATH=/opt/env/bin:$PATH 
ngam commented 2 years ago

Mac Studio

Now the whole discussion around tensorflow-macos vis-a-vis tensorflow makes sense! https://github.com/conda-forge/tensorflow-feedstock/issues/259

bjarthur commented 1 year ago

@ngam can you recommend someone to contact for help with getting a rust application into conda-forge? i'm out of ideas for getting this staged recipe to work on windows and not sure who to ping. thanks!

ngam commented 1 year ago

Hi @bjarthur, I responded in the staged recipes repo: https://github.com/conda-forge/staged-recipes/pull/22454#issuecomment-1546513403

mkitti commented 1 year ago

Have you tried the conda-forge gitter.im?

bjarthur commented 1 year ago

@ngam need your help again getting a staged recipe reviewed. could you ping someone for me again please? it's been ready for three weeks. thanks.

mkitti commented 1 year ago

Try https://app.element.io/#/room/#conda-forge:matrix.org

bjarthur commented 8 months ago

hosting it here instead: https://anaconda.org/janelia/songexplorer