Closed eywalker closed 5 years ago
It seems like conda should enable pip. What do other libraries (e.g. tensorflow) do for conda?
conda is a completely separate package manager from pip, just as yum differs from apt. Most other major packages provide a conda version as well.
On Feb 25, 2017, at 5:57 PM, Dimitri Yatsenko notifications@github.com wrote:
It seems like conda should enable pip. What do other libraries (e.g. tensorflow) do for conda?
Yes, I have just read up on conda -- I have only used it as part of Anaconda. Let's proceed then.
With #333, we are going to have an explicit dependency on graphviz. While we can talk about graceful degradation of features to support a "lightweight" installation of DataJoint (e.g. in a CLI-only environment), it would make a lot of sense to move forward with conda packaging and make that the recommended installation strategy for users, as we can add graphviz as an explicit dependency.
Even if we recommend conda, we should provide conda-less instructions for installation. I have instructions for Ubuntu but not for MacOS or Windows. How should we go about that?
I like the installation instructions for jupyter http://jupyter.readthedocs.io/en/latest/install.html outlining both the conda-based and the pip-based installation.
indeed that's pretty nice - should definitely add this to Docs and Tutorial.
We need to work with anaconda to include datajoint in their stack
I read into it a little bit and it seems to be straightforward to package DataJoint for conda.
The first thing we will need to do is create the appropriate recipe file, as documented in the official conda docs.
Then we will have to submit it to conda-forge. They have a staging repository on github and accept pull requests. More details can be found in their documentation.
Lastly, and this is optional, we can try and move the recipe to anaconda's default channel. A mirror is maintained on github and they accept pull requests.
This will however only keep users from having to add the conda-forge channel to their installation (with conda config --add channels conda-forge) before being able to install DataJoint.
I will start some work on this in the next few days.
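For readers unfamiliar with conda recipes, the first step above can be sketched roughly as follows. This is a minimal illustration in Python that renders the text of a meta.yaml; the helper name render_recipe, the placeholder version and checksum, the PyPI source URL layout, and the partial dependency list are all assumptions for illustration, not the recipe that was actually submitted.

```python
def render_recipe(name, version, sha256, run_deps):
    """Return the text of a minimal, hypothetical conda recipe (meta.yaml)."""
    # Source tarball URL following the usual PyPI sdist layout (an assumption).
    url = f"https://pypi.io/packages/source/{name[0]}/{name}/{name}-{version}.tar.gz"
    lines = [
        "package:",
        f"  name: {name}",
        f'  version: "{version}"',
        "",
        "source:",
        f"  url: {url}",
        f"  sha256: {sha256}",
        "",
        "build:",
        "  number: 0",
        "  noarch: python",  # pure-Python package, no compiled code
        '  script: "{{ PYTHON }} -m pip install . -vv"',
        "",
        "requirements:",
        "  host:",
        "    - python",
        "    - pip",
        "  run:",
        "    - python",
    ]
    lines += [f"    - {dep}" for dep in run_deps]
    return "\n".join(lines) + "\n"


# Placeholder values only: the version, checksum, and dependency list are illustrative.
recipe = render_recipe("datajoint", "0.0.0", "<sha256 of the sdist>", ["numpy", "pymysql"])
print(recipe)
```

A real recipe would also carry an about section (home page, license, summary) and a test section, which are omitted here for brevity.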
Hey @FlorianFranzen, I'm already >70% done with packaging DJ for conda, so thanks, but don't worry about getting into the packaging. The actual process ended up requiring additional work because some of DataJoint's dependencies are not straightforwardly available in conda (yet). I don't think it's necessary that we push it to conda-forge, as we already have the channel vathes under Anaconda Cloud. I guess this depends on how common it is for people to have the conda-forge channel included in their conda environment.
I do agree that we should try to get DataJoint into Anaconda's default channel once packaging is complete.
Ok, I have now packaged DataJoint (and its necessary dependency pygraphviz) under conda in channel vathes. One can install datajoint via conda install -c vathes datajoint on Python 3.5 or 3.6 with Linux/MacOSX. Unfortunately I haven't had a chance to compile pygraphviz under Windows, hence the lack of availability there.
I now think that @FlorianFranzen had a very good point about conda-forge, as they can perform the compilation on multiple platforms automatically. Given that DataJoint itself does not have any C code, I would have preferred not to add another factor, but if we could actually add a pygraphviz recipe under conda-forge then I think it's completely worth placing datajoint under conda-forge as well. It's just that I'm quite unsure whether we can provide a recipe for another OSS project that we do not maintain. I think the exact same statement applies to the official anaconda channel - we will need to somehow provide a pygraphviz conda package.
I think it should be enough to inform the pygraphviz maintainers about your plans, so they can add themselves to the maintainer list in the recipe if they want to cooperate in maintaining it.
There is already an issue asking for an addition to conda-forge, so just announcing it there should be enough.
Please review #353
Where do we stand with conda packaging?
haven't gotten to it - will take a crack at it this week.
updates?
I'd like this!
I have confirmed that I can pull in all the needed requirements for DataJoint for conda, except for the minio Python SDK. I will open an issue at minio/minio-py on this.
$ pip show datajoint
Name: datajoint
Version: 0.11.2
[...]
Requires: numpy, pyparsing, pydot, networkx, pymysql, pandas, tqdm, minio, ipython
# succeeds:
$ conda install -c defaults -c conda-forge numpy pyparsing pydot networkx 'pymysql>=0.7.2' pandas tqdm ipython # minio not available
FWIW: pygraphviz is now available from several channels, so in the event that you want to pull that dependency in by default, you can now do it (actually it is already pulled in with networkx by default).
I have quite a bit of experience with conda packaging, so happy to answer questions or do a code review on any recipe (meta.yaml) files.
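The dependency check above can be scripted roughly as follows. This is a minimal sketch that parses the Requires: line of the pip show output quoted in the thread and assembles the matching conda install command; the helper names parse_requires and conda_install_command are hypothetical, and version pins such as 'pymysql>=0.7.2' are not carried over because pip show does not report them.

```python
def parse_requires(pip_show_output):
    """Extract the dependency names from the 'Requires:' line of `pip show`."""
    for line in pip_show_output.splitlines():
        if line.startswith("Requires:"):
            value = line.split(":", 1)[1]
            return [name.strip() for name in value.split(",") if name.strip()]
    return []


def conda_install_command(deps, channels, unavailable=()):
    """Build a conda install argument list, skipping packages known to be missing."""
    cmd = ["conda", "install"]
    for channel in channels:
        cmd += ["-c", channel]
    cmd += [dep for dep in deps if dep not in unavailable]
    return cmd


# Output copied from the thread, with the elided fields left out.
sample = """Name: datajoint
Version: 0.11.2
Requires: numpy, pyparsing, pydot, networkx, pymysql, pandas, tqdm, minio, ipython
"""
deps = parse_requires(sample)
cmd = conda_install_command(deps, ["defaults", "conda-forge"], unavailable={"minio"})
print(" ".join(cmd))
```

The sketch only builds the command line; running it (for example with subprocess.run(cmd)) is what would actually confirm that each package resolves on the chosen channels.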
I have started the process of adding minio to conda-forge (https://github.com/conda-forge/staged-recipes/pull/8517). Once it's accepted I/we can do the same for DataJoint with minimal effort.
OK, that was fast! The minio feedstock was approved and 'minio' is now available in conda-forge.
I have proposed a conda-forge package for DataJoint, and included @dimitri-yatsenko as a 'maintainer'. This means that Dimitri will be able to modify the recipe (including adding other maintainers, bumping the version, updating requirements, etc.)
Assuming this works, the big remaining item will be to integrate conda packaging into datajoint's release process. In general, it looks like this: 1) Publish a new version of your pip package on PyPI 2) Manually update the conda recipe (meta.yaml file) in a PR at https://github.com/conda-forge/datajoint-feedstock (<-doesn't exist until the package is accepted):
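Step 2 usually comes down to changing the version string and the source checksum in meta.yaml and resetting the build number. A rough sketch of that edit follows, assuming the recipe uses the common conda-forge layout with a {% set version = "..." %} line, a sha256: field, and a build number: field; the helper names and the sample recipe text are hypothetical, and the actual feedstock recipe may differ.

```python
import hashlib
import re


def sdist_sha256(data):
    """Return the hex SHA-256 digest of the sdist bytes."""
    return hashlib.sha256(data).hexdigest()


def bump_recipe(recipe_text, new_version, new_sha256):
    """Rewrite the version and sha256 fields and reset the build number to 0."""
    text = re.sub(
        r'(\{%\s*set version\s*=\s*")[^"]*(")',
        lambda m: m.group(1) + new_version + m.group(2),
        recipe_text,
        count=1,
    )
    text = re.sub(r"(sha256:\s*)\S+", lambda m: m.group(1) + new_sha256, text, count=1)
    text = re.sub(r"(number:\s*)\d+", lambda m: m.group(1) + "0", text, count=1)
    return text


# Hypothetical recipe fragment; the old version, checksum, and bytes are placeholders.
old_recipe = """{% set version = "0.11.1" %}
source:
  sha256: 0000
build:
  number: 2
"""
digest = sdist_sha256(b"placeholder sdist bytes")
new_recipe = bump_recipe(old_recipe, "0.11.2", digest)
print(new_recipe)
```

In practice the digest would be computed from the sdist downloaded from PyPI for the new release, and the change would be submitted as a PR to the feedstock repository.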
Would you please make @guzman-raphael the maintainer instead?
OK, I think the wheels are all in motion over at conda-forge. ~If anyone wants to try this out before conda-forge completes their review and CI, you can try it out right now using a version of the DataJoint package on my personal channel (https://anaconda.org/tjd2002/datajoint)~ [Removed]
OK, datajoint is now available on conda-forge 🎉 , so after updating docs, I think this can be closed.
conda create --name dj --channel conda-forge datajoint
conda activate dj
python -c 'import datajoint'
If you decide to stick with shipping conda packages via conda-forge, I would suggest deleting the old packages from https://anaconda.org/vathes (to avoid confusion).
@guzman-raphael, can I have your permission to add you as a maintainer on the minio feedstock as well? I will try to keep it updated, but this way you can also push releases there if datajoint needs them.
@tjd2002, sorry, just seeing this now. Yes, you can add me as a maintainer to the minio feedstock too. Thanks again! We are in the process of making a new release soon and will definitely try to include this in our process.
@tjd2002 one more thing. I am new to conda-forge, but it looks as if there is a bot that automatically creates a PR based on new updates to our PyPI module. Do you know if we can include --pre releases in this process as well, or do we need another feedstock repo? I also noticed that the auto-generated PR was merged on our behalf by a mariusvniekerk. Do you know who this is? If we are to use this for a production process, we need to be able to restrict how these are released. Currently, I do not seem to have the appropriate privileges to see who can merge PRs.
As you all may know, Anaconda is a very popular Python packaging environment that aims to simplify the process of getting started using Python for scientific computations. Anaconda users use the specialized package management tool conda to discover and install new scientific Python packages. Unfortunately, making a package available on PyPI, which is where pip pulls packages from, does not make it available for conda. It would make sense for us to provide DataJoint packaging for Anaconda.