gdcc / pyDataverse

Python module for Dataverse Software (dataverse.org).
http://pydataverse.readthedocs.io/
MIT License

pyDataverse conda package #159

Closed PennyHow closed 1 month ago

PennyHow commented 1 year ago

I would like to request that pyDataverse be made available as a conda package, in addition to the existing pip package. We use pyDataverse in our own packages and would like to distribute these as conda packages, but this requires all dependencies to have conda distributions.

I have managed to make a conda-forge version of the most recent pyDataverse version, 0.3.1, using grayskull, which generates a build recipe from the PyPI package (this is similar to these instructions on how to make conda builds from PyPI packages).

$ grayskull pypi --strict-conda-forge pydataverse

The conda package can then be built from the generated meta.yaml file.

$ conda-build pydataverse

I tested this local build (linux-64, on Python 3.8) and it seems to be working fine (make sure that the local environment for the build is included in conda channels).

$ conda install --use-local pydataverse
$ conda list
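
Beyond checking `conda list`, one way to confirm the build is usable is to test that the module imports in the environment it was installed into. This is only a minimal sketch: `check_import` is a hypothetical helper (not part of conda or pyDataverse), and it assumes the interpreter found via `PYTHON` or `python3` is the one from the target environment.

```shell
# check_import MODULE: exit 0 if MODULE is importable, non-zero otherwise.
# Hypothetical helper for illustration; PYTHON is an optional override
# for the interpreter to use (assumed to default to python3 on PATH).
check_import() {
    "${PYTHON:-python3}" -c \
        'import importlib.util, sys; sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' \
        "$1"
}

# The import name is pyDataverse, even though the conda package is pydataverse.
if check_import pyDataverse; then
    echo "pyDataverse is importable"
else
    echo "pyDataverse is not importable in this environment"
fi
```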

And finally the package can be uploaded to Anaconda like so.

$ anaconda upload pydataverse-0.3.1-py_0.tar.bz2

I know you would probably want versions for different platforms and Python iterations. I'm happy to help with continuing this if needed. If this is something that you do not want to pursue then I'm also happy to make pydataverse as a conda package and distribute it myself... but I feel it is better if it comes from you guys!

pdurbin commented 1 year ago

@JR-1991 @skasberger what do you think?!?

Should we take up @PennyHow on her generous offer to make pyDataverse into a Conda package and distribute it herself?!?

@PennyHow if you go ahead with this, would we be able to switch the package to coming from Dataverse (or the GDCC) in the future?

@PennyHow are you aware that there is also EasyDataverse at https://pypi.org/project/easyDataverse/ ? @JR-1991 is the author.

pdurbin commented 1 year ago

@izahn probably has opinions about this stuff. He's pretty into Conda, last I checked! 😄

izahn commented 1 year ago

The conda community has largely consolidated around conda-forge as the go-to distribution channel for community-supported packages (i.e., builds not maintained by Anaconda Inc.). The conda-forge community is large and active, and has a lot of nice tooling for building packages for multiple Python versions, automatically creating pull requests for package updates, and more.

If you want to distribute your package through conda-forge (recommended by me!), the next step is to open a PR at https://github.com/conda-forge/staged-recipes, following the instructions in the README there. In the meta.yaml file of your package you can list all the co-maintainers -- since I'm familiar with conda-forge I would be happy to co-maintain it with you, feel free to tag me if you open a PR.

PennyHow commented 1 year ago

> @PennyHow if you go ahead with this, would we be able to switch the package to coming from Dataverse (or the GDCC) in the future?

You can, through the anaconda upload method I outlined previously. I have not looked into the approach from @izahn, but I think it should be possible.

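I'm happy to start this process using the conda-forge approach. The main things I need to know are:

  • What platforms do you want it distributed on? Windows, Linux, macOS, or all three?
  • What Python versions do you want it distributed on? For our package, we would like 3.8, 3.9 and 3.10, so just let me know if you want any extras.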

> @PennyHow are you aware that there is also EasyDataverse at https://pypi.org/project/easyDataverse/ ? @JR-1991 is the author.

No, I had not seen this before! Thanks for pointing it out - this may be a lightweight option that we could use in the future instead of pyDataverse. Would you prefer us to use this and focus our efforts on getting a conda distribution for EasyDataverse?

izahn commented 1 year ago

> > @PennyHow if you go ahead with this, would we be able to switch the package to coming from Dataverse (or the GDCC) in the future?
>
> You can, through the anaconda upload method I outlined previously. I have not looked into the approach from @izahn, but I think it should be possible.

I'm not sure what "coming from Dataverse" means, nor am I familiar with GDCC. In terms of conda packages, yes you can create and distribute packages through your own conda channel, but I don't recommend this. Joining forces and distributing your package in the conda-forge channel is recommended because it makes it easy to keep your package compatible with the rest of the vast array of packages already distributed through the conda-forge channel.

izahn commented 1 year ago

> • What platforms do you want it distributed on? Windows, Linux, macOS, or all three?
> • What Python versions do you want it distributed on? For our package, we would like 3.8, 3.9 and 3.10, so just let me know if you want any extras.

This package probably can/should be noarch as documented in https://conda-forge.org/docs/maintainer/knowledge_base.html#noarch-python . This will simplify maintenance since it only has to be built once and doesn't need separate versions for each OS or Python version.
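
For reference, a noarch Python recipe usually expresses this in the `build` section of meta.yaml. The fragment below is a sketch of the general conda-forge pattern, not the actual pydataverse recipe; the build number, the pip install script line and the `>=3.6` lower bound on Python are assumptions for illustration.

```yaml
build:
  noarch: python          # build once, install on any OS and Python version
  number: 0               # assumed build number
  script: "{{ PYTHON }} -m pip install . -vv"

requirements:
  host:
    - python >=3.6        # assumed lower bound, adjust to the package
    - pip
  run:
    - python >=3.6
```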

PennyHow commented 1 year ago

I have been playing around with it and think I have a staged recipe of pyDataverse ready to open a PR to https://github.com/conda-forge/staged-recipes. Of course, I will wait until you have come to an agreement @pdurbin @skasberger @JR-1991. It should be easy enough to do this with easyDataverse instead/in addition if you wish.

The main thing is the meta.yaml. Feel free to take a look and edit if needed. The GDCC organisation is listed as a maintainer (although I need to check that organisations can be linked) along with myself and @izahn.

pdurbin commented 1 year ago

@PennyHow thanks! I don't know Conda but maybe @izahn can take a look at that yml file. It seems reasonable to me (but yeah, I dunno if orgs can be maintainers).

@JR-1991 any thoughts on having EasyDataverse in Conda?

JR-1991 commented 1 year ago

@pdurbin that would definitely also make sense for EasyDataverse. I would add it once the new version is out 🙂

izahn commented 1 year ago

@PennyHow https://github.com/PennyHow/staged-recipes/blob/pydataverse/recipes/pydataverse/meta.yaml looks good, though I don't think organizations can be maintainers, only individuals. I definitely think this is ready to be submitted to https://github.com/conda-forge/staged-recipes . Conda-forge volunteers will review it and let us know if there are any issues.

PennyHow commented 1 year ago

Thanks @izahn. I'll edit the meta.yaml file and take out the organisation as a maintainer. @pdurbin, would you like to be listed as a maintainer instead? Just so that there is some representation from the organisation in the maintainers list?

pdurbin commented 1 year ago

@PennyHow oh sure, you can throw me on there. I'll try not to break anything. 😅 Thanks!
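
With the organisation removed, the maintainers section of meta.yaml would presumably look something like the fragment below. This is a sketch based on the names mentioned in this thread, using the `extra: recipe-maintainers` key that conda-forge recipes use; the final recipe may differ.

```yaml
extra:
  recipe-maintainers:     # GitHub usernames of individuals, not organisations
    - PennyHow
    - izahn
    - pdurbin
```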

PennyHow commented 1 year ago

It works! pyDataverse can now be installed as follows:

$ conda install pydataverse -c conda-forge

@pdurbin, would you like me to make a first attempt PR for updating the pyDataverse documentation to include the conda install option?

pdurbin commented 1 year ago

Nice! Yes, I see it here: https://anaconda.org/conda-forge/pydataverse

@PennyHow sure! Yes, it would be nice to point out in the pyDataverse docs that the package is now available on Conda! Thanks!

PennyHow commented 1 year ago

PR is here https://github.com/gdcc/pyDataverse/pull/160