dkirkby / MachineLearningStatistics

Machine learning and statistics for physicists
BSD 3-Clause "New" or "Revised" License
94 stars 53 forks source link

Possibility of making MLS a PyPI package? #3

Closed matthewfeickert closed 5 years ago

matthewfeickert commented 5 years ago

@dkirkby Would you be open to the idea of making the mls package in this repo its own standalone PyPI package so it could be installed independently of the course? Or do you really want to keep the mls code as part of the course itself and not as a separate repo?

If you're open to the idea I'd be willing to help with any migration.

dkirkby commented 5 years ago

I'm not sure how useful the package is independent of the notebooks, but it should already be possible to pip install it directly from git, e.g.

pip install git+https://github.com/dkirkby/MachineLearningStatistics

Alternatively, is there some particular functionality that would be worth spinning off into a more focused pip pkg?

matthewfeickert commented 5 years ago

@dkirkby

it should already be possible to pip install it directly from git

Indeed, that of course works well. However with regards to

there some particular functionality that would be worth spinning off into a more focused pip pkg?

I would say there are two main reasons:

  1. If other people (e.g. Mark and I) would like to contribute to some of the functionality of the mls package for similar courses to yours it would be nice if there was a way to centrally do this rather then to develop on one course and then try to figure out how those commits would transfer to another course. If there was a single Git repo with the package then contributions could be made and deployed to PyPI very easily.
  2. A very nice way to handle distribution of the course environment for people who are already using Conda is to create a conda environment.yml. This makes the installation easier for students as there is just a single command to get the whole environment. However, if there are packages that are not in Conda cloud, conda-forge, or PyPI that have been added to the environment that isn't reflected in the environment.yml and so those package can't get installed directly with all the others. It would be very nice if the entire course environment could be in a single environment.yml which is also nicer for maintaining dependencies.
matthewfeickert commented 5 years ago

@dkirkby It seems like some nice work has been done on Conda in the last year as I just found this PR for adding full pip syntax support to conda create from a environment.yml. So now it is possible in such a environment.yml to have syntax of the form (this is just a dumb example I'm making up)

name: TESTENV
channels:
  - defaults
dependencies:
  - python=3.6
  - pip>=18.0
  - setuptools<=39.1.0 # for tensorflow
  - wheel>=0.31.1
  - numpy
  - scipy
  - pandas
  - matplotlib
  - pip:
    - tensorflow
    - -e git+https://github.com/dkirkby/MachineLearningStatistics#egg=mls

So I guess the question now is just if you would want external contributions, and if so if you would wan them to just make PRs to this repo even if they aren't course related in any other way beyond the mls package, or if you would want those contributions coming from a fork of a specific mls repo.

dkirkby commented 5 years ago

@matthewfeickert I added an environment.yml a few weeks ago, but I decided to keep the pytorch and tensorflow installs as separate steps since there are some variations in package names depending on your OS and whether you have a GPU available.

I like your idea of adding the mls package to environment.yml, which I hadn't thought of.

External contributions are certainly welcome. PRs to this repo are good, and we can revisit whether to split off mls as we go.

matthewfeickert commented 5 years ago

@dkirkby Sorry to have fallen off on this! My GitHub has had a lot of other projects with activity lately.

Yes, after I added that comment I noticed that you had added the enviornment.yml on the 2nd (my bad).

External contributions are certainly welcome. PRs to this repo are good, and we can revisit whether to split off mls as we go.

Sounds good. :+1: I'll go ahead and close this issue for now as it seems everything is wrapped up here.