scikit-hep / probfit

Cost function builder. For fitting distributions.
http://probfit.readthedocs.io/
MIT License
51 stars 30 forks source link

Make matplotlib dependency optional #76

Closed alexpearce closed 6 years ago

alexpearce commented 8 years ago

In the README, matplotlib is listed as an optional dependency , but currently probfit won't import without matplotlib being installed. This makes matplotlib a mandatory dependency.

This is now the case because the __init__ imports modules that themselves import matplotlib. There are two solutions I can think of.

  1. Don't import the offending modules in the __init__; or
  2. Move the matplotlib imports into the methods that use them. This means changing a module like
    
    import matplotlib.pyplot as plt

def foo(): return plt.figure()

to something like
```python
def foo():
    import matplotlib.pyplot as plt
    return plt.figure()

Note that option 1 will break the probfit 1.0.5 API, as users could no longer do, for example from probfit import draw_compare_hist, but rather from probfit.plotting import …. Seeing as we're not contracted by semantic versioning or anything, this isn't technically a problem, but it's something to consider. (We could wrap the plotting import in a try/except I guess, but it's a little ugly, as then the API changes depending on what's installed.)

Module-by-module

The offending modules are oneshot, toy, and plotting.

oneshot

I've not used this module. I guess the purpose is to allow a single call to do the fitting and plotting (to do things in 'one shot'). I don't understand the distinction between the fit_ series of methods and the try_ series.

Personally I don't know how useful having this is, you save a couple of lines at the expense of clarity. So, here I'm in favour of option 1; remove the oneshot import from the __init__. If people want a quick fitting-and-plotting method, they'll want matplotlib and can import probfit.oneshot themselves.

toy

Has a single method, gen_toy, that has a block that plots the PDF on top of the generated data, if the quiet kwarg is False (defaults to True). I am in favour of moving this logic to plotting, as a compare_pdf_to_data method or something.

If a user wants to do the comparison, they would then do something like:

from probfit import gen_toy
from probfit import plotting

toy_data = gen_toy(f, nsample, bound)
plotting.compare_pdf_to_data(f, toy_data)

plotting

I would really rather go with option 1, keeping the matplotlib import at the module level, and just not importing plotting in the __init__. I think this makes the most sense. matplotlib is listed as an optional dependency, and for plotting probfit uses matplotlib. No mpl, no plotting, so it doesn't make sense to expose the plotting methods at the probfit.<whatever> level.

The probfit API will then no longer container probfit.<some method from plotting>.

cdeil commented 8 years ago

I think the number of probfit users is small, and backward-compat is not a big concern. Still, if we do break it, my suggestion would be to use semver and make the next release 2.0 instead of 1.1.

My preference would be to keep exposing all functionality in the top-level probfit namespace instead of introducing new sub-packages probfit.oneshot and probfit.plotting. It's a small package, and having everything in one namespace is easier to remember or find things for users. It's also easy to implement by putting the matplotlib imports in the functions. The project I'm most familar does it the same way, they even go to great lengths that everything can be imported without matplotlib and just by importing something, pyplot doesn't start (which pops up a GUI window for the default interactive backend): https://github.com/astropy/astropy/blob/master/astropy/visualization/mpl_normalize.py#L16

There should then be a travis-ci build that doesn't have MPL and plotting tests need to get a skip decorator like we discussed in another issue yesterday.

@alexpearce - You have a preference for introducing probfit.oneshot and probfit.plotting, right?

So it's 1:1 and we need at least a third opinion. :-)

@piti118 - Please comment on this one!

alexpearce commented 8 years ago

You have a preference for introducing probfit.oneshot and probfit.plotting, right?

Well, it's one solution. I wouldn't enjoy having a lot of import matplotlib duplication, partly out of tidiness and partly for semantics. Having every method in plotting import matplotlib just… feels gross 😄

And yes, I suppose in general I'm fine with having submodules, particularly in this case when it actually ring-fences an optional subset of the functionality (things that are plotting related).

cdeil commented 8 years ago

@alexpearce - My suggestion would be that you wait a day or two for comments here, and if you get none, implement what you think best.

So if you do introduce probfit.plotting, you'll have to change the Sphinx autodoc API to show that, and as I said, I'd suggest to just change the milestone and next release version to 2.0 instead of 1.1.

eduardo-rodrigues commented 6 years ago

Hi @alexpearce, @cdeil, you may have seen that we dealt with this pragmatically, as some users could not in fact install on a Mac's virtual env because of these global imports. It's probably a good idea to review these and other matters at some point. For now let me close this as it's effectively sorted out. Non-trivial improvements can indeed come in a version 2.0.