separate test classes vs. a single unified class with keywords

rmandelb commented 10 years ago

As discussed on pull requests #29 and #30 , there is a stylistic issue that we should decide on sooner rather than later:

When making test classes for the kinds of tests that might be done with a bunch of different quantities, do we (a) Make separate tests for each of the different quantities (as for the correlation function tests, whisker plots, and scatter plots), or (b) Make a single test that has keyword arguments to decide which quantity is being tested.

My gut reaction to these PRs was to prefer something like (b). I don't like having 10 (or whatever) different class names to juggle. A single class seems cleaner to me, and would enable code like:

for plot_type in ['value','residual']:
    for quantity in ['g1', 'g2', 'sigma']:
         WhateverWeCallThisThing(quantity=quantity, plot_type=plot_type, ...)

instead of making 6 separate calls to classes with different names.

However:

I know this is a matter of taste, so others may not agree.
There is a matter of "where is it important for us to put in effort right now?", and I suspect most of you will say, with good justification, that it's more important to continue enabling new functionality rather than refactoring existing functionality. (This could change how we put in new stuff, though, and we could plan to refactor old stuff later if we want.)
I'm not sure if there is any aspect of my suggestion that would not play nicely with config scripts, or with the HSC pipeline connection.
We could imagine enabling both modes of calling the tests, i.e., have both a general class and the more specific ones, where the general one would parse the keywords to call the appropriate specific one. This would violate the whole "only one way to do things" plan.

Thoughts? @TallJimbo , I'm not sure if you saw the discussion on the PR, so I wanted to bring this to your attention too...

TallJimbo commented 10 years ago

Without looking at any of the specific code that raised this issue, my general preference would be for one class, but with a design that allowed subclasses to be used for certain values of the keyword arguments, if necessary. That is, we'd have:

one base class, which works for almost every case (maybe every case)
derived classes for specializations, as needed.
a "factory" (free function or static method on the base class) which creates an instance of the appropriate class given the keyword arguments. The class constructors would only ever be called directly by this factory function.

msimet commented 10 years ago

I will say--as the designer of the current implementation :)--that I was also thinking of loops, but an order more like:

sys_tests = [CorrelationThing1(), CorrelationThing2(), HistogramThing3(), ...]
for sys_test in sys_tests:
    sys_test(data)

which, as long as we have a unity of interface (which we do except for the StatSysTests), seems like the simpler thing to me if you're running many tests. Of course, this structure also works if we use your keyword idea, but pass the keywords to the constructor, not the analysis call:

sys_tests = [CorrelationThing1(), CorrelationThing2()]
for plot_type in ['value', 'residual']:
    for quantity in ['g1', 'g2', 'sigma']:
        sys_tests.append(WhateverWeCallThisThing(plot_type=plot_type, quantity=quantity))
for sys_test in sys_tests:
    sys_test(data)

which, at the moment, we have to do for the StatSysTests anyway (I don't think there's a way around that, given how many possible options there are for that). This perhaps seems something like an object-oriented approach vs a functional approach?

Just saw @TallJimbo's comment, will respond after group meeting...

msimet commented 10 years ago

I'm fine with a design with just one (user-facing) constructor for each broad class of SysTest, if that's what people would like. It doesn't necessarily make things easier for the correlation function sys tests (where I think we'll always be stuck with hard-to-remember lists of names somewhere--I'm fine if it's keywords rather than class names) but will probably be easier for histograms, scatter plots and whisker plots.

msimet commented 10 years ago

Leaving a note here that I've started some work on this but not finished it yet; also, I will implement a better solution to the problem of chip vs sky coordinates in the HSC module that @HironaoMiyatake raised in #29 and I made a shoddy fix for in that issue.

msimet commented 9 years ago

Hi all,

We never came to a final decision on this, so I thought I'd ping @HironaoMiyatake, @rmandelb and @TallJimbo again (as well as anyone else who's interested) and see what people think.

I'm somewhat persuaded that using one class name with keywords to make our tests, rather than using whole separate classes, is the way to go. This comes from a few design things that have either changed since our last discussion, or that I have realized make more sense this way:

In either case, we will need lists of fairly random keywords for e.g. the CorrelationFunction classes. If we have one general class per test type, then a glance at the docs for that class will have a list of the possible types and what they mean. Otherwise, it will be a combination of tab completes, quickstart documentation (when we make that) and remembering the underlying base class names which might never be used otherwise.
Otherwise, for scripting, this is possibly a little more annoying, or at least not a large benefit: in my dumb example above, we'd have

test_list = [CorrelationFunctionSysTest(type='BrightStarShear'), CorrelationFunctionSysTest(type='GalaxyStarShear'), ...]
for test in test_list:
    results = test(data)

which isn't really a huge benefit either way. (You can do this as well:

test_list = ['BrightStarShear', 'GalaxyStarShear', ...]
for test in test_list:
    results = CorrelationFunctionSysTest(type=test)(data)

but that doesn't save the object for later use if it has other methods.)

For the config-based drivers (including the HSC drivers, which are functionally config-based to the end user) we could now have input like:

tests:
    CorrelationFunctionSysTests:
        - BrightStarShear
        - GalaxyStarShear
    ScatterPlotSysTests:
        - StarVsPSFG1
        - StarVsPSFG2

I guess, actually, we could have that kind of input even now, but this way the config options would parallel the scripting setup.

Basically, I think we ended up (by necessity) with enough interface differences between types of SysTest that it makes sense to make that explicit, by having different constructors for each type, rather than having the same type of constructor for each subclass but having the diversity of created interfaces.

Thoughts from everyone else? @HironaoMiyatake, you've probably used the code in its current form the most--is this going to mess up what you've already done?

HironaoMiyatake commented 9 years ago

As a user of the current code form :), I vote for different constructors for different types of SysTest, as they are already different.

It might be nice to have some nested config form as you showed. To specify the details of a specific type of SysTest, it might be easier for user to use comma-separated form something like, e.g., for scatter plot it can be a list such as ["value,g1", "value,g2", "value,sigma", "residual,g1", "residual,g2", "residual, sigma"].

For making test_list, it would be nice to use a factory class as Jim suggested, since you can get all the instance just by a for loop over different SysTest and details of each SysTest.

TallJimbo commented 9 years ago

Given that I'm now pretty out of touch with this issue (and indeed most of Stile) I'm happy to defer to those of you who are more familiar with it, unless there's something you really don't like about your preferred solution and want another opinion.

msimet / Stile

separate test classes vs. a single unified class with keywords #32