Closed rmandelb closed 6 years ago
Without looking at any of the specific code that raised this issue, my general preference would be for one class, but with a design that allowed subclasses to be used for certain values of the keyword arguments, if necessary. That is, we'd have:
I will say--as the designer of the current implementation :)--that I was also thinking of loops, but an order more like:
sys_tests = [CorrelationThing1(), CorrelationThing2(), HistogramThing3(), ...]
for sys_test in sys_tests:
sys_test(data)
which, as long as we have a unity of interface (which we do except for the StatSysTests
), seems like the simpler thing to me if you're running many tests. Of course, this structure also works if we use your keyword idea, but pass the keywords to the constructor, not the analysis call:
sys_tests = [CorrelationThing1(), CorrelationThing2()]
for plot_type in ['value', 'residual']:
for quantity in ['g1', 'g2', 'sigma']:
sys_tests.append(WhateverWeCallThisThing(plot_type=plot_type, quantity=quantity))
for sys_test in sys_tests:
sys_test(data)
which, at the moment, we have to do for the StatSysTest
s anyway (I don't think there's a way around that, given how many possible options there are for that). This perhaps seems something like an object-oriented approach vs a functional approach?
Just saw @TallJimbo's comment, will respond after group meeting...
I'm fine with a design with just one (user-facing) constructor for each broad class of SysTest, if that's what people would like. It doesn't necessarily make things easier for the correlation function sys tests (where I think we'll always be stuck with hard-to-remember lists of names somewhere--I'm fine if it's keywords rather than class names) but will probably be easier for histograms, scatter plots and whisker plots.
Leaving a note here that I've started some work on this but not finished it yet; also, I will implement a better solution to the problem of chip vs sky coordinates in the HSC module that @HironaoMiyatake raised in #29 and I made a shoddy fix for in that issue.
Hi all,
We never came to a final decision on this, so I thought I'd ping @HironaoMiyatake, @rmandelb and @TallJimbo again (as well as anyone else who's interested) and see what people think.
I'm somewhat persuaded that using one class name with keywords to make our tests, rather than using whole separate classes, is the way to go. This comes from a few design things that have either changed since our last discussion, or that I have realized make more sense this way:
CorrelationFunction
classes. If we have one general class per test type, then a glance at the docs for that class will have a list of the possible types and what they mean. Otherwise, it will be a combination of tab completes, quickstart documentation (when we make that) and remembering the underlying base class names which might never be used otherwise.test_list = [CorrelationFunctionSysTest(type='BrightStarShear'), CorrelationFunctionSysTest(type='GalaxyStarShear'), ...]
for test in test_list:
results = test(data)
which isn't really a huge benefit either way. (You can do this as well:
test_list = ['BrightStarShear', 'GalaxyStarShear', ...]
for test in test_list:
results = CorrelationFunctionSysTest(type=test)(data)
but that doesn't save the object for later use if it has other methods.)
tests:
CorrelationFunctionSysTests:
- BrightStarShear
- GalaxyStarShear
ScatterPlotSysTests:
- StarVsPSFG1
- StarVsPSFG2
I guess, actually, we could have that kind of input even now, but this way the config options would parallel the scripting setup.
Basically, I think we ended up (by necessity) with enough interface differences between types of SysTest that it makes sense to make that explicit, by having different constructors for each type, rather than having the same type of constructor for each subclass but having the diversity of created interfaces.
Thoughts from everyone else? @HironaoMiyatake, you've probably used the code in its current form the most--is this going to mess up what you've already done?
As a user of the current code form :), I vote for different constructors for different types of SysTest, as they are already different.
It might be nice to have some nested config form as you showed. To specify the details of a specific type of SysTest, it might be easier for user to use comma-separated form something like, e.g., for scatter plot it can be a list such as ["value,g1", "value,g2", "value,sigma", "residual,g1", "residual,g2", "residual, sigma"]
.
For making test_list
, it would be nice to use a factory class as Jim suggested, since you can get all the instance just by a for loop over different SysTest and details of each SysTest.
Given that I'm now pretty out of touch with this issue (and indeed most of Stile) I'm happy to defer to those of you who are more familiar with it, unless there's something you really don't like about your preferred solution and want another opinion.
As discussed on pull requests #29 and #30 , there is a stylistic issue that we should decide on sooner rather than later:
When making test classes for the kinds of tests that might be done with a bunch of different quantities, do we (a) Make separate tests for each of the different quantities (as for the correlation function tests, whisker plots, and scatter plots), or (b) Make a single test that has keyword arguments to decide which quantity is being tested.
My gut reaction to these PRs was to prefer something like (b). I don't like having 10 (or whatever) different class names to juggle. A single class seems cleaner to me, and would enable code like:
instead of making 6 separate calls to classes with different names.
However:
Thoughts? @TallJimbo , I'm not sure if you saw the discussion on the PR, so I wanted to bring this to your attention too...