kzwkt / wnd-charm

Automatically exported from code.google.com/p/wnd-charm

If the user specifies a test set fit, the entire training set fit should be used for training #11

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago

wndchrm still uses the default split ratio of 0.75 even when the user specifies 
a second .fit file for testing.

Workaround: specify -r1.0, e.g.:
wndchrm test -r1.0 training_set.fit training_set.fit

Original issue reported on code.google.com by christop...@nih.gov on 18 Feb 2011 at 6:41

GoogleCodeExporter commented 9 years ago
I don't think this is a bug; at the very least, this functionality was intentional. There is no reason to preclude testing with random splits if you specify a test set. All of the samples in the test set will be used for *testing*, but the *training* will still be done using the command-line parameters.

What you're talking about is the default for "classify". There, the default ratio is 1.0 (the exact equivalent of -r1, but not of -r#1, which would allow unbalanced training).
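Concretely (hypothetical file names, and assuming classify takes the same training-set/test-set arguments as test), the two invocations below are not the same by default:

  wndchrm test training_set.fit test_set.fit      # training: random split at the default ratio of 0.75
  wndchrm classify training_set.fit test_set.fit  # training: the default ratio of 1.0 (same as -r1)

Both use every sample of test_set.fit for testing; they differ only in how the training samples are chosen.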

Still don't know how to mark these as "not a bug".

Original comment by i...@cathilya.org on 18 Feb 2011 at 2:55

GoogleCodeExporter commented 9 years ago
Technically, -r1 is not necessarily equivalent for "test" and "classify"; they match only if your training set is already balanced. If it's not balanced, then classify -r1 will choose the *first* images in each class of the training set until all the classes are balanced, while test -r1 will select images *at random*. So only test -r#1 and classify -r#1 are exactly equivalent for unbalanced training sets. A lower -r for test will be a random selection, while for classify it will again be in sample order as given in the .fit, file-of-files, etc.
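A sketch of those equivalences (hypothetical file names; the -r/-r# syntax as above):

  wndchrm test -r1 training_set.fit test_set.fit       # balanced training, samples chosen at random within each class
  wndchrm classify -r1 training_set.fit test_set.fit   # balanced training, the first samples of each class
  wndchrm test -r#1 training_set.fit test_set.fit      # every sample of every class
  wndchrm classify -r#1 training_set.fit test_set.fit  # every sample of every class, identical to test -r#1

The first two select the same samples only if the training set is already balanced; the last two are always equivalent because nothing is left out.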

Original comment by i...@cathilya.org on 18 Feb 2011 at 3:28

GoogleCodeExporter commented 9 years ago
I don't feel comfortable changing the behavior of this functionality 
unnecessarily. Yes, the use of classify "seems" more logically elegant to us. 
But changing behavior on users between releases is jarring. I was jarred and 
pissed off that the thing didn't work as before. You said it yourself: users 
make great developers. The default should be -r1.0 in this particular use case, 
with the option to specify otherwise.

Original comment by christop...@nih.gov on 18 Feb 2011 at 3:36

GoogleCodeExporter commented 9 years ago
Checking wndchrm 1.27, you are right. By a bit of convoluted code, the side effect is that test with a testset does use all of the training samples as well as all of the test samples. It is not documented anywhere that this is what will happen. In fact, we can't guarantee that it will happen unless the user made a balanced training set or specified -r#.

Isn't using test this way just a side effect of having a dysfunctional classify in the previous release?
I am still reluctant to change the logic until I can see that we can both be compatible with previous behavior and maintain logical consistency. This has to be reflected in the various parameter options, the documentation, and eventually the code. We're not there yet, and I don't want to introduce an inconsistency solely for the benefit of having the same behavior as before. I'm not saying it's not possible - I'm saying that it's going to take more than simply adding an exception that amounts to "when I say test with a testset, what I really mean is classify". How do I go about seeing the effect of random splits on a specified testset?
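(Under the current logic that's just a matter of passing a fractional -r, e.g., with hypothetical file names:

  wndchrm test -r0.5 training_set.fit test_set.fit   # random 50% training splits, each evaluated against test_set.fit

which is exactly the kind of run the old behavior made impossible.)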

Original comment by i...@cathilya.org on 19 Feb 2011 at 3:28

GoogleCodeExporter commented 9 years ago
OK, how about a warning issued if the command is 'test', a testset was specified, and -r is less than 1 because of defaults or CLI options? Something like this:

WARNING: Change from previous versions when specifying a testset with the test 
command.
  Previously the entire dataset was used for training (balanced or not) when a testset was specified,
  making it impossible to test random splits on a specified testset.  This was undocumented behavior.
  If the old behavior is desired, use the classify command.

Would that have saved you a headache if this appeared at the end of your run?

Original comment by i...@cathilya.org on 19 Feb 2011 at 3:47

GoogleCodeExporter commented 9 years ago
Honestly, this is a sticky issue but ultimately not that important. For now, I committed code that issues the warning above. It's not the end of the world to change the default -r when specifying a testset - I could live with that *if* it were prominently documented. Generally, I'm not a big fan of software doing things behind my back, and I would not expect this change in default just from specifying a testset. I would expect it to behave the same as if I did not specify one. It would also be awkward to prominently document a change to previously undocumented behavior.

If we are to be very serious about consistency with previous behavior, we have to keep in mind that even with a default -r of 1, the behavior will be the same as before if and only if the dataset is balanced to begin with. And this is true regardless of specifying a testset.

Original comment by i...@cathilya.org on 19 Feb 2011 at 4:46

GoogleCodeExporter commented 9 years ago
Man, this stupid thing just won't go away. Why, oh why?
The warning stinks, because it always comes up and it's annoying.
The previous behavior did not comply with the previous documentation!
There was no indication that the split ratio would be changed for you behind the scenes if you specified a test set.
The fact that it did, while the documentation implied that it wouldn't, is a bug!
Relying on buggy undocumented behavior is generally a bad idea.

I'm sorry that it caused you a headache (I really am!), but I can't see how 
being consistent with previous undocumented, buggy behavior is a solution.

Let's ask other users: "When doing test with a specified test set, do you expect all of the training set samples to be used even if you didn't say so, or do you expect the fraction to be the same as if you didn't specify a testset?" This must be asked in the context of understanding what the classify command is for.

If the consensus answer comes back as "I expect it to use all of the training set when I specify a testset, even though I'm aware of what the classify command does," then we change the -r default (and document it!!). Otherwise we don't. If there is no consensus, then simplicity prevails: we do not change the defaults behind the user's back.

Original comment by i...@cathilya.org on 19 Feb 2011 at 5:49

GoogleCodeExporter commented 9 years ago

Original comment by i...@cathilya.org on 15 Apr 2011 at 7:02