kzwkt / wnd-charm

Automatically exported from code.google.com/p/wnd-charm

Overriding unbalanced classifier isn't working correctly. #14

GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago

The functionality to override the balanced class requirement isn't working as 
advertised. The help note says:

i[#]N - Set a maximal number of training images (for each class). If the '#' is 
specified then the class is ignored if it doesn't have at least N samples.
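As I read that note, `-i10` would cap training at 10 images per class, while `-i#10` should additionally ignore any class that has fewer than 10 samples; so something like `wndchrm test -i#10 my_set.fit out.html` (placeholder file names) ought to train on at most 10 images per class and skip the under-populated classes entirely.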

I have uploaded a .fit file that contains around 20 classes, most with 10 images but some with fewer, like 3 and 6 images. It's not clear to me how one would run a wndchrm test using only the 10-image classes. I tried 
`wndchrm test -i#9 ...`, but it gave me the unbalanced-classes error and exited. 
I also tried specifying both -i#9 and -r#0.9, trying to tell wndchrm to use only the classes with at least 9 images and then to use 90% of the images in those classes for training (see output below).

How would one go about performing the test I wanted to perform?

$ wndchrm_debug test -n50 -i#9 -r#0.9 
individual_monkey_classifier_type_I_muscle_fiber.fit full_set_WND_r82.html
Processing training set 'individual_monkey_classifier_type_I_muscle_fiber.fit'.
----------
Summary of 'individual_monkey_classifier_type_I_muscle_fiber.fit' (229 samples 
total, 1 samples per image):
'Class label' number of samples.
'101A'  11
'171A'  10
'3883'  10
'4012'  10
'4051'  6
'4065'  10
'4264'  10
'4265'  10
'74A'   9
'74C'   10
'850'   10
'BA33'  10
'D718'  3
'H2'    10
'H35'   10
'J105'  10
'J84'   10
'J89'   10
'P8A'   10
'U44'   10
'W28'   10
'WO1'   10
'WO3'   10
'Y24'   10
----------
ERROR: Specified training images (9) exceeds maximum for balanced training (2). 
 No images left for testing.
  Use -iN with N < 3.
  Or, use -rN with 1.0 < N > 0.0
  Or, use -r#N to over-ride balanced training.
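The "(2)" in that error presumably comes from the smallest class: 'D718' has only 3 samples, and balanced training has to leave at least one image per class for testing, so at most 2 training images per class are allowed. A minimal sketch of that reasoning, using hypothetical names rather than the actual wndchrm code:

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Hypothetical check: with balanced training every class contributes the
// same number of training images, so the cap is the smallest class size
// minus one (at least one image per class must remain for testing).
int max_balanced_training(const std::vector<int>& class_counts) {
    int smallest = *std::min_element(class_counts.begin(), class_counts.end());
    return smallest - 1;
}

int main() {
    // Class sizes from the summary above (24 classes, 229 samples total;
    // the smallest is 'D718' with 3).
    std::vector<int> counts = {11, 10, 10, 10, 6, 10, 10, 10, 9, 10, 10, 10,
                               3, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10};
    std::printf("maximum for balanced training: %d\n",
                max_balanced_training(counts));  // prints 2
    return 0;
}
```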

Original issue reported on code.google.com by christop...@nih.gov on 1 Mar 2011 at 12:21


GoogleCodeExporter commented 9 years ago
Resolved and committed.
Using -i# now removes the small classes early, before the train/test parameters 
(-i, -j, -r) are verified. This code was removed from TrainingSet::split and 
moved into wndchrm.cpp.
With -i#, -r (with or without the '#') is active again and specifies the split 
ratio; normally, -i overrides -r.
You should be able to use either balanced or unbalanced splitting together with -i#.
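A minimal sketch of the early filtering described above, with hypothetical names (this is not the actual wndchrm source):

```cpp
#include <string>
#include <vector>

// Hypothetical class record: a label and its sample count.
struct ClassEntry {
    std::string label;
    int n_samples;
};

// Sketch of the -i#N behaviour: drop any class with fewer than N samples
// up front, so the later balanced/unbalanced split validation only sees
// the classes that will actually be used.
std::vector<ClassEntry> drop_small_classes(const std::vector<ClassEntry>& classes,
                                           int min_samples) {
    std::vector<ClassEntry> kept;
    for (const ClassEntry& c : classes) {
        if (c.n_samples >= min_samples)
            kept.push_back(c);
    }
    return kept;
}
```

With that in place, the original invocation, `wndchrm test -n50 -i#9 -r#0.9 individual_monkey_classifier_type_I_muscle_fiber.fit full_set_WND_r82.html`, should first drop 'D718' (3 samples) and '4051' (6 samples) and then apply the 0.9 split ratio to the remaining classes.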

Original comment by i...@cathilya.org on 2 Mar 2011 at 4:15