covartech / PRT

Pattern Recognition Toolbox for MATLAB
http://covartech.github.io/
MIT License

Feature Request: Method for obtaining the highest/lowest confidence observations from each target class #19

Open jmmalo03 opened 11 years ago

jmmalo03 commented 11 years ago

After running/training a classifier on some observations for a binary decision problem, I often like to quickly extract the "easiest" and "most difficult" observations from each target class. In other words, I would like a method (or methods) that will quickly provide me with:

(1) The 'n' observations with the largest decision statistic from the positive class
(2) The 'n' observations with the lowest decision statistic from the positive class
(3) The 'n' observations with the largest decision statistic from the negative class
(4) The 'n' observations with the lowest decision statistic from the negative class

Alternatively, it would be nice to have a single method that independently sorts the observations under each target class according to their decision statistics.

peterTorrione commented 11 years ago

As spec'd out, something to get (1)-(4), I don't think this should be a method of prtDataSetClass, and it probably shouldn't be a method of prtClass or prtAction.

It shouldn't be a method of prtDataSetClass because it makes some assumptions, e.g., that you have only one feature, that you have a "positive" and a "negative" class, etc.

If you have those circumstances, there's at least one quick way to do this:

    % Example: split into H0 and H1, each sorted by yOut confidence
    yOut = classifier.run(ds);
    [sorted, inds] = sort(yOut.X);
    dsSort = ds.retainObservations(inds);  % sort the data set
    dsSort0 = dsSort.retainClasses(0);
    dsSort1 = dsSort.retainClasses(1);

Now, the first N of dsSort0 are the easy H0, the last N are hard, and vice-versa for dsSort1.
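To make the "first N / last N" indexing concrete, here is a minimal sketch in base MATLAB that does not use PRT at all. The `scores` and `targets` vectors are made-up stand-ins for `yOut.X` and the data set's class labels, and the names `easyH0`, `hardH0`, `hardH1`, and `easyH1` are only illustrative. It assumes each class has at least `n` observations.

```matlab
% Made-up stand-ins (assumptions, not values from this thread):
scores  = [0.9; 0.1; 0.4; 0.8; 0.3; 0.7];   % decision statistics, one per observation
targets = [1;   0;   1;   0;   0;   1];     % class labels: 0 = H0, 1 = H1
n = 1;                                       % how many observations to take from each end

[~, inds] = sort(scores);                    % ascending by decision statistic
inds0 = inds(targets(inds) == 0);            % H0 observations, lowest score first
inds1 = inds(targets(inds) == 1);            % H1 observations, lowest score first

easyH0 = inds0(1:n);                         % H0 with the lowest scores
hardH0 = inds0(end-n+1:end);                 % H0 with the highest scores
hardH1 = inds1(1:n);                         % H1 with the lowest scores
easyH1 = inds1(end-n+1:end);                 % H1 with the highest scores
```

The resulting index vectors refer to rows of the original data, so they could presumably be passed to `ds.retainObservations` in the same way as `inds` in the example above.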

One way to put some of these together might be a "sortBy" method, e.g.:

    ds = ds.sortBy(sortVector,'withinClass',true);

So, you could do:

    yOut = classifier.run(ds);
    ds = ds.sortBy(yOut.X,'withinClass',true);

But that doesn't actually save a whole ton of code...?
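For comparison, a within-class ordering can be sketched in a single base MATLAB call. This is only an illustration of the idea, not an existing PRT method: `scores` and `targets` are again made-up stand-ins for `yOut.X` and the class labels, and `perm` is a hypothetical name.

```matlab
% Made-up stand-ins (assumptions, not values from this thread):
scores  = [0.9; 0.1; 0.4; 0.8; 0.3; 0.7];   % decision statistics
targets = [1;   0;   1;   0;   0;   1];     % class labels: 0 = H0, 1 = H1

% Sort rows by class label first, then by score within each class.
% The second output is the permutation of the original row indices.
[~, perm] = sortrows([targets, scores]);

sortedTargets = targets(perm);               % all H0 rows, then all H1 rows
sortedScores  = scores(perm);                % ascending within each class
```

Under these assumptions, `perm` plays the role of `inds` in the earlier example and could presumably be handed to `ds.retainObservations` to reorder the data set.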

For now I don't see a super good reason to make a method that does the code in the example above...