Hi Ana,
Something like the following should help:
ds1 = prtDataGenMarysSimpleSixClass;
ds2 = prtDataGenMarysSimpleSixClass;
dsTotal = catFeatures(ds1,ds2); % total of 4 features, 6 classes
nFolds = 3;
% Find the best 3 features using a KNN classifier:
knn = prtClassKnn;
featSel = prtFeatSelSfs('nFeatures',3,'evaluationMetric',@(ds)prtEvalPercentCorrect(knn,ds,nFolds));
featSel = featSel.train(dsTotal);
dsSelected = featSel.run(dsTotal);
plot(dsSelected) % Has the best 3 features!
Hi, I can't get what you suggested to work with my data :x This is what I have:
dataSet = prtDataSetClass(features_train,labels_train);
nStdRemove = prtOutlierRemovalNStd('runMode','removeObservation');
nStdRemove = nStdRemove.train(dataSet);
dataSetNew = nStdRemove.run(dataSet);
featSel = prtFeatSelSfs;             % Create a feature selection object
featSel.nFeatures = 3;               % Select three features of the data
featSel = featSel.train(dataSetNew); % Train the feature selection object
outDataSet = featSel.run(dataSetNew);
features_train is an nSamples x 7 matrix and labels_train is an nSamples x 1 vector.
My code fails on the last line with:
"Error using prtClass/determineMaryOutput (line 310) M-ary classification is not supported by this classifier. You will need to use prtClassBinaryToMaryOneVsAll() or an equivalent M-ary emulation classifier."
Hello,
I think you need to do two things:
1) Specify a classifier that can handle M-ary data (e.g., prtClassKnn)
2) Specify an evaluation that scores multi-class outputs (e.g., prtEvalPercentCorrect)
For example:
nFolds = 3; % as in the example above
knn = prtClassKnn;
featSel = prtFeatSelSfs('nFeatures',3,'evaluationMetric',@(ds)prtEvalPercentCorrect(knn,ds,nFolds));
featSel = featSel.train(dsTotal);
OK, I managed to do this. I used:
featSel = prtFeatSelSfs('nFeatures',nFeatures_used,'evaluationMetric',@(ds)prtEvalPercentCorrect(prtClassMap,ds));
featSel = featSel.train(dataSet);
outDataSet = featSel.run(dataSet);
My question now is: can I use this outDataSet like this, in order to test the classifier?
classifier_7 = prtClassMap + prtDecisionMap;
classifier_7 = classifier_7.train(outDataSet); % Train
classified_7 = run(classifier_7, dataSet_test);
Hi,
You need to also run:
OutDataSet_test = featSel.run(dataSet_test);
[...]
classified_7 = run(classifier_7, OutDataSet_test);
This applies the feature selection to your test data set; otherwise the two data sets will end up with different numbers of features.
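Putting the pieces from this thread together, a minimal sketch of the whole flow might look like this (it reuses the names from your code above, with prtClassMap as the evaluation classifier; the 3-feature setting is just an example):
featSel = prtFeatSelSfs('nFeatures',3,'evaluationMetric',@(ds)prtEvalPercentCorrect(prtClassMap,ds));
featSel = featSel.train(dataSet);              % learn which features to keep, on training data only
outDataSet = featSel.run(dataSet);             % training data reduced to the selected features
outDataSet_test = featSel.run(dataSet_test);   % test data reduced to the SAME features

classifier_7 = prtClassMap + prtDecisionMap;
classifier_7 = classifier_7.train(outDataSet); % Train
classified_7 = run(classifier_7, outDataSet_test);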
-Pete
What I actually did was this:
selectedFeatures = featSel.selectedFeatures;
dataSet_test = retainFeatures(dataSet,selectedFeatures);
classifier_7 = prtClassMap + prtDecisionMap;
classifier_7 = classifier_7.train(outDataSet); % Train
classified_7 = run(classifier_7, dataSet_test);
It's the same thing, right?
Yes, that looks right.
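If you want to double-check, the two approaches can be compared directly. This is just a quick sketch, assuming featSel has already been trained and dataSet_test still has all of the original features:
dsA = featSel.run(dataSet_test);                              % apply the trained feature selector
dsB = retainFeatures(dataSet_test, featSel.selectedFeatures); % keep the same features by index
isequal(dsA.getObservations, dsB.getObservations)             % expected to return true (logical 1)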
I keep getting this error:
Error using prtRvMvn/logPdf (line 184)
SIGMA must be symmetric and positive definite;

Error in prtRv/runAction (line 249)
DataSet = DataSet.setObservations(Obj.logPdf(DataSet));

Error in prtAction/run (line 250)
dsOut = runAction(self, dsOut);

Error in prtClassMap/runAction (line 119)
logLikelihoods(:,iY) = getObservations(run(self.rvs(iY), ds));

Error in prtAction/run (line 250)
dsOut = runAction(self, dsOut);

Error in prtAction/crossValidate (line 369)
outputDataSetCell{uInd} = trainedAction.run(testDs);

Error in prtAction/kfolds (line 553)
[outputs{:}] = self.crossValidate(ds,keys);

Error in prtUtilEvalParseAndRun (line 35)
Results = classifier.kfolds(dataSet,nFolds);

Error in prtEvalPercentCorrect (line 58)
results = prtUtilEvalParseAndRun(classifier,dataSet,nFolds);

Error in @(ds)prtEvalPercentCorrect(prtClassMap,ds)

Error in prtFeatSelSfs/trainAction (line 149)
cPerformance(i) = Obj.evaluationMetric(tempDataSet);

Error in prtAction/train (line 221)
self = trainAction(self, ds);
whenever I try to use more than 2 features in the prtFeatSelSfs function. I don't understand the error, so I can't solve it...
Thanks for your help, Ana
Hello,
This is technically a new issue, so please start a new issue for additional comments. But it sounds like your features are not linearly independent, or you have too few observations for at least one class in your data set.
prtRvMvn is trying to learn a covariance matrix from your data - e.g.,
cov(X(Y == 1,:))
And the result of this needs to be symmetric and positive definite (i.e., invertible), or it's impossible to fit a multivariate normal (Gaussian) random variable to that class...
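To see which class is the problem, something along these lines should narrow it down (a quick diagnostic sketch; X and Y here are stand-ins for your feature matrix and label vector, taken from the data set you feed to the classifier, e.g. dataSetNew from earlier in the thread):
X = dataSetNew.getObservations; % nObservations x nFeatures matrix
Y = dataSetNew.getTargets;      % nObservations x 1 class labels
classLabels = unique(Y);
for iY = 1:length(classLabels)
    inds = (Y == classLabels(iY));
    fprintf('Class %g: %d observations, covariance rank %d of %d\n', ...
        classLabels(iY), sum(inds), rank(cov(X(inds,:))), size(X,2));
end
Any class whose covariance rank is below the number of features (or that has fewer observations than features) will trip up prtRvMvn.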
You might try using a simpler classifier - e.g., KNN, which does not require a full-rank covariance matrix...
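For example, your earlier call with prtClassKnn swapped in for prtClassMap (the 3-fold cross-validation is just an assumption, matching the first example above):
knn = prtClassKnn;
featSel = prtFeatSelSfs('nFeatures',nFeatures_used, ...
    'evaluationMetric',@(ds)prtEvalPercentCorrect(knn,ds,3));
featSel = featSel.train(dataSet);
outDataSet = featSel.run(dataSet);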
Hi,
Is there a way to use feature selection with a data set that has 3 classes (besides prtFeatSelStatic)?
I'm using a data set with 7 features and 3 classes, and I would like my code to choose, out of all 7 features, the ones that work best.
Thanks, Ana