Open peterTorrione opened 11 years ago
The following works as expected: yOut = kfolds(prtClassPlsda,ds,3); yOut = yOut.catFeatures(prtDataSetClass(randn(size(yOut.X)))); yOut.featureNames
I think I can make the sneakier yOut.X command work without errors, but there seems to be a deeper problem with it. getFeatureNames would be ignorant of the difference between yOut.X = cat(2,yOut.X,randn(size(yOut.X))); and yOut.X = cat(2,randn(size(yOut.X)),yOut.X);
Seems like the best behavior here would be for setData to wipe out any existing feature names. Maybe we should make a resetFeatureNames method?
I'm not sure what the status of this is; we might need to re-factor feature names?
I think the best solution would be to have setData erase any existing feature names (maybe only if the number of features changes). Does this sound reasonable?
Erasing feature names seems like a cop out to me. I think one of our issues is that we haven't fully laid out tests and performance goals. A re-factor might be necessary.
Agree. This bug (and the other EXPLORE problem) seem to point towards the need for a re-vamped unit-test framework...
The desired behavior of feature name and observation names are actually quite subtle when you start thinking about all the different ways you can modify data sets. I'm not sure what the long-term solution is. And I'm not even sure what the short-term solution is for this particular bug...
The problem is that any time someone sets the data directly, we have no way of knowing how they've scrambled the features or observations. Short of comparing the old data to the new data and doing some sort of inference, I don't see how we can do anything more intelligent than throwing a warning to say "reset your feature and observation names."
We definitely don't want to try and do inference...
But I think it's even more subtle than that. Even catFeatures or retainFeatures can act in ways that might confuse people:
ds1 = prtDataGenUnimodal; ds2 = ds1.retainFeatures(2); fprintf('%s = %s?\n',ds1.featureNames{2},ds2.featureNames{1})
Feature 2 = Feature 1?
Now, I know why we do that, but that decision was made from an implementation point of view; not from a user experience point of view. It's very tricky to get everything to work right.
Code: yOut = kfolds(prtClassPlsda,ds,3); yOut.X = cat(2,yOut.X,randn(size(yOut.X))); yOut.featureNames
Index exceeds matrix dimensions.
Error in prtDataSetStandard/getFeatureNames (line 209) if ~isempty(obj.featureNameModificationFunction) && obj.featureNameModificationMask(iFeat)
Error in prtDataSetStandard/get.featureNames (line 58) fn = self.getFeatureNames;