pybrain / pybrain

A ClassificationDataSet becomes a SupervisedDataSet after splitWithProportion(); is this correct? #197

Open Robert-Lu opened 8 years ago

Robert-Lu commented 8 years ago

The "Classification with Feed-Forward Neural Networks" tutorial contains this code:

alldata = ClassificationDataSet(2, 1, nb_classes=3)
...
...
tstdata, trndata = alldata.splitWithProportion( 0.25 )

trndata._convertToOneOfMany( )
tstdata._convertToOneOfMany( )

However, this code raises AttributeError: 'SupervisedDataSet' object has no attribute '_convertToOneOfMany'

I found that the datasets returned by splitWithProportion() are SupervisedDataSet objects, not ClassificationDataSet objects.

My workaround is to convert first and then split.
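The "convert first, then split" order can be sketched without pybrain. This is only an illustration under stated assumptions: `one_hot` and `split_with_proportion` are hypothetical helpers written for this example, not pybrain functions, and the encoding shown is the usual one-of-many (one-hot) scheme that `_convertToOneOfMany` is understood to apply.

```python
import random


def one_hot(label, nb_classes):
    """Return a list with 1 at index `label` and 0 elsewhere (hypothetical helper)."""
    row = [0] * nb_classes
    row[label] = 1
    return row


def split_with_proportion(samples, proportion=0.5, seed=None):
    """Shuffle `samples` and split them; the first part holds `proportion` of them."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    separator = int(len(samples) * proportion)
    left = [samples[i] for i in indices[:separator]]
    right = [samples[i] for i in indices[separator:]]
    return left, right


# Toy data: (input, class label) pairs with 3 classes.
raw = [((float(i), float(i % 2)), i % 3) for i in range(12)]

# Step 1: encode every target while the full dataset is still in one piece.
converted = [(inp, one_hot(label, 3)) for inp, label in raw]

# Step 2: split afterwards; both parts already carry encoded targets,
# so neither part needs a conversion method of its own.
tstdata, trndata = split_with_proportion(converted, 0.25, seed=0)
```

Because the conversion happens before the split, it no longer matters which class the split returns.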

Robert-Lu commented 8 years ago

I read the relevant pybrain code and found that splitWithProportion is inherited from SupervisedDataSet, so it returns two SupervisedDataSet objects.
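The effect described here can be reproduced with two small stand-in classes. `BaseDataSet` and `LabelledDataSet` are hypothetical names invented for this sketch and are not pybrain classes; the only point is that an inherited method which names the base class explicitly hands back base-class objects.

```python
class BaseDataSet:
    """Hypothetical stand-in for a generic supervised dataset."""

    def __init__(self, samples):
        self.samples = list(samples)

    def split(self, proportion=0.5):
        separator = int(len(self.samples) * proportion)
        # The class name is hard-coded, so subclasses get BaseDataSet objects back.
        return (BaseDataSet(self.samples[:separator]),
                BaseDataSet(self.samples[separator:]))


class LabelledDataSet(BaseDataSet):
    """Hypothetical stand-in for a classification dataset with one extra method."""

    def convert(self):
        return [(x, int(y)) for x, y in self.samples]


alldata = LabelledDataSet([(0.1, 0), (0.2, 1), (0.3, 2), (0.4, 0)])
tstdata, trndata = alldata.split(0.25)

# The split parts have lost the subclass, so the subclass-only method is missing.
try:
    trndata.convert()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```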

samorajp commented 8 years ago

Same problem here when trying to complete the tutorial at http://pybrain.org/docs/tutorial/fnn.html.

Quick and dirty solution: paste the following into the ClassificationDataSet class definition in pybrain.datasets.classification.

    def splitWithProportion(self, proportion = 0.5):
        """Produce two new datasets, the first one containing the fraction given
        by `proportion` of the samples."""
        indicies = random.permutation(len(self))
        separator = int(len(self) * proportion)

        leftIndicies = indicies[:separator]
        rightIndicies = indicies[separator:]

        leftDs = ClassificationDataSet(inp=self['input'][leftIndicies].copy(),
                                       target=self['target'][leftIndicies].copy())
        rightDs = ClassificationDataSet(inp=self['input'][rightIndicies].copy(),
                                        target=self['target'][rightIndicies].copy())
        return leftDs, rightDs

01ghost13 commented 7 years ago

    def splitWithProportion(self, proportion = 0.5):
        indicies = random.permutation(len(self))
        separator = int(len(self) * proportion)

        leftIndicies = indicies[:separator]
        rightIndicies = indicies[separator:]

        leftDs = self.__class__(inp=self['input'][leftIndicies].copy(),
                                target=self['target'][leftIndicies].copy())
        rightDs = self.__class__(inp=self['input'][rightIndicies].copy(),
                                 target=self['target'][rightIndicies].copy())
        return leftDs, rightDs

It's better to change the superclass (SupervisedDataSet) like this, so that each subclass gets objects of its own type back.
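A minimal sketch of why building the result from the instance's own class helps. The class names below are hypothetical stand-ins, not pybrain classes, and `type(self)` is used as the equivalent of `self.__class__` for ordinary classes.

```python
class BaseDataSet:
    """Hypothetical stand-in for the superclass that owns the split method."""

    def __init__(self, samples):
        self.samples = list(samples)

    def split(self, proportion=0.5):
        separator = int(len(self.samples) * proportion)
        cls = type(self)  # the caller's own class, not a hard-coded name
        return cls(self.samples[:separator]), cls(self.samples[separator:])


class LabelledDataSet(BaseDataSet):
    """Hypothetical subclass that adds one method of its own."""

    def labels(self):
        return sorted({y for _, y in self.samples})


alldata = LabelledDataSet([(0.1, 0), (0.2, 1), (0.3, 2), (0.4, 0)])
left, right = alldata.split(0.25)

# Both parts are LabelledDataSet objects, so subclass methods remain available.
left_labels = left.labels()
right_labels = right.labels()
```

The pattern relies on every subclass constructor accepting the arguments that the superclass passes; a subclass with extra required constructor arguments would still need its own override.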

ASVorobiev commented 7 years ago

Also, you need to change indicies = random.permutation(len(self)) to indicies = permutation(len(self)), because the head of the file contains from numpy.random import permutation.

This applies to Python 3.5 and PyBrain 0.3.3.

gregvds commented 6 years ago

Same problem here trying to get the classification tutorial to work.

I locally defined this:

# new splitWithProportion method for ClassificationDataSet
from numpy import random
from pybrain.datasets import ClassificationDataSet

def splitWithProportion(dataSet, proportion = 0.5):
    """Produce two new datasets, the first one containing the fraction given
    by `proportion` of the samples."""
    indicies = random.permutation(len(dataSet))
    separator = int(len(dataSet) * proportion)
    nClasses = dataSet.nClasses

    leftIndicies = indicies[:separator]
    rightIndicies = indicies[separator:]

    leftDs = ClassificationDataSet(inp=dataSet['input'][leftIndicies].copy(),
                                   target=dataSet['target'][leftIndicies].copy(), nb_classes = nClasses)
    rightDs = ClassificationDataSet(inp=dataSet['input'][rightIndicies].copy(),
                                    target=dataSet['target'][rightIndicies].copy(), nb_classes = nClasses)
    return leftDs, rightDs

for it to work. I'm not sure whether passing the nb_classes argument extracted from the dataset is useful. If not, 01ghost13's version is the better solution, imho.
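On whether carrying the class count over is useful, here is a generic sketch of the risk it guards against. It does not use pybrain, and whether pybrain actually re-derives the class count from the data in a split is an assumption not verified here; `one_hot` and `encode` are hypothetical helpers.

```python
def one_hot(label, nb_classes):
    """Return a list with 1 at index `label` and 0 elsewhere (hypothetical helper)."""
    row = [0] * nb_classes
    row[label] = 1
    return row


def encode(labels, nb_classes=None):
    """One-hot encode labels; infer the class count from the data if it is not given."""
    if nb_classes is None:
        # Inference undercounts when the highest class is absent from `labels`.
        nb_classes = max(labels) + 1
    return [one_hot(label, nb_classes) for label in labels]


all_labels = [0, 1, 2, 0, 1, 2]
nb_classes = len(set(all_labels))  # class count of the full dataset

small_split = [0, 1, 0]  # a split that happens to contain no sample of class 2

inferred = encode(small_split)                          # width guessed from the split
explicit = encode(small_split, nb_classes=nb_classes)   # width carried over
```

If a split can miss a class entirely, passing the count explicitly keeps the encoded targets of both parts the same width; with large, well-mixed splits the difference is unlikely to show up.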