mahehu / decmeg

2nd place submission to the MEG decoding competition https://www.kaggle.com/c/decoding-the-human-brain

IndexError: tuple index out of range #1

Open sarwatfatimam opened 8 years ago

sarwatfatimam commented 8 years ago

Hi, I am trying to use LinearSVC + random forest in your code. My data has 578 trials, 70 channels, and 11 features, but the code throws the following error. Can you tell me what the problem is?

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Users\Sarwat\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
    execfile(filename, namespace)
  File "C:\Users\Sarwat\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
  File "C:/Users/Sarwat/Desktop/EEG/EEG.py", line 352, in
    estimateCvScore = estimateCvScore,
  File "C:/Users/Sarwat/Desktop/EEG/EEG.py", line 274, in run
    clf.fit(X[trainIdx, :, :,np.newaxis], y[trainIdx], X[testIdx,:,:])
  File "C:\Users\Sarwat\Desktop\EEG\IterativeTrainer.py", line 141, in fit
    self.clf.fit(X_aug, y_aug)
  File "C:\Users\Sarwat\Desktop\EEG\LrCollection.py", line 147, in fit
    for col in range(X.shape[1]):
IndexError: tuple index out of range

mahehu commented 8 years ago

It looks like your X_aug is an empty matrix. Maybe there is some problem in reading your data?

Heikki
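
For reference, here is a minimal sketch of why an empty X_aug would produce exactly this message. It assumes that no test samples were selected, so the list being converted is empty; the variable names follow the snippet quoted later in the thread, and nothing here is taken from the repository itself.

```python
import numpy as np

X_new = []                 # assumption: no test samples passed the threshold
X_aug = np.array(X_new)    # an empty list becomes a 1-D array of length 0
print(X_aug.shape)         # (0,)

try:
    # The shape tuple has a single entry, so index 1 does not exist.
    n_cols = X_aug.shape[1]
except IndexError as err:
    message = str(err)
    print("IndexError:", message)
```

An array built from an empty list has shape (0,), so any code that reads shape[1] fails before it ever looks at the data.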

sarwatfatimam commented 8 years ago

I don't think the data is the problem: if I change LinearSVC to logistic regression, or even to SVC with a non-linear kernel, the same dataset gives results without any error. The error occurs with LinearSVC only.

sarwatfatimam commented 8 years ago

Could the problem be with relabelWeight or relabelThr in the code below? Maybe that is why it's giving an empty matrix? How can I select an optimal value?

Also, if substitute = False and iter = 1, will it skip the if branch in the code below and go straight to the else branch? I am a little confused by this statement: "substitute: If True, original training samples are discarded on second training iteration. Otherwise, test samples are appended to training data."

for idx in range(X_test.shape[0]):
    confidence = np.abs(p[idx] - 0.5)
    # w is relabelWeight for confident samples and 0 otherwise.
    w = int(self.relabelWeight * (confidence > self.relabelThr))

    X_new += [X_test[idx, ...]] * w
    y_new += [np.round(p[idx])] * w

if self.substitute:
    # The training set is completely substituted with the test samples.
    X_aug = np.array(X_new)
    y_aug = np.array(y_new)
else:
    # The training set is augmented with the found test samples.
    X_aug = np.concatenate([X, X_new])
    y_aug = np.concatenate([y, y_new])

# Train a model with the augmented training set.
self.clf.fit(X_aug, y_aug)

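
To make the effect of relabelThr concrete, the selection loop above can be sketched as a standalone function. This is only an illustration: the function name select_confident, the probabilities in p, and both threshold values are hypothetical, and the array shapes are chosen just to match the 70 channels and 11 features mentioned earlier.

```python
import numpy as np

def select_confident(X_test, p, relabel_weight, relabel_thr):
    """Repeat each test sample w times, where w is relabel_weight if the
    prediction is confident enough and 0 otherwise (same rule as the loop above)."""
    X_new, y_new = [], []
    for idx in range(X_test.shape[0]):
        confidence = np.abs(p[idx] - 0.5)
        w = int(relabel_weight * (confidence > relabel_thr))
        X_new += [X_test[idx, ...]] * w
        y_new += [np.round(p[idx])] * w
    return np.array(X_new), np.array(y_new)

X_test = np.zeros((4, 70, 11))            # hypothetical: 4 test trials
p = np.array([0.45, 0.55, 0.48, 0.95])    # hypothetical predicted probabilities

# A moderate threshold keeps only the one confident sample, repeated twice.
X_a, y_a = select_confident(X_test, p, relabel_weight=2, relabel_thr=0.1)

# A threshold above every confidence value selects nothing at all.
X_b, y_b = select_confident(X_test, p, relabel_weight=2, relabel_thr=0.49)

print(X_a.shape, X_b.shape)
```

When no sample exceeds the threshold, the result is an empty array. In the substitute branch that empty array is passed straight to fit, so printing the shape of X_aug before the call is a quick way to check whether the threshold is the cause.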
mahehu commented 8 years ago

LinearSVC does not support multiclass, while the others do. Try replacing LinearSVC with sklearn.multiclass.OneVsRestClassifier(LinearSVC()). If that helps, this is the reason.
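
A minimal sketch of the suggested substitution, using synthetic stand-in data rather than the MEG/EEG features from the thread (the data, the class count, and the random seed are all assumptions). Two caveats apply: in scikit-learn, LinearSVC already handles multiclass labels with a one-vs-rest scheme by default, so the wrapper may not change the outcome; and LinearSVC has no predict_proba method, unlike logistic regression, which may matter if p in the loop above is meant to be a probability. The thread does not show how p is computed, so treat this as an experiment to run, not a confirmed diagnosis.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)

# Hypothetical stand-in data: 60 samples, 5 features, 3 classes,
# each class shifted along its own feature axis.
y = np.repeat([0, 1, 2], 20)
X = 0.5 * rng.randn(60, 5) + 5.0 * np.eye(3, 5)[y]

# The wrapper suggested in the comment above.
clf = OneVsRestClassifier(LinearSVC())
clf.fit(X, y)
pred = clf.predict(X)

print(clf.classes_)
print(pred.shape)

# LinearSVC exposes decision_function but not predict_proba.
print(hasattr(LinearSVC(), "predict_proba"))
```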