Here is an example for the cross_val_score function. It requires all X and y samples; you don't have to split them yourself, the function will do the splitting for you with the data that you provide:
from sklearn import datasets, linear_model
from sklearn.model_selection import cross_val_score
diabetes = datasets.load_diabetes()
X = diabetes.data[:150]
y = diabetes.target[:150]
lasso = linear_model.Lasso()
print(cross_val_score(lasso, X, y))
From http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_score.html
Also note that you have to pass both X and y, not only y.
Also, it would be better to use KFold directly: cross_val_score will re-train the same network without resetting its weights between folds.
import numpy as np
from sklearn.model_selection import KFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([1, 2, 3, 4])
kf = KFold(n_splits=2)
kf.get_n_splits(X)
print(kf)
for train_index, test_index in kf.split(X):
    # Initialize your network here
    print("TRAIN:", train_index, "TEST:", test_index)
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    # Train and validate your network here (see the sketch below)
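For a NeuPy network, "initialize" here means constructing a fresh network object inside the loop, so that every fold starts from newly initialized weights. A minimal sketch of a complete fold loop, assuming an iris-style Input(4) > Relu(5) > Softmax(3) model like the one you described; build_network is a hypothetical helper, not part of NeuPy:

import numpy as np
from sklearn import preprocessing
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import OneHotEncoder
from neupy import algorithms, layers

# prepare the iris data: normalized inputs, one-hot targets
iris = load_iris()
data = preprocessing.normalize(iris.data).astype(np.float32)
target = OneHotEncoder().fit_transform(iris.target.reshape(-1, 1)).todense().astype(np.float32)

def build_network():
    # hypothetical helper that returns a network with fresh random weights
    return algorithms.Momentum(
        [layers.Input(4), layers.Relu(5), layers.Softmax(3)],
        error='categorical_crossentropy',
        step=0.01,
        momentum=0.99,
        nesterov=True,
        verbose=False,
    )

scores = []
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_index, test_index in kf.split(data):
    x_tr, x_te = data[train_index], data[test_index]
    y_tr, y_te = target[train_index], target[test_index]

    network = build_network()  # re-initialize so folds don't share weights
    network.train(x_tr, y_tr, x_te, y_te, epochs=100)

    y_pred = network.predict(x_te).argmax(axis=1)
    y_true = np.asarray(y_te.argmax(axis=1)).ravel()
    scores.append(accuracy_score(y_true, y_pred))

print("CV accuracy: %.2f%% (+/- %.2f%%)" % (np.mean(scores) * 100, np.std(scores) * 100))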
First of all, thank you for your reply.
May I ask whether it is best to split the dataset into training, validation, and testing sets, rather than only training and validation? If I would like to print the confusion matrix as well, do I still need to keep the test set?
Secondly, after I apply cross_val_score it still does not work. The error shows as:
TypeError: If no scoring is specified, the estimator passed should have a 'score' method. The estimator Momentum(Input(4) > Relu(5) > Softmax(3), batch_size=128, error=categorical_crossentropy, shuffle_data=True, train_end_signal=None, momentum=0.99, step=0.01, verbose=True, epoch_end_signal=None, nesterov=True, addons=None, show_epoch=1) does not.
My code:
from sklearn import preprocessing
from sklearn.datasets import load_iris
from sklearn.preprocessing import OneHotEncoder
from neupy import environment
import numpy as np
from sklearn.model_selection import train_test_split
import theano
from neupy import algorithms, layers
environment.reproducible()
# # load the iris data
iris = load_iris()
print(iris.data.shape)
# # set data and target
data, target = iris.data, iris.target
# print("data",data)
# print("target",target)
# # one hot the target
target_scaler = OneHotEncoder()
target = target_scaler.fit_transform(target.reshape((-1, 1)))
target = target.todense()
# print("target after one hot", target)
# # normalize the input data
data = preprocessing.normalize(data)
# print("normalized_data: ",normalized_data)
# # split the dataset into x_train, x_test = data, y_train, y_test = target
x_train, x_test, y_train, y_test = train_test_split(
    data.astype(np.float32), target.astype(np.float32), train_size=0.8
)
print("x_train",x_train.shape)
print("x_test",x_test.shape)
print("y_train",y_train.shape)
print("y_test",y_test.shape)
# Theano is the main backend for the gradient descent based algorithms in NeuPy.
theano.config.floatX = 'float32'
# # create new transfer function
# class Squared(layers.ActivationLayer):
#     def activation_function(self, input_value):
#         return 1/(1 + np.exp(-input_value))
# create new transfer function (mathematically equivalent to tanh)
class tansig(layers.ActivationLayer):
    def activation_function(self, input_value):
        return 2/(1 + np.exp(-2*input_value)) - 1
# start the model architecture
network = algorithms.Momentum(
    [
        layers.Input(4),
        layers.Relu(5),  # Relu
        # Squared(300),
        # layers.Relu(300),  # Relu
        layers.Softmax(3),  # Softmax
        # layers.Input(784),
        # tansig(500),
        # tansig(500),
    ],
    error='categorical_crossentropy',
    step=0.01,
    verbose=True,
    shuffle_data=True,
    momentum=0.99,
    nesterov=True,
)
# print the architecture (Input shape, Layer Type, Output shape)
network.architecture()
# # # K-Fold # # #
# from sklearn.model_selection import KFold
# kf = KFold(n_splits=2)
# kf.get_n_splits(x_train)
# for train_index, test_index in kf.split(x_train):
#     # Initialize your network here
#     print("TRAIN:", train_index, "TEST:", test_index)
#     X_train, X_test = x_train[train_index], x_train[test_index]
#     Y_train, Y_test = y_train[train_index], y_train[test_index]
#     # train the network on the current fold
#     network.train(X_train, Y_train, X_test, Y_test, epochs=300)
# # # cross-validation # # #
from sklearn.model_selection import KFold, cross_val_score
# kfold = KFold(n_splits=10, shuffle=True, random_state=7)
results = cross_val_score(network, data, target)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
# check the accuracy
from sklearn import metrics
y_predicted = network.predict(x_test).argmax(axis=1)
y_test = np.asarray(y_test.argmax(axis=1)).reshape(len(y_test))
print("y_predicted",y_predicted)
print("y_test",y_test)
print(metrics.classification_report(y_test, y_predicted))
score = metrics.accuracy_score(y_test, y_predicted)
print("Validation accuracy: {:.2%}".format(score))
# plot the image
from neupy import plots
plots.error_plot(network)
There is another question about my model. For this model I do not apply cross-validation. I got 100% accuracy, but the Error/Epoch plot shows that my validation error is lower than my training error. Is that overfitting, or is this model fine?
My code:
from sklearn import preprocessing
from sklearn.datasets import load_iris
from sklearn.preprocessing import OneHotEncoder
from neupy import environment
import numpy as np
from sklearn.model_selection import train_test_split
import theano
from neupy import algorithms, layers
environment.reproducible()
# # load the iris data
iris = load_iris()
print(iris.data.shape)
# # set data and target
data, target = iris.data, iris.target
# print("data",data)
# print("target",target)
# # one hot the target
target_scaler = OneHotEncoder()
target = target_scaler.fit_transform(target.reshape((-1, 1)))
target = target.todense()
# print("target after one hot", target)
# # normalize the input data
data = preprocessing.normalize(data)
# print("normalized_data: ",normalized_data)
# # split the dataset into x_train, x_test = data, y_train, y_test = target
x_train, x_test, y_train, y_test = train_test_split(
    data.astype(np.float32), target.astype(np.float32), train_size=0.8
)
# Theano is the main backend for the gradient descent based algorithms in NeuPy.
theano.config.floatX = 'float32'
# start the model architecture
network = algorithms.Momentum(
    [
        layers.Input(4),
        layers.Relu(5),  # Relu
        # Squared(300),
        # layers.Relu(300),  # Relu
        layers.Softmax(3),  # Softmax
        # layers.Input(784),
        # tansig(500),
        # tansig(500),
    ],
    error='categorical_crossentropy',
    step=0.01,
    verbose=True,
    shuffle_data=True,
    momentum=0.99,
    nesterov=True,
)
# print the architecture (Input shape, Layer Type, Output shape)
network.architecture()
network.train(x_train, y_train, x_test, y_test, epochs=250)
# check the accuracy
from sklearn import metrics
y_predicted = network.predict(x_test).argmax(axis=1)
y_test = np.asarray(y_test.argmax(axis=1)).reshape(len(y_test))
print("y_predicted",y_predicted)
print("y_test",y_test)
print(metrics.classification_report(y_test, y_predicted))
score = metrics.accuracy_score(y_test, y_predicted)
print("Validation accuracy: {:.2%}".format(score))
# plot the image
from neupy import plots
plots.error_plot(network)
# confusion matrix
from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test, y_predicted))
TypeError: If no scoring is specified, the estimator passed should have a 'score' method.
I hadn't noticed it before, but the error message is pretty clear. Because you didn't specify the scoring argument, scikit-learn tries to get the estimator's score method, cannot find it, and fails.
From the documentation: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_score.html

scoring : string, callable or None, optional, default: None
    A string (see model evaluation documentation) or a scorer callable
    object / function with signature scorer(estimator, X, y).
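Given that signature, one fix is to pass such a callable yourself. A minimal sketch, assuming network, data, and target are the objects from your script; the scorer function below is illustrative, not part of NeuPy or scikit-learn:

import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score

def scorer(estimator, X, y):
    # cross_val_score calls this with the signature scorer(estimator, X, y);
    # convert softmax outputs and one-hot targets back to class labels
    y_pred = estimator.predict(X).argmax(axis=1)
    y_true = np.asarray(y.argmax(axis=1)).ravel()
    return accuracy_score(y_true, y_pred)

results = cross_val_score(network, data, target, scoring=scorer)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean() * 100, results.std() * 100))

That silences the TypeError, but the earlier caveat still applies: cross_val_score re-uses the same network without resetting its weights between folds, so the explicit KFold loop remains the more reliable option.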
I got 100% accuracy, but the Error/Epoch plot shows that my validation error is lower than my training error. Is that overfitting, or is this model fine?
The plot looks OK; it doesn't look like overfitting. It could be that your problem is simple and you can easily achieve 100% accuracy. Another explanation is that you have duplicates or near-duplicates that appear in both the train and the test split. In that case the validation set could be just a replica of the training set, and it won't properly reflect whether the network overfitted or not. If that is what's happening, you have to do the split yourself in order to avoid the issue; see the sketch below.
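A minimal sketch of such a duplicate-aware split, assuming data and target are the preprocessed arrays from your script; deduplicating with np.unique is just one illustrative approach:

import numpy as np
from sklearn.model_selection import train_test_split

# keep one copy of each distinct input row, so that identical samples
# cannot end up on both sides of the split
# (np.unique with axis=0 needs NumPy >= 1.13)
_, unique_idx = np.unique(data, axis=0, return_index=True)
unique_idx = np.sort(unique_idx)

x_train, x_test, y_train, y_test = train_test_split(
    data[unique_idx].astype(np.float32),
    target[unique_idx].astype(np.float32),
    train_size=0.8,
)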
May I ask how to do cross-validation in NeuPy? I have checked the sklearn website and I still feel confused.
After applying cross_val_score, the error shows as:
TypeError: If no scoring is specified, the estimator passed should have a 'score' method. The estimator Momentum(Input(4) > Relu(5) > Softmax(3), nesterov=True, addons=None, verbose=True, batch_size=128, error=categorical_crossentropy, momentum=0.99, epoch_end_signal=None, train_end_signal=None, show_epoch=1, step=0.01, shuffle_data=True) does not.