jorgenkg / python-neural-network

This is an efficient implementation of a fully connected neural network in NumPy. The network can be trained by a variety of learning algorithms: backpropagation, resilient backpropagation, and scaled conjugate gradient learning. The network has been developed with PyPy in mind.
BSD 2-Clause "Simplified" License

unable to save a model #9

Closed omerarshad closed 8 years ago

omerarshad commented 8 years ago

When you try to save the trained weights, it gives an error because some parameters are not initialized.
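As a workaround sketch (not part of the library: the method names are hypothetical, inferred from the rest of this thread, and may differ in your version), the trained weight vector can be pickled directly instead of the whole network object:

import pickle

# Save only the trained weight vector, not the full network object
with open( "weights.pkl", "wb" ) as fh:
    pickle.dump( network.get_weights(), fh )

# Later: rebuild the network with the same settings, then restore the weights
with open( "weights.pkl", "rb" ) as fh:
    network.set_weights( pickle.load( fh ) )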

jorgenkg commented 8 years ago

No, I never added it to the code base. I just posted a snippet for you in this chat. This code calculates the same result, updated to the newest version. Just add the method to the NeuralNet class.

def first_layer_deltas(self, trainingset, cost_function ):
    assert softmax_function != self.layers[-1][1] or cost_function == softmax_neg_loss,\
        "When using the `softmax` activation function, the cost function MUST be `softmax_neg_loss`."
    assert cost_function != softmax_neg_loss or softmax_function == self.layers[-1][1],\
        "When using the `softmax_neg_loss` cost function, the activation function in the final layer MUST be `softmax`."

    training_data           = np.array( [instance.features for instance in trainingset ] )
    training_targets        = np.array( [instance.targets  for instance in trainingset ] )

    input_signals, derivatives  = self.update( training_data, trace=True )                  
    out                         = input_signals[-1]
    cost_derivative             = cost_function(out, training_targets, derivative=True).T
    delta                       = cost_derivative * derivatives[-1]

    layer_indexes               = range( len(self.layers) )[::-1]    # reversed

    for i in layer_indexes:
        # Backpropagate the delta through this layer's weights, skipping the bias row
        weight_delta        = np.dot( self.weights[ i ][1:,:], delta )

        # Calculate the delta for the preceding layer; the input layer (i == 0)
        # has no activation derivative, so the delta is left unchanged there
        delta               = weight_delta * (1 if i == 0 else derivatives[i-1])
    #end backpropagation loop

    return delta
# end first_layer_deltas

How to:

dataset             = [ Instance( [0,0], [0] ), Instance( [1,0], [1] ), Instance( [0,1], [1] ), Instance( [1,1], [0] ) ]
network.first_layer_deltas( dataset, cost_function ) # return the deltas
omerarshad commented 8 years ago

Hi!

I have a question regarding the SciPy optimizer: for 10 iterations it calls the function to be optimized many times. What does this mean, and why does it call the function more than 10 times?


jorgenkg commented 8 years ago

I'm interpreting from your question that it is minimize() that calls the function repeatedly?

I've never worked with the SciPy code base, and unfortunately I have no idea how it operates.

omerarshad commented 8 years ago

Yes, exactly. minimize() keeps calling the objective function many times.
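For reference, a minimal sketch (using only the public scipy.optimize.minimize interface, independent of this library) that counts objective calls against iterations; the call count exceeds the iteration count because each iteration may need several evaluations for line searches and finite-difference gradient estimates:

import numpy as np
from scipy.optimize import minimize

calls = {"n": 0}

def objective( x ):
    calls["n"] += 1
    return np.sum( (x - 3)**2 )

result = minimize( objective, x0=np.zeros(5), method="CG", options={"maxiter": 10} )

# result.nit is the iteration count, result.nfev the number of objective
# evaluations; nfev is typically much larger than nit.
print( result.nit, result.nfev, calls["n"] )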


jorgenkg commented 8 years ago

Maybe StackOverflow can help you out?

omerarshad commented 8 years ago

I am implementing SGD, where I need to declare a mini-batch, but if I use SciPy I can't control the number of epochs. Anyways, thanks a lot for your help :)


omerarshad commented 8 years ago

Hi again! If I call network.get_Weights and network.gradient, do I get arrays weight[...] and gradient[...] such that gradient[i] corresponds to weight[i]? And can I update the weights manually using the respective gradients?
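A sketch of the kind of manual update being asked about (hypothetical call signature; the gradient vector is assumed to be a flat numpy array using the same ordering as the weight vector, and the exact method names may differ in your version):

# Hypothetical manual gradient-descent step; `dataset` is a list of Instance
# objects, as in the earlier example. Treat this as a sketch only.
learning_rate = 0.1

weights  = network.get_weights()                               # flat parameter vector
gradient = network.gradient( weights, dataset, cost_function ) # assumed to share the weight ordering

network.set_weights( weights - learning_rate * gradient )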


omerarshad commented 8 years ago

I'm getting this error in the conjugate gradient method:

"nn/learning_algorithms.py:317: RuntimeWarning: invalid value encountered in double_scalars comparison = 2 * delta * (f_old - f_new)/np.power( phi, 2 )"


omerarshad commented 8 years ago

How do I use your backpropagation as SGD?


jorgenkg commented 8 years ago

Hello @omerarshad!

I'm sorry for taking so long to respond lately, but I'm down to my last four weeks writing my thesis.

SGD is implemented in the dev branch :)

omerarshad commented 8 years ago

So I can trust it? And best of luck with your thesis :)


jorgenkg commented 8 years ago

Hey, thanks :) Just drop me a message if something breaks!

omerarshad commented 8 years ago

Hi!

When I call the network.gradient() method, it returns the first-order derivatives. What changes do I need to make so that it returns second-order derivatives?

Thanks, Omer


jorgenkg commented 8 years ago

I'm not sure. I've never worked with second-order (Hessian) learning in neural networks. I think the Hessian is often approximated rather than analytically calculated.

omerarshad commented 8 years ago

So is it possible to convert the gradient provided by your network into a Hessian matrix?


jorgenkg commented 8 years ago

The code returns the Jacobian (first-order derivatives). However, this paper by Yu and Wilamowski (http://www.eng.auburn.edu/%7Ewilambm/pap/2011/Neural%20Network%20Training%20with%20Second%20Order%20Algorithms.PDF) seems to approximate the Hessian using the first-order derivatives. You might have some luck reading their work.

Yu, H., and B. M. Wilamowski. "Neural network training with second order algorithms." Human–Computer Systems Interaction: Backgrounds and Applications 2. Springer Berlin Heidelberg, 2012. 463-476.
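For reference, the standard way first-order derivatives are turned into a Hessian in that line of work is the Gauss-Newton approximation for a sum-of-squares cost; a minimal numpy sketch, where J is assumed to hold the per-sample first-order derivatives (Jacobian), one row per sample and one column per weight:

import numpy as np

J = np.random.randn( 4, 10 )      # placeholder Jacobian: 4 samples x 10 weights

H_approx = np.dot( J.T, J )       # Gauss-Newton approximation of the Hessian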

omerarshad commented 8 years ago

Hi Jørgen! I hope the thesis is going well. How do I calculate the gradient for the input layer in your code?

Omer.


jorgenkg commented 8 years ago

Thanks, the thesis was just submitted!

The code for calculating the deltas has been posted a few times in this comment thread; however, it requires a slight modification:

def network_deltas( network, trainingset, cost_function ):
    import numpy as np
    training_data              = np.array( [instance.features for instance in trainingset ] )
    training_targets           = np.array( [instance.targets  for instance in trainingset ] )
    layer_indexes              = range( len(network.layers) )[::-1]    # reversed

    input_signals, derivatives = network.update( training_data, trace=True )
    out                        = input_signals[-1]
    cost_derivative            = cost_function(out, training_targets, derivative=True).T
    delta                      = cost_derivative * derivatives[-1]

    deltas                     = [ delta ]
    for i in layer_indexes:
        delta = np.dot( network.weights[ i ][1:,:], delta ) * (1 if i == 0 else derivatives[i-1])
        deltas.append( delta )
    #end

    return deltas[::-1] # the deltas of [ layer1, layer2, .., layerN ]
#end

# Print a list of the deltas of each layer; `training_data` is the list of Instance objects
print( network_deltas( network, training_data, cost_function ) )
omerarshad commented 8 years ago

Hey!

I hope you are doing well. How should I cite your implementation in my thesis? I am done with the experiments and now want to cite your work in my thesis document.

Thanks, Omer


jorgenkg commented 8 years ago

Hello again,

Congratulations! I personally feel no need for you to cite my work, but if you're interested in extending your reference list, you should cite it as a webpage pointing to the project website.

Best regards

omerarshad commented 8 years ago

Hello! Do you plan to implement an adaptive learning rate technique?


jorgenkg commented 8 years ago

Hi,

It's implemented (but not documented) in the 'dev' branch. Take a look at the file named example.py in the top directory :)

I'm in the process of thoroughly writing documentation for readthedocs, but I'm not done yet.

The various algorithms are listed in: learning_algorithms/back propagation/variations

omerarshad commented 8 years ago

Can you demonstrate how to use Adam? There is no documentation for it. I'll be very thankful to you.


jorgenkg commented 8 years ago
from nimblenet.activation_functions import sigmoid_function
from nimblenet.cost_functions import cross_entropy_cost
from nimblenet.learning_algorithms import *
from nimblenet.neuralnet import NeuralNet
from nimblenet.data_structures import Instance
from nimblenet.tools import print_test

training_data = test_data = [ Instance( [0,0], [0] ), Instance( [1,0], [1] ), Instance( [0,1], [1] ), Instance( [1,1], [1] ) ]
cost_function       = cross_entropy_cost
settings            = {
    "n_inputs"              : 2,       # Number of network input signals
    "layers"                : [  (3, sigmoid_function), (1, sigmoid_function) ],
}

# initialize the neural network
network             = NeuralNet( settings )

# Train the network using the Adam optimizer (a backpropagation variant)
Adam(
        network,                            # the network to train
        training_data,                      # specify the training set
        test_data,                          # specify the test set
        cost_function,                      # specify the cost function to calculate error

        ERROR_LIMIT             = 1e-2,     # define an acceptable error limit 
        #max_iterations         = 100,      # continues until the error limit is reached if this argument is skipped

        batch_size              = 0,        # 1 := no batch learning, 0 := entire trainingset as a batch, anything else := batch size
        print_rate              = 1000,     # print error status every `print_rate` epoch.
        save_trained_network    = False,     # Whether to write the trained weights to disk

        # Adam specific parameters
        beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8
    )

# Print a network test
print_test( network, training_data, cost_function )
jorgenkg commented 8 years ago

I've just published more complete documentation for the code on Read the Docs, in case the code example didn't help you out.