lmjohns3 / theanets

Neural network toolkit for Python
http://theanets.rtfd.org
MIT License

How do I get predictions from my Classifier network #7

Closed jurcicek closed 11 years ago

jurcicek commented 11 years ago

Hi, I built a network for learning the XOR function using the following example. I just don't know how to get results/predictions from the network. Can you please help me?

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import cPickle
import gzip
import logging
import lmj.cli
import matplotlib.pyplot as plt
import numpy as np
import os
import tempfile
import theano
import theanets

lmj.cli.enable_default_logging()

# The four XOR input patterns.
X = np.array([
               [0.0, 0.0],
               [0.0, 1.0],
               [1.0, 0.0],
               [1.0, 1.0],
             ])

# The XOR target class (0 or 1) for each input row.
Y = np.array([0, 1, 1, 0])

print X.shape
print Y.shape

train = [X, Y.astype('int32')]

e = theanets.Experiment(theanets.Classifier,
                        layers=(2, 5, 2),
                        activation='tanh',
#                       learning_rate=.005,
#                       learning_rate_decay=.1,
#                       patience=20,
                        optimize="sgd",
                        num_updates=10,
#                       tied_weights=True,
#                       batch_size=32,
                        )
e.run(train, train)

print e.network(X)
kastnerkyle commented 11 years ago

When you print e.network(X), the output is the network's prediction, i.e. predicted = e.network(X).

You may also want to build a separate test data set and train with e.train(train, test).

What does e.network(X) look like?
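A minimal sketch of turning that output into class labels, assuming the Classifier returns one row of per-class scores per input row (as the shapes in this thread suggest):

import numpy as np

scores = e.network(X)               # one row of per-class outputs per sample; (4, 2) here
labels = np.argmax(scores, axis=1)  # pick the higher-scoring class for each row
print labels                        # ideally [0 1 1 0] once the network has learned XOR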


jurcicek commented 11 years ago

Hi Kyle,

this is the output:

(4, 2)
(4,)
I 2013-08-31 21:28:12 MainProcess theanets.main:43 runtime arguments:
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --activation = tanh
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --batch_size = 64
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --cg_batches = None
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --decode = 1
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --global_backtracking = False
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --hidden_dropouts = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --hidden_l1 = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --hidden_l2 = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --hidden_noise = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --initial_lambda = 1.0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --input_dropouts = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --input_noise = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --layers = (2, 5, 2)
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --learning_rate = 0.01
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --learning_rate_decay = 0.25
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --min_improvement = 0.01
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --momentum = 0.0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --no_learn_biases = False
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --num_updates = 10
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --optimize = sgd
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --patience = 15
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --pool_damping = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --pool_dropouts = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --pool_error_start = 3
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --pool_noise = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --preconditioner = False
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --save_progress = None
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --tied_weights = False
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --train_batches = None
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --valid_batches = None
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --validate = 3
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --weight_l1 = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --weight_l2 = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:85 activation: tanh
I 2013-08-31 21:28:12 MainProcess theanets.feedforward:168 weights for layer 0: 2 x 5
I 2013-08-31 21:28:12 MainProcess theanets.feedforward:168 weights for layer out_0: 5 x 2
I 2013-08-31 21:28:12 MainProcess theanets.feedforward:139 27 total network parameters
I 2013-08-31 21:28:12 MainProcess theanets.trainer:102 SGD: 4 named parameters to learn
I 2013-08-31 21:28:14 MainProcess theanets.dataset:89 data train: 1 mini-batches of (4, 2), (4,)
I 2013-08-31 21:28:14 MainProcess theanets.dataset:89 data cg: 1 mini-batches of (4, 2), (4,)
I 2013-08-31 21:28:14 MainProcess theanets.dataset:89 data valid: 1 mini-batches of (4, 2), (4,)
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 1/10 @1.00e-02 train [ 0.68985299 0.25 0.25 ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 2/10 @1.00e-02 train [ 0.68985299 0.25 0.25 ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 3/10 @1.00e-02 train [ 0.68983453 0.25 0. ] valid [ 0.68981614 0.25 0. ] *
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 4/10 @1.00e-02 train [ 0.68981614 0.25 0. ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 5/10 @1.00e-02 train [ 0.68979792 0.25 0. ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 6/10 @1.00e-02 train [ 0.68977986 0.25 0. ] valid [ 0.68976196 0.25 0. ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 7/10 @7.50e-03 train [ 0.68976196 0.25 0. ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 8/10 @7.50e-03 train [ 0.68974421 0.25 0. ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 9/10 @7.50e-03 train [ 0.689731 0.25 0. ] valid [ 0.68971789 0.25 0. ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 10/10 @5.62e-03 train [ 0.68971789 0.25 0. ]

print e.network(X)

[[ 0.0005597  -0.00049876]
 [ 0.00051545  0.05267641]
 [-0.1259047   0.03582318]
 [-0.09417908  0.07938591]]

This does not look like the output of a network whose final layer uses a softmax activation: some entries are negative and the rows do not sum to one.

Any ideas?

Best regards, Filip Jurcicek
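A quick way to verify the suspicion above: softmax rows are non-negative and sum to one, and the matrix printed here fails both tests. A minimal sketch, assuming e.network(X) returns the raw matrix shown above:

import numpy as np

out = e.network(X)
print (out >= 0).all()                   # False here: some entries are negative
print np.allclose(out.sum(axis=1), 1.0)  # False here: rows do not sum to one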



lmjohns3 commented 11 years ago

Hi Filip -

Thanks for your bug report! It looks like the Classifier class did have an issue with its output values. I think I fixed it today in a change to the feedforward.py module (see 05575422a2074f65738d73092a820edc91d14f0a for the diff) -- the issue was that the feed_forward method (which passes an input forward through the network to the output) wasn't being compiled with the softmax output.

I also made predict() a synonym for the existing __call__ method on networks, so you can write:

experiment.network.predict(test_data)

to get network predictions. Hopefully that reads better on the consumer side.
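A minimal usage sketch of the two equivalent calls, using the Experiment from earlier in this thread (predict() is described above as a synonym for __call__, so both should return the same array):

probs_a = e.network(X)          # the existing __call__ interface
probs_b = e.network.predict(X)  # the new, more readable synonym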

These changes only affect the output at the end of training -- I don't think anything about the training itself will change. It's interesting that the network has such a hard time learning the XOR function; that's something I hadn't thought about in a while. My initial guess is that you'll need a lot more training data to get accurate results, but I'll look into it a little and report back.
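One hypothetical way to act on the "more training data" suggestion is to replicate the four XOR patterns many times and jitter the inputs slightly so SGD sees varied mini-batches; the replication count and noise scale below are illustrative guesses, not values from this thread:

import numpy as np

reps = 256                          # replicate each of the 4 XOR patterns 256 times
X_big = np.tile(X, (reps, 1)) + np.random.normal(0.0, 0.05, (4 * reps, 2))
Y_big = np.tile(Y, reps).astype('int32')
train = [X_big, Y_big]
e.run(train, train)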

jurcicek commented 11 years ago

Hi Leif,

I wanted to use the latest master code from your repository; however, I get this error:

Traceback (most recent call last):
  File "./xor-classfier.py", line 34, in <module>
    num_updates=10,
  File "/home/jurcicek/multipy/pythons/2.7/lib/python2.7/site-packages/theanets-0.1.0-py2.7.egg/theanets/main.py", line 94, in __init__
    self._build_trainers(**kw)
  File "/home/jurcicek/multipy/pythons/2.7/lib/python2.7/site-packages/theanets-0.1.0-py2.7.egg/theanets/main.py", line 144, in _build_trainers
    self.add_trainer(factory, **kwargs)
  File "/home/jurcicek/multipy/pythons/2.7/lib/python2.7/site-packages/theanets-0.1.0-py2.7.egg/theanets/main.py", line 155, in add_trainer
    factory = self.TRAINERS[factory]
KeyError: 's'

I guess this is related to your last commit: c77f6cde82434c6e9257b46eff74972733dc7198

All the best!

Filip


lmjohns3 commented 11 years ago

Yes, I changed the optimize keyword argument to be a list rather than a string.

However, I just committed a small fix to master that will allow a string parameter for optimize, so you can do it either way.
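A sketch of the two spellings that should both work after that fix (not verified against a specific theanets release):

e = theanets.Experiment(theanets.Classifier, layers=(2, 5, 2), optimize='sgd')    # plain string
e = theanets.Experiment(theanets.Classifier, layers=(2, 5, 2), optimize=['sgd'])  # list of trainer names

The earlier KeyError: 's' is the telltale sign of a bare string being iterated as if it were a list of trainer names: iter('sgd') yields 's', 'g', 'd', and the first character fails the TRAINERS lookup.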

lmjohns3 commented 11 years ago

I'm going to go ahead and close this, since the original problem (getting predictions from a classifier network) has been addressed. We can track the addition of an XOR example in #9.