When you print e.network(X), the output should be the prediction of the network, i.e. predicted = e.network(X).
You may also want to build up a separate test data set, and train with e.train(train, test).
What does e.network(X) look like?
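For concreteness, something like this (a sketch; predicted and valid are illustrative names, reusing the e, X, Y, and train from the quoted script below):

predicted = e.network(X)        # forward pass: one row of class scores per input row
print predicted
print predicted.argmax(axis=1)  # hypothetical: pick the highest-scoring class per row

# with a separate validation set instead of reusing the training data:
valid = [X.copy(), Y.astype('int32')]
e.train(train, valid)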
Kyle

On Aug 31, 2013 9:30 AM, "Filip Juricek" notifications@github.com wrote:
Hi, I built a network for learning an XOR function using the following example. I just do not know how to get results/predictions from the network. Please can you help me?
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import cPickle
import gzip
import logging
import lmj.cli
import matplotlib.pyplot as plt
import numpy as np
import os
import tempfile
import theano
import theanets

lmj.cli.enable_default_logging()

X = np.array([
    [0.0, 0.0],
    [0.0, 1.0],
    [1.0, 0.0],
    [1.0, 1.0],
])
Y = np.array([0, 1, 1, 0])

print X.shape
print Y.shape

train = [X, Y.astype('int32')]

e = theanets.Experiment(
    theanets.Classifier,
    layers=(2, 5, 2),
    activation='tanh',
    learning_rate=.005,
    learning_rate_decay=.1,
    patience=20,
    optimize="sgd",
    num_updates=10,
    tied_weights=True,
    batch_size=32,
)
e.run(train, train)

print e.network(X)
Hi Kyle,
this is the output:
(4, 2)
(4,)
I 2013-08-31 21:28:12 MainProcess theanets.main:43 runtime arguments:
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --activation = tanh
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --batch_size = 64
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --cg_batches = None
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --decode = 1
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --global_backtracking = False
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --hidden_dropouts = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --hidden_l1 = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --hidden_l2 = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --hidden_noise = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --initial_lambda = 1.0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --input_dropouts = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --input_noise = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --layers = (2, 5, 2)
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --learning_rate = 0.01
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --learning_rate_decay = 0.25
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --min_improvement = 0.01
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --momentum = 0.0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --no_learn_biases = False
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --num_updates = 10
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --optimize = sgd
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --patience = 15
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --pool_damping = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --pool_dropouts = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --pool_error_start = 3
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --pool_noise = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --preconditioner = False
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --save_progress = None
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --tied_weights = False
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --train_batches = None
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --valid_batches = None
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --validate = 3
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --weight_l1 = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:45 --weight_l2 = 0
I 2013-08-31 21:28:12 MainProcess theanets.main:85 activation: tanh
I 2013-08-31 21:28:12 MainProcess theanets.feedforward:168 weights for layer 0: 2 x 5
I 2013-08-31 21:28:12 MainProcess theanets.feedforward:168 weights for layer out_0: 5 x 2
I 2013-08-31 21:28:12 MainProcess theanets.feedforward:139 27 total network parameters
I 2013-08-31 21:28:12 MainProcess theanets.trainer:102 SGD: 4 named parameters to learn
I 2013-08-31 21:28:14 MainProcess theanets.dataset:89 data train: 1 mini-batches of (4, 2), (4,)
I 2013-08-31 21:28:14 MainProcess theanets.dataset:89 data cg: 1 mini-batches of (4, 2), (4,)
I 2013-08-31 21:28:14 MainProcess theanets.dataset:89 data valid: 1 mini-batches of (4, 2), (4,)
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 1/10 @1.00e-02 train [ 0.68985299 0.25 0.25 ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 2/10 @1.00e-02 train [ 0.68985299 0.25 0.25 ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 3/10 @1.00e-02 train [ 0.68983453 0.25 0. ] valid [ 0.68981614 0.25 0. ] *
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 4/10 @1.00e-02 train [ 0.68981614 0.25 0. ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 5/10 @1.00e-02 train [ 0.68979792 0.25 0. ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 6/10 @1.00e-02 train [ 0.68977986 0.25 0. ] valid [ 0.68976196 0.25 0. ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 7/10 @7.50e-03 train [ 0.68976196 0.25 0. ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 8/10 @7.50e-03 train [ 0.68974421 0.25 0. ]
I 2013-08-31 21:28:14 MainProcess theanets.trainer:76 SGD update 9/10 @7.50e-03 train [ 0.689731 0.25 0. ] valid [ 0.68971789 0.25
[[ 0.0005597  -0.00049876]
 [ 0.00051545  0.05267641]
 [-0.1259047   0.03582318]
 [-0.09417908  0.07938591]]
This does not look like the output of a network whose last layer uses a softmax activation.
Any ideas?
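For reference, genuine softmax outputs are non-negative and each row sums to one, which makes for a quick check (a sketch reusing e, X, and np from the script above):

out = e.network(X)
print out.min() >= 0.0                   # softmax values should be non-negative
print np.allclose(out.sum(axis=1), 1.0)  # each row should sum to 1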
Best regards, Filip Jurcicek
Hi Filip -
Thanks for your bug report! It looks like the Classifier class did have an issue with its output values. I think I fixed it today in a change to the feedforward.py module (see 05575422a2074f65738d73092a820edc91d14f0a for the diff) -- the issue was that the feed_forward method (which passes an input forward through the network to the output) wasn't being compiled using the softmax output.
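In other words, a conceptual sketch of the fix (not the actual theanets internals; the real change is in the compiled Theano graph):

import numpy as np

def softmax(z):
    # numerically stable softmax over the rows of z
    ez = np.exp(z - z.max(axis=1, keepdims=True))
    return ez / ez.sum(axis=1, keepdims=True)

# before: feed_forward effectively returned the raw last-layer activations
# (arbitrary sign and scale, as in the printout above); after the fix it
# applies softmax, so each row is a probability distribution over classes.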
I also made predict() a synonym for the existing __call__ method on networks, so you can write:
experiment.network.predict(test_data)
to get network predictions. Hopefully that reads better on the consumer side.
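The two spellings should be interchangeable (a sketch; test_data stands for any (n, 2) input array):

out_call = experiment.network(test_data)          # existing __call__ syntax
out_pred = experiment.network.predict(test_data)  # new, more readable synonym
# both return the same softmax outputs for a Classifier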
These changes only affect the output at the end of training the network -- I don't think anything about training itself will change. It's interesting that the network has such a hard time learning the XOR function; it's something I hadn't thought much about in a while. My initial guess is that you'll need a lot more training data to get more accurate results, but I'll look into it a little and report back.
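As a starting point, one simple way to get more training data is to replicate the four XOR patterns, optionally with a little input noise (a sketch assuming the X, Y, and e from your script; reps and the noise scale are arbitrary choices):

import numpy as np

reps = 256
Xbig = np.tile(X, (reps, 1)) + np.random.normal(scale=0.01, size=(4 * reps, 2))
Ybig = np.tile(Y, reps).astype('int32')
e.run([Xbig, Ybig], [Xbig, Ybig])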
Hi Leif,
I wanted to use the latest master code from your repository; however, I get this error:
Traceback (most recent call last):
  File "./xor-classfier.py", line 34, in <module>
I guess this is related to your last commit: c77f6cde82434c6e9257b46eff74972733dc7198
All the best!
Filip
Yes, I changed the optimize keyword argument to be a list rather than a string. However, I just committed a small fix to master that will allow a string parameter for optimize, so you can do it either way.
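A sketch of the two accepted spellings after that fix (constructor arguments abbreviated from the script above):

e = theanets.Experiment(theanets.Classifier, layers=(2, 5, 2), optimize='sgd')    # string form
e = theanets.Experiment(theanets.Classifier, layers=(2, 5, 2), optimize=['sgd'])  # list form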
I'm going to go ahead and close this since the original problem (getting predictions from a classifier network) has been addressed. We can track the addition of the XOR example using #9.