Closed abramhindle closed 8 years ago
The commit fe17ba3 does address the issue with the name of the output layer.
I haven't run across the second issue you've described. If you can update your code to the current git master and try again, please update the report here with what happens!
OK, I updated to the latest theanets master and tried to use pretrain again.
Here's 1.2 GB of test data: https://archive.org/details/FFT2-to-STFT-with-data-using-theanets
Here's what doesn't get pretrained (http://softwareprocess.es/2015/example.py):
import theanets
import pickle
import numpy as np
import climate
import logging
import os
climate.enable_default_logging()
# input 64*64 grayscale bitmap
# output samples 22050/30
# fft windows of 1024
# cut down to real values
# cut down again
inputs = 4096
win_size = 2048
swin_size = win_size / 2 + 1
output_size = swin_size * 2
hidlayersize = output_size #win_size
exp = theanets.Experiment(theanets.Regressor, layers=[
    4096,
    dict(size=hidlayersize, std=0.001, mean=0.0),
    dict(size=hidlayersize, std=0.001, mean=0.0),
    dict(size=hidlayersize, std=0.001, mean=0.0),
    output_size])
net = exp.network
logging.info("Read frames.pkl")
frames = pickle.load(file('fft-frames.pkl'))
logging.info("Read stft.pkl")
audio = pickle.load(file('stft.pkl'))
train = frames
outputs = audio
train = train.astype(np.float32)
outputs = outputs.astype(np.float32)[0:train.shape[0]]
shuffleids = np.arange(train.shape[0])
np.random.shuffle(shuffleids)
train = train[shuffleids]
outputs = outputs[shuffleids]
i = 0
logging.info("Pretraining")
net.train([train, outputs],
          save_progress="current_pre_brain.pkl",
          save_every=25,
          batch_size=4096,
          train_batches=1024,
          patience=1,
          min_improvement=0.1,
          algo='pretrain',
          momentum=0.9)
i = 0
for traint, validt in net.itertrain([train, outputs],
                                    algo='nag',
                                    learning_rate=1e-3,
                                    save_progress="current_brain.pkl",
                                    save_every=25,
                                    batch_size=4096,
                                    momentum=0.9):
    print('i ', str(i))
    print('training loss:', traint['loss'])
    print('most recent validation loss:', validt['loss'])
    print('training err:', traint['err'])
    print('most recent validation err:', validt['err'])
    i += 1
net.save('stft-theanet.py.net.pkl')
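As a side note on the size arithmetic in the script: under Python 2 (which the traceback below shows is in use) `win_size / 2` is integer division, but under Python 3 it would yield a float and break the layer sizes. A standalone check of the derived sizes, using `//` to be version-safe:

```python
# Derived layer sizes from the script above (// keeps integer division in Python 3).
win_size = 2048
swin_size = win_size // 2 + 1   # 1025 real-valued FFT bins
output_size = swin_size * 2     # 2050, matching the "out" layer in the log below

print(swin_size, output_size)
```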
Here's the output (a similar command gives the same output):
hindle1@piggy:/media/hindle1/MyMedia/deep-learning/osborne-combined-stft-both-fft2$ python stft-theanet.py
Using gpu device 0: GeForce GTX 970
I 2015-09-03 22:36:51 theanets.layers.base:462 layer Input "in": 4096 inputs
I 2015-09-03 22:36:51 theanets.layers.base:303 layer Feedforward "hid1": (in:out)4096 -> 2050, relu, 8398850 parameters
I 2015-09-03 22:36:51 theanets.layers.base:303 layer Feedforward "hid2": (hid1:out)2050 -> 2050, relu, 4204550 parameters
I 2015-09-03 22:36:51 theanets.layers.base:303 layer Feedforward "hid3": (hid2:out)2050 -> 2050, relu, 4204550 parameters
I 2015-09-03 22:36:51 theanets.layers.base:303 layer Feedforward "out": (hid3:out)2050 -> 2050, linear, 4204550 parameters
I 2015-09-03 22:36:51 theanets.graph:116 network has 21012500 total parameters
I 2015-09-03 22:36:51 root:38 Read frames.pkl
I 2015-09-03 22:37:03 root:40 Read stft.pkl
I 2015-09-03 22:37:11 root:51 Pretraining
I 2015-09-03 22:37:11 downhill.dataset:144 valid: 6 of 6 mini-batches of (4096, 4096); (4096, 2050)
I 2015-09-03 22:37:11 downhill.dataset:144 train: 1024 of 6 mini-batches of (4096, 4096); (4096, 2050)
I 2015-09-03 22:37:11 theanets.layers.base:303 layer Tied "tied-hid3": (out)2050 -> 2050, relu, 2050 parameters
I 2015-09-03 22:37:11 theanets.layers.base:303 layer Tied "tied-hid2": (out)2050 -> 2050, relu, 2050 parameters
I 2015-09-03 22:37:11 theanets.layers.base:303 layer Tied "tied-hid1": (out)2050 -> 4096, linear, 4096 parameters
I 2015-09-03 22:37:11 theanets.trainer:314 creating shadow network
I 2015-09-03 22:37:11 theanets.graph:116 network has 16816146 total parameters
I 2015-09-03 22:37:11 theanets.trainer:250 layerwise: training in -> hid1 -> tied-hid1
I 2015-09-03 22:37:11 downhill.base:378 -- patience = 1
I 2015-09-03 22:37:11 downhill.base:379 -- validate_every = 10
I 2015-09-03 22:37:11 downhill.base:380 -- min_improvement = 0.1
I 2015-09-03 22:37:11 downhill.base:381 -- max_gradient_norm = 0
I 2015-09-03 22:37:11 downhill.base:382 -- max_gradient_elem = 0
I 2015-09-03 22:37:11 downhill.base:383 -- learning_rate = 0.0001
I 2015-09-03 22:37:11 downhill.base:384 -- momentum = 0.9
I 2015-09-03 22:37:11 downhill.base:385 -- nesterov = False
I 2015-09-03 22:37:11 downhill.adaptive:220 -- rms_halflife = 14
I 2015-09-03 22:37:11 downhill.adaptive:221 -- rms_regularizer = 1e-08
I 2015-09-03 22:37:11 downhill.base:112 compiling evaluation function
I 2015-09-03 22:37:14 downhill.base:118 compiling RMSProp function
Traceback (most recent call last):
  File "stft-theanet.py", line 89, in <module>
    momentum=0.9)
  File "build/bdist.linux-x86_64/egg/theanets/graph.py", line 400, in train
  File "build/bdist.linux-x86_64/egg/theanets/graph.py", line 376, in itertrain
  File "build/bdist.linux-x86_64/egg/theanets/trainer.py", line 320, in itertrain
  File "build/bdist.linux-x86_64/egg/theanets/trainer.py", line 253, in itertrain
  File "build/bdist.linux-x86_64/egg/theanets/trainer.py", line 66, in itertrain
  File "/usr/local/lib/python2.7/dist-packages/downhill/base.py", line 397, in iterate
    validation = self.evaluate(valid)
  File "/usr/local/lib/python2.7/dist-packages/downhill/base.py", line 243, in evaluate
    values = [self.f_eval(*x) for x in dataset]
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 590, in __call__
    self.inv_finder[c]))
TypeError: Tried to provide value for implicit input: hid1.w
So similar behaviour on GPU and CPU.
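For what it's worth, the parameter counts in the log check out against the layer sizes (my own arithmetic, not theanets output): a dense layer with n inputs and m outputs has n*m weights plus m biases.

```python
def dense_params(n_in, n_out):
    # Weights plus biases for one fully connected layer.
    return n_in * n_out + n_out

hid1 = dense_params(4096, 2050)              # 8398850, as logged for "hid1"
total = hid1 + 3 * dense_params(2050, 2050)  # hid2, hid3, and out are all 2050 -> 2050

print(hid1, total)  # total is 21012500, matching the log
```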
Hm, one thing I see here is that the 'pretrain' trainer requires an unlabeled dataset as input! You could try changing either 'pretrain' to 'layerwise', or changing [train, outputs] to [train] (in your call to net.train).
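To illustrate why 'pretrain' wants an unlabeled dataset: greedy pretraining trains each layer as an autoencoder, using that layer's own input as the reconstruction target, so no output labels are involved. A toy NumPy sketch of the idea (my own illustration, not the theanets implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 16)).astype(np.float32)  # unlabeled inputs only

def pretrain_layer(data, n_hidden, lr=0.01, steps=200):
    """Fit one tied-weight linear autoencoder layer by gradient descent."""
    n_in = data.shape[1]
    w = rng.normal(scale=0.1, size=(n_in, n_hidden)).astype(np.float32)
    for _ in range(steps):
        h = data @ w          # encode
        recon = h @ w.T       # decode with tied weights
        err = recon - data    # reconstruction error: the input is the target
        # Gradient of ||data @ w @ w.T - data||^2 with respect to w.
        grad = data.T @ (err @ w) + (err.T @ data) @ w
        w -= lr * grad / len(data)
    return w, data @ w

w1, h1 = pretrain_layer(x, 8)   # first hidden layer, trained on the raw inputs
w2, h2 = pretrain_layer(h1, 4)  # next layer, trained on the first layer's codes

print(h1.shape, h2.shape)
```

Each layer only ever sees the activations of the layer below it as both input and target, which is why passing [train, outputs] to the pretrainer trips it up.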
Alright, you're correct. If I experience the same issue with an autoencoder I'll reopen the issue. Thanks for your help.
On commit 33775e7c96adbe2924cadb13f0061fd648a74c46
Using the Regressor network I tried to pretrain using first 'layerwise' and then 'pretrain'. Both failed.
First off, with 'layerwise' it would get to the last hidden layer and then fail to resolve the output. Perhaps this is related to the caching feature? Or perhaps layerwise makes assumptions about the output layer's name?
Perhaps fe17ba38c4daf5fa0c3986205c99abd10418b1b2 fixes this.
Then I tried to use the pretrainer and received the following error: "TypeError: Tried to provide value for implicit input: hid1.w". I do not believe the latest fix actually addresses this.
Perhaps layerwise was fixed in the recent commits, but I'm not sure about pretrain. It will take some time before I can confirm again.