Closed gilliM closed 8 years ago
Probably not normal. Can you post a message into this thread, so that it shows up in my 'notifications' list, and I don't forget? (Having clicked into it to read it, it vanishes from my notifications.)
Here is the training code, in case it helps. Let me know if you need more information.
targets = array.array('f', output_data)
images = array.array('f', input_data)
net.setBatchSize(batchSize)
im1 = []
for epoch in range(0, numEpochs):
    print 'epoch', epoch
    context = PyDeepCL.TrainingContext(epoch, 0)
    for batch in range(N / batchSize):
        sgd.train(
            net,
            context,
            images[batch * batchSize * planes * size * size:],
            targets[batch * batchSize * planes:])
        net.forward(images[0:(batchSize * planes * size * size)])
        if batch == 0:
            lastLayer = net.getLastLayer()
            predictions = lastLayer.getOutput()[0:(planes * batchSize)]
            precision = np.mean(np.sqrt((np.array(predictions) - output_data[0:(planes * batchSize)]) ** 2))
            print precision
            if epoch == numEpochs - 1:
                for i in range(8):
                    im1.extend(predictions[((i * nt) * planes):(((i * nt) + 1) * planes)])
$ python testhugeram.py
major_version 2
major vresion 2
Traceback (most recent call last):
File "testhugeram.py", line 4, in <module>
targets = array.array('f', output_data)
NameError: name 'output_data' is not defined
I haven't posted my whole code here; it's a bit long and wild. I can copy it here in full if you are interested. In the meantime, here is another version that uses random values, just so that it runs.
import array
import PyDeepCL
import numpy as np

if __name__ == '__main__':
    N = 12312
    batchSize = 171
    planes = 49
    size = 3
    numEpochs = 20
    output_data = np.random.rand(N * planes * size * size)
    input_data = np.random.rand(N * planes * size * size)
    targets = array.array('f', input_data)
    images = array.array('f', output_data)
    cl = PyDeepCL.DeepCL()
    net = PyDeepCL.NeuralNet(cl)
    net.addLayer(PyDeepCL.InputLayerMaker().numPlanes(planes).imageSize(size))
    net.addLayer(PyDeepCL.ConvolutionalMaker().numFilters(20).filterSize(3).
                 biased())
    net.addLayer(PyDeepCL.ActivationMaker().relu())
    net.addLayer(PyDeepCL.ConvolutionalMaker().numFilters(49).filterSize(1).
                 biased())
    net.addLayer(PyDeepCL.SquareLossMaker())
    sgd = PyDeepCL.SGD(cl, 0.00002, 0.0001)
    print net.asString()
    net.setBatchSize(batchSize)
    im1 = []
    for epoch in range(0, numEpochs):
        print 'epoch', epoch
        context = PyDeepCL.TrainingContext(epoch, 0)
        for batch in range(N / batchSize):
            sgd.train(
                net,
                context,
                images[batch * batchSize * planes * size * size:],
                targets[batch * batchSize * planes:])
            net.forward(images[0:(batchSize * planes * size * size)])
            if batch == 0:
                lastLayer = net.getLastLayer()
                predictions = lastLayer.getOutput()[0:(planes * batchSize)]
                precision = np.mean(np.sqrt((np.array(predictions) \
                    - output_data[0:(planes * batchSize)]) ** 2))
                print precision
Hmmm, right, I confirm I can reproduce the problem. Will take a look. By the way I tweaked the numbers very slightly, since my laptop memory is not so large, but the effect is the same: out of memory within ~10-20 seconds, some leak somewhere, will take a look.
N = 1231
batchSize = 171
planes = 49
size = 3
numEpochs = 2000
Can you post a dummy comment into this thread please, so I don't forget about it? (Once I've clicked into it, it falls off my notifications, and no longer appears on my https://github.com page.)
Interestingly, it only seems to affect python2.7. python3.4 seems to work ok for me. To what extent could using python3.4 be an option for you, whilst I take a look at the python2.7 issue?
Interestingly, if I precreate the batch data, and reuse it each time, the problem seems to go away, including on python 2.7. Not saying this is ideal, and should probably be fixed, but maybe it's a temporary workaround for now?
imageBatches = []
targetBatches = []
for batch in range(N // batchSize):
    imageBatches.append(images[batch * batchSize * planes * size * size:])
    targetBatches.append(targets[batch * batchSize * planes:])
for epoch in range(0, numEpochs):
    print('epoch', epoch)
    context = PyDeepCL.TrainingContext(epoch, 0)
    for batch in range(N // batchSize):
        sgd.train(
            net,
            context,
            imageBatches[batch],
            targetBatches[batch])
(By the way, I added some input statements every few epochs, to avoid killing the computer each time it sucks out all the memory...)
from __future__ import print_function, division
import platform
import array
import PyDeepCL
import numpy as np

if __name__ == '__main__':
    N = 1280
    batchSize = 128
    planes = 1
    size = 28
    numEpochs = 2000
    output_data = np.random.rand(N * planes * size * size)
    input_data = np.random.rand(N * planes * size * size)
    targets = array.array('f', input_data)
    images = array.array('f', output_data)
    test_input = images[0:(batchSize * planes * size * size)]
    cl = PyDeepCL.DeepCL()
    net = PyDeepCL.NeuralNet(cl)
    net.addLayer(PyDeepCL.InputLayerMaker().numPlanes(planes).imageSize(size))
    # net.addLayer(PyDeepCL.ConvolutionalMaker().numFilters(20).filterSize(3).
    #              biased())
    # net.addLayer(PyDeepCL.ActivationMaker().relu())
    # net.addLayer(PyDeepCL.ConvolutionalMaker().numFilters(49).filterSize(1).
    #              biased())
    net.addLayer(PyDeepCL.SquareLossMaker())
    sgd = PyDeepCL.SGD(cl, 0.00002, 0.0001)
    print(net.asString())
    net.setBatchSize(batchSize)
    im1 = []
    # imageBatches = []
    # targetBatches = []
    # for batch in range(N // batchSize):
    #     imageBatches.append(images[batch * batchSize * planes * size * size:])
    #     targetBatches.append(targets[batch * batchSize * planes:])
    for epoch in range(0, numEpochs):
        print('epoch', epoch)
        context = PyDeepCL.TrainingContext(epoch, 0)
        for batch in range(N // batchSize):
            # sgd.train(
            #     net,
            #     context,
            #     imageBatches[batch],
            #     targetBatches[batch])
            test_input = images[0:(batchSize * planes * size * size)]
            net.forward(test_input)
            # if batch == 0:
            #     lastLayer = net.getLastLayer()
            #     predictions = lastLayer.getOutput()[0:(planes * batchSize)]
            #     precision = np.mean(np.sqrt((np.array(predictions) \
            #         - output_data[0:(planes * batchSize)]) ** 2))
            #     print precision
        if epoch % 20 == 0:
            pyversion = int(platform.python_version_tuple()[0])
            print('pyversion', pyversion)
            if pyversion == 2:
                print('py 2')
                raw_input(str(epoch))
            else:
                print('py 3')
                input(str(epoch))
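As an alternative to pausing on input, memory growth can be logged every few epochs. The following is only a sketch, assuming a Unix-like system (the standard-library resource module is not available on Windows). The helper names peak_rss_mb and run_epochs are made up for illustration, and the step callback stands in for whatever the real loop does per epoch.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    if sys.platform == 'darwin':
        return peak / (1024.0 * 1024.0)
    return peak / 1024.0


def run_epochs(step, num_epochs, report_every=20):
    """Call step(epoch) for each epoch and record peak memory periodically."""
    readings = []
    for epoch in range(num_epochs):
        step(epoch)
        if epoch % report_every == 0:
            readings.append((epoch, peak_rss_mb()))
            print('epoch %d, peak RSS %.1f MB' % readings[-1])
    return readings


if __name__ == '__main__':
    # Hypothetical stand-in for a training step: allocate 1 MB and discard it.
    run_epochs(lambda epoch: bytearray(1024 * 1024), num_epochs=41)
```

Since ru_maxrss is a peak value, it never decreases; a leak shows up as readings that keep climbing from one report to the next instead of levelling off.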
Seems to be Cython-related somehow:
It is enough to create a new array.array each time, and pass it to net.forward, to reproduce the problem.
net.forward is in python/NeuralNet.pyx, and looks like:
def forward(self, const float[:] images):
    self.thisptr.forward(&images[0])
Replacing it with:
def forward(self, const float[:] images):
    print('forward... v0.2')
... basically a nop, with a print statement, doesn't actually fix the problem, when using python2.7. In python3.4 all is well.
To what extent would it be acceptable to work around the issue by one of the following methods?
Creating the array.arrays beforehand?
Preallocating the array.arrays, and just copying the data in each time?
Hmmm, why not just pass the numpy array in directly?
test_input = input_data[0:(batchSize * planes * size * size)]
net.forward(test_input)
(i.e. replacing images with input_data, etc.)
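One difference between the two container types that is relevant here: slicing an array.array builds a new array with its own copy of the data, whereas a basic slice of a numpy array is a view onto the same buffer. The sketch below only illustrates that allocation behaviour with small made-up values; it does not claim to explain the python2.7 leak itself.

```python
import array

import numpy as np

values = [float(i) for i in range(10)]

# array.array: a slice is a new array holding its own copy of the data.
arr = array.array('f', values)
arr_tail = arr[4:]
arr_tail[0] = -1.0
arr_unchanged = (arr[4] == 4.0)      # the original is not affected

# numpy: a basic slice is a view onto the same buffer, nothing is copied.
np_arr = np.array(values, dtype=np.float32)
np_tail = np_arr[4:]
np_tail[0] = -1.0
np_changed = (np_arr[4] == -1.0)     # the original sees the write

print('array.array slice is a copy:', arr_unchanged)
print('numpy slice is a view:', np.shares_memory(np_arr, np_tail))
```

With open-ended slices such as images[start:], the array.array version copies everything from start to the end of the data on every call, so per-batch slicing allocates far more than one batch's worth of floats.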
My idea was to stay as close as possible to the example proposed with the MNIST dataset. I've changed to numpy arrays (don't forget dtype=np.float32).
Anyway, shouldn't the batches be
imageBatches.append(input_data[(batch * batchSize * planes * size * size):((batch + 1) * batchSize * planes * size * size)])
targetBatches.append(output_data[(batch * batchSize * planes):((batch + 1) * batchSize * planes)])
instead of
imageBatches.append(input_data[batch * batchSize * planes * size * size:])
targetBatches.append(output_data[batch * batchSize * planes:])
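A minimal sketch of the bounded slicing, using float32 numpy arrays. The helper make_batches and the small sizes are made up for illustration, and one target value per plane per example is an assumption taken from the slice arithmetic above. np.random.rand returns float64, so the conversion to float32 is explicit.

```python
import numpy as np


def make_batches(data, batch_size, values_per_example):
    """Split a flat array into consecutive, non-overlapping batch views."""
    step = batch_size * values_per_example
    num_batches = len(data) // step
    return [data[b * step:(b + 1) * step] for b in range(num_batches)]


# Small illustrative sizes (not the ones used in this thread).
N, batchSize, planes, size = 8, 4, 2, 3

# Inputs hold planes * size * size floats per example.
input_data = np.random.rand(N * planes * size * size).astype(np.float32)
# Assumption: targets hold one float per plane per example.
output_data = np.random.rand(N * planes).astype(np.float32)

imageBatches = make_batches(input_data, batchSize, planes * size * size)
targetBatches = make_batches(output_data, batchSize, planes)

print(len(imageBatches), imageBatches[0].shape, targetBatches[0].shape)
```

Each entry is exactly one batch long, and because these are numpy slices they are views, so precreating the list does not duplicate the data.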
My idea was to stay as close as possible to the example proposed with the MNIST dataset.
Ah yes, fair enough. Well, I was using arrays because it removes a dependency of the tests on numpy. But also I wasn't aware there is some kind of leak on python2.7 when using arrays.
Anyway, shouldn't the batches be
Yes, I think you're right. I noticed that yesterday, in your code, and didn't like to say anything. But now I see that it is because you copied this strange code from my own code :-D
Maybe I will create a second test script, that uses numpy arrays, and see how that goes.
Hmmm, I wonder if anyone will ever be using array.arrays instead of numpy arrays? Tempting to remove the array.array examples.
9c2a43f :
Maybe this is more a question than an issue, but is it normal that the RAM used keeps increasing during training?
With the following network, I have ~70 GB (!) of RAM in use after 14 epochs (input and output are float arrays):