Closed ramyabandaru closed 8 years ago
From what I understand, you have modified the load_data() function from logistic_sgd.py and implemented it in convolutional_mlp.py. Regardless of where the function lives, it won't work as-is, because the rest of the code operates on Theano shared variables, while your datasets_0, datasets_1 and datasets_2 are just regular dictionaries. You have to convert them to Theano shared variables first.
Here's the modified version of your code, which should work (I haven't tested it yet):
# Imports needed if this function is run outside of convolutional_mlp.py
import pickle

import numpy
import theano
import theano.tensor as T


def load_data(dataset):
    train_batch = ["data_batch_1", "data_batch_2", "data_batch_3", "data_batch_4"]
    valid_batch = "data_batch_5"
    test_batch = "test_batch"
    train_set = {}
    valid_set = {}
    test_set = {}
    # concatenate the four pickled training batches into one dictionary
    for i in train_batch:
        with open(dataset + i, 'rb') as f:
            if not train_set:
                train_set = pickle.load(f, encoding='latin1')
                continue
            temp = pickle.load(f, encoding='latin1')
            train_set['data'] = numpy.concatenate((train_set['data'], temp['data']), axis=0)
            train_set['labels'].extend(temp['labels'])
    with open(dataset + valid_batch, 'rb') as f:
        valid_set = pickle.load(f, encoding='latin1')
    with open(dataset + test_batch, 'rb') as f:
        test_set = pickle.load(f, encoding='latin1')

    def shared_dataset(data_xy, borrow=True):
        """ Function that loads the dataset into shared variables

        The reason we store our dataset in shared variables is to allow
        Theano to copy it into the GPU memory (when code is run on GPU).
        Since copying data into the GPU is slow, copying a minibatch every time
        it is needed (the default behaviour if the data is not in a shared
        variable) would lead to a large decrease in performance.
        """
        data_x = data_xy['data']
        data_y = data_xy['labels']
        shared_x = theano.shared(numpy.asarray(data_x,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        shared_y = theano.shared(numpy.asarray(data_y,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        # When storing data on the GPU it has to be stored as floats,
        # therefore we will store the labels as ``floatX`` as well
        # (``shared_y`` does exactly that). But during our computations
        # we need them as ints (we use the labels as indices, and if they
        # are floats it doesn't make sense), therefore instead of returning
        # ``shared_y`` we will have to cast it to int. This little hack
        # lets us get around this issue.
        return shared_x, T.cast(shared_y, 'int32')

    test_set_x, test_set_y = shared_dataset(test_set)
    valid_set_x, valid_set_y = shared_dataset(valid_set)
    train_set_x, train_set_y = shared_dataset(train_set)

    rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y),
            (test_set_x, test_set_y)]
    return rval
def sgd_optimization_mnist(learning_rate=0.13, n_epochs=1000,
                           dataset='/home/ubuntu/Desktop/ramya/cifar-10-batches-py/',
                           batch_size=600):
    """
    Demonstrate stochastic gradient descent optimization of a log-linear
    model

    This is demonstrated on CIFAR-10.

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
                          gradient)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: path of the directory containing the CIFAR-10 batch files
                    (downloaded from https://www.cs.toronto.edu/~kriz/cifar.html)
    """
    datasets_0, datasets_1, datasets_2 = load_data(dataset)

    train_set_x, train_set_y = datasets_0
    valid_set_x, valid_set_y = datasets_1
    test_set_x, test_set_y = datasets_2
Thanks @amit4111989!! But after making the changes you mentioned, I am getting an error: "'NoneType' object is not iterable".
It was missing a return statement. I tested it on Python 2 and it worked (I removed the encoding keywords from the pickle.load() calls). Here's the updated code with a few corrections:
# Imports needed if this function is run outside of convolutional_mlp.py
import pickle

import numpy
import theano
import theano.tensor as T


def load_data(dataset):
    train_batch = ["data_batch_1", "data_batch_2", "data_batch_3", "data_batch_4"]
    valid_batch = "data_batch_5"
    test_batch = "test_batch"
    train_set = {}
    valid_set = {}
    test_set = {}
    # concatenate the four pickled training batches into one dictionary
    for i in train_batch:
        with open(dataset + i, 'rb') as f:
            if not train_set:
                # drop the encoding keyword when running under Python 2
                train_set = pickle.load(f, encoding='latin1')
                continue
            temp = pickle.load(f, encoding='latin1')
            train_set['data'] = numpy.concatenate((train_set['data'], temp['data']), axis=0)
            train_set['labels'].extend(temp['labels'])
    with open(dataset + valid_batch, 'rb') as f:
        valid_set = pickle.load(f, encoding='latin1')
    with open(dataset + test_batch, 'rb') as f:
        test_set = pickle.load(f, encoding='latin1')

    def shared_dataset(data_xy, borrow=True):
        """ Function that loads the dataset into shared variables

        The reason we store our dataset in shared variables is to allow
        Theano to copy it into the GPU memory (when code is run on GPU).
        Since copying data into the GPU is slow, copying a minibatch every time
        it is needed (the default behaviour if the data is not in a shared
        variable) would lead to a large decrease in performance.
        """
        data_x = data_xy['data']
        data_y = data_xy['labels']
        shared_x = theano.shared(numpy.asarray(data_x,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        shared_y = theano.shared(numpy.asarray(data_y,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        # When storing data on the GPU it has to be stored as floats,
        # therefore we will store the labels as ``floatX`` as well
        # (``shared_y`` does exactly that). But during our computations
        # we need them as ints (we use the labels as indices, and if they
        # are floats it doesn't make sense), therefore instead of returning
        # ``shared_y`` we will have to cast it to int. This little hack
        # lets us get around this issue.
        return shared_x, T.cast(shared_y, 'int32')

    test_set_x, test_set_y = shared_dataset(test_set)
    valid_set_x, valid_set_y = shared_dataset(valid_set)
    train_set_x, train_set_y = shared_dataset(train_set)

    rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y),
            (test_set_x, test_set_y)]
    return rval
def sgd_optimization_mnist(learning_rate=0.13, n_epochs=1000,
                           dataset='/home/ubuntu/Desktop/ramya/cifar-10-batches-py/',
                           batch_size=600):
    """
    Demonstrate stochastic gradient descent optimization of a log-linear
    model

    This is demonstrated on CIFAR-10.

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
                          gradient)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: path of the directory containing the CIFAR-10 batch files
                    (downloaded from https://www.cs.toronto.edu/~kriz/cifar.html)
    """
    datasets_0, datasets_1, datasets_2 = load_data(dataset)

    train_set_x, train_set_y = datasets_0
    valid_set_x, valid_set_y = datasets_1
    test_set_x, test_set_y = datasets_2
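(For context, here is a minimal sketch of how those shared variables are consumed further down in the tutorial: the minibatch count is read from the shared variable's value, and the compiled training function slices minibatches out of the shared data through its givens. Everything below is illustrative only; the cost is a placeholder so the snippet runs on its own.)

import numpy
import theano
import theano.tensor as T

batch_size = 600
# stand-ins for the shared variables produced by shared_dataset()
train_set_x = theano.shared(numpy.zeros((1200, 3072), dtype=theano.config.floatX), borrow=True)
train_set_y = T.cast(theano.shared(numpy.zeros(1200, dtype=theano.config.floatX), borrow=True), 'int32')

# the number of minibatches is derived from the shared variable's value
n_train_batches = train_set_x.get_value(borrow=True).shape[0] // batch_size

index = T.lscalar('index')
x = T.matrix('x')
y = T.ivector('y')
dummy_cost = T.mean(x) + T.mean(y)   # placeholder for the model's real cost

# minibatches are sliced out of the shared variables inside `givens`
train_model = theano.function(
    [index], dummy_cost,
    givens={
        x: train_set_x[index * batch_size: (index + 1) * batch_size],
        y: train_set_y[index * batch_size: (index + 1) * batch_size],
    },
)
print(n_train_batches, train_model(0))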
@amit4111989 I ran the code with the changes. This time I am getting an error in the train_model function. I am attaching the files I have modified and a file errors.txt that contains the error message being shown. errors.txt
Hi Ramya, the problem is that each image in your dataset has 1024x3 = 3072 values (1024 pixels for each of the red, green and blue channels), while your reshape only accounts for 32x32 = 1024 pixels. So in the 4D reshaping of the tensor variable
layer0_input = x.reshape((batch_size, 1, 32, 32))
you need to change the depth (number of channels) to 3:
layer0_input = x.reshape((batch_size, 3, 32, 32))
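A quick standalone check of that arithmetic (illustrative only):

import numpy

# one CIFAR-10 image row holds 3 x 32 x 32 = 3072 values
row = numpy.arange(3072, dtype=numpy.uint8)
print(row.reshape(3, 32, 32).shape)    # (3, 32, 32) -- works
# row.reshape(1, 32, 32) would raise a ValueError:
# 3072 values cannot fit into a single 32x32 channel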
Apart from that, your layer00 did not make much sense to me, so I commented it out. You also did not adjust the layer shapes after changing the pool size to (1, 1) and the image size from 28x28 to 32x32; the shape arithmetic is sketched after the link below.
I made all these adjustments to the layers (more details in the comments), and I was able to train and save the best model.
I have attached the working code. Let me know if something comes up. I am not too familiar with image classification, with CNNs or otherwise, so my knowledge is pretty much limited to making the code work.
I would recommend looking at this link for more information on image classification with CNNs on the CIFAR dataset: http://cs231n.github.io/convolutional-networks/
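For the record, here is a rough sketch of the shape arithmetic behind those adjustments (32x32 RGB input, 5x5 filters, pool size (1, 1)). It assumes the tutorial's LeNetConvPoolLayer, HiddenLayer and LogisticRegression classes are importable; the attached working code may differ in details such as nkerns.

import numpy
import theano.tensor as T
from convolutional_mlp import LeNetConvPoolLayer
from mlp import HiddenLayer
from logistic_sgd import LogisticRegression

rng = numpy.random.RandomState(23455)
x = T.matrix('x')
batch_size, nkerns = 500, (20, 50)

layer0_input = x.reshape((batch_size, 3, 32, 32))
# 32x32 input, 5x5 filter -> 28x28 feature maps; (1, 1) pooling keeps 28x28
layer0 = LeNetConvPoolLayer(rng, input=layer0_input,
                            image_shape=(batch_size, 3, 32, 32),
                            filter_shape=(nkerns[0], 3, 5, 5),
                            poolsize=(1, 1))
# 28x28 input, 5x5 filter -> 24x24 feature maps; (1, 1) pooling keeps 24x24
layer1 = LeNetConvPoolLayer(rng, input=layer0.output,
                            image_shape=(batch_size, nkerns[0], 28, 28),
                            filter_shape=(nkerns[1], nkerns[0], 5, 5),
                            poolsize=(1, 1))
# the fully connected layer therefore sees nkerns[1] * 24 * 24 inputs
layer2_input = layer1.output.flatten(2)
layer2 = HiddenLayer(rng, input=layer2_input,
                     n_in=nkerns[1] * 24 * 24, n_out=500,
                     activation=T.tanh)
layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)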
Hi @amit4111989, training now runs without any error! Thanks for that :+1: I have a small doubt, though: nowhere in convolutional_mlp.py are we adding code to save the best model trained so far. I guess the model that is being saved comes from logistic_sgd.py. Should we also add code to save the model in convolutional_mlp.py, or is the model being saved already the CNN? Could you have a look at that once? Thanks once again.
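(As a hedged sketch of that idea, not the code actually attached above: the CNN's parameters could be pickled from within convolutional_mlp.py whenever a new best validation score is reached; the layer0..layer3 names are assumed from the tutorial's structure.)

import pickle

def save_cnn_params(layers, path='best_cnn_model.pkl'):
    """Pickle the current values of every Theano shared parameter in `layers`,
    e.g. layers = [layer0, layer1, layer2, layer3] from convolutional_mlp.py.
    Intended to be called inside the training loop whenever
    this_validation_loss < best_validation_loss."""
    values = [p.get_value(borrow=True) for layer in layers for p in layer.params]
    with open(path, 'wb') as f:
        pickle.dump(values, f)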
I am training on the CIFAR-10 dataset using a CNN. I am facing a problem while loading the dataset into dictionary variables. The CIFAR-10 dataset contains 6 batches, 5 of which can be used for training and validation while one batch can be used for testing. I would like to split the 5 batches into 4 for training and 1 for validation. However, these batches are stored in a serialized format using pickle. I am facing a problem while loading these 6 batches into 3 dictionaries, one each for training, validation and testing, where each dictionary contains data and labels as described at the link given below (the dataset can also be downloaded from there). Link to download the dataset: https://www.cs.toronto.edu/~kriz/cifar.html
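(For reference, a minimal batch loader along the lines of the one suggested on the CIFAR-10 page, written here in Python 3 form; each batch file unpickles to a dictionary with a 'data' array of shape 10000x3072 and a 'labels' list of 10000 integers.)

import pickle

def unpickle(file):
    """Load one CIFAR-10 batch file into a dictionary."""
    with open(file, 'rb') as fo:
        d = pickle.load(fo, encoding='latin1')
    return d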
Here is the code snippet which I have modified from the load_data() method of the convolutional_mlp.py file.