How to load weights of the model for a single CPU?

alishakiba commented 9 years ago

Hi there,

I have modified the code so that it can run on a single CPU.

In basic_model.py, comment out line 7:

from lasagne.layers import dnn

and change lines 136-137 as follows:

    Conv2DLayer = nn.layers.Conv2DLayer #dnn.Conv2DDNNLayer
    MaxPool2DLayer = nn.layers.MaxPool2DLayer #dnn.MaxPool2DDNNLayer

The following code then runs without any errors, but it produces unrelated, seemingly random weights! Could you please help me load the weights?

# coding: utf-8

# In[1]:

import sys
sys.path.append('../')

import cPickle as pickle
import re
import glob
import os

import time

import theano
import theano.tensor as T
import numpy as np
import pandas as p
import lasagne as nn

from PIL import Image

from utils import hms, architecture_string, get_img_ids_from_iter

# In[2]:

get_ipython().magic(u'pylab inline')
rcParams['figure.figsize'] = 16, 6
np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)

# In[3]:

dump_path = '../dumps/2015_07_17_123003_PARAMSDUMP.pkl'

# In[4]:

model_data = pickle.load(open(dump_path, 'rb'))  # open pickles in binary mode

# In[5]:

from models import basic_model as model

# In[6]:

LEARNING_RATE_SCHEDULE = model.LEARNING_RATE_SCHEDULE
prefix_train = model.prefix_train if hasattr(model, 'prefix_train') else '/run/shm/train_ds2_crop/'
prefix_test = model.prefix_test if hasattr(model, 'prefix_test') else '/run/shm/test_ds2_crop/'
SEED = model.SEED if hasattr(model, 'SEED') else 11111

id_train, y_train = model.id_train, model.y_train
id_valid, y_valid = model.id_valid, model.y_valid
# No trailing comma here: it would wrap the value in a 1-tuple.
id_train_oversample = model.id_train_oversample
labels_train_oversample = model.labels_train_oversample

sample_coefs = model.sample_coefs if hasattr(model, 'sample_coefs') else [0, 7, 3, 22, 25]

l_out, l_ins = model.build_model()

# In[7]:

params = nn.layers.get_all_param_values(l_out)
for p, v in zip(params, model_data):
    p = v

# In[8]:

chunk_size = 64
batch_size = 256

# In[9]:

output = nn.layers.get_output(l_out, deterministic=True)
input_ndims = [len(nn.layers.get_output_shape(l_in))
               for l_in in l_ins]
xs_shared = [nn.utils.shared_empty(dim=ndim)
             for ndim in input_ndims]

# In[10]:

import pandas as pd
OriginalLabels = pd.read_csv(r'../data/trainLabels.csv', sep=',')

# In[11]:

import glob
temp1 = np.zeros((chunk_size, 3, 512, 512), dtype='float64')
fileList = sorted(glob.glob(r'/home/ali/Desktop/kaggle_diabetic_retinopathy-master/data/64/*.tiff'))
labels64 = []
for i, f in enumerate(fileList):
    temp1[i,:,:,:] = np.array(Image.open(f)).T / 255.0
    fname = f.split('/')[-1].split('.')[0]
    lbl = OriginalLabels.loc[OriginalLabels['image'] == fname]['level'].values.item(0)
#     print lbl
    labels64.append(lbl)
xs_shared[0].set_value(temp1)
print temp1.shape

# In[12]:

temp2 = np.ones((chunk_size, 2), dtype='float64') * 512
xs_shared[1].set_value(temp2)
temp2.shape

# In[13]:

idx = T.lscalar('idx')

givens = {}
for l_in, x_shared in zip(l_ins, xs_shared):
    givens[l_in.input_var] = x_shared[idx * batch_size:(idx + 1) * batch_size]

compute_output = theano.function(
    [idx],
    output,
    givens=givens,
    on_unused_input='ignore'
)

# In[14]:

# import pandas as pd
# OriginalLabels = pd.read_csv(r'../data/trainLabels.csv', sep=',')

# In[15]:

# import glob
# temp1 = np.zeros((chunk_size, 3, 512, 512), dtype='float64')
# fileList = glob.glob(r'/home/ali/Desktop/kaggle_diabetic_retinopathy-master/data/64/*.tiff')
# labels64 = []
# for i, f in enumerate(fileList):
#     temp1[i,:,:,:] = np.array(Image.open(f)).T / 255.0
#     fname = f.split('/')[-1].split('.')[0]
#     lbl = OriginalLabels.loc[OriginalLabels['image'] == fname]['level'].values.item(0)
# #     print lbl
#     labels64.append(lbl)
# xs_shared[0].set_value(temp1)
# print temp1.shape

# In[16]:

# temp2 = np.ones((chunk_size, 2), dtype='float64') * 512
# xs_shared[1].set_value(temp2)
# temp2.shape

# In[17]:

get_ipython().magic(u'time predictions = compute_output(0)')

# In[18]:

results = []
for i, p in enumerate(predictions):
    resul = dict()
    resul['fileName'] = fileList[i].split('/')[-1].split('.')[0]
    resul['original'] = labels64[i]
    resul['pred'] = p
    results.append(resul)

# In[19]:

for i in results:
    print i

# In[20]:

len(model_data)

JeffreyDF commented 9 years ago

Hi alishakiba,

I'll try to help you more tomorrow but I can already quickly say you should try loading and setting the trained parameters as follows:

loaded_params = pickle.load(open('params.pkl', 'rb'))
all_params = nn.layers.get_all_params(l_out)

for i, v in enumerate(loaded_params):
    all_params[i].set_value(v)
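
To illustrate why the loop in the original notebook leaves the network untouched: `get_all_param_values` returns plain arrays, and `p = v` only rebinds the loop variable, whereas `set_value` writes into the shared variable itself. A minimal sketch, assuming a hypothetical `FakeShared` class as a stand-in for a Theano shared variable (it is not part of Theano or Lasagne, and only mimics `get_value`/`set_value`):

```python
class FakeShared(object):
    """Hypothetical stand-in for a shared variable (illustration only)."""

    def __init__(self, value):
        self._value = list(value)

    def get_value(self):
        # Return a copy, so callers cannot change the stored value by accident.
        return list(self._value)

    def set_value(self, value):
        self._value = list(value)


all_params = [FakeShared([0.0, 0.0]), FakeShared([0.0])]
loaded_params = [[1.0, 2.0], [3.0]]

# Approach 1: rebinding the loop variable. The stored values do not change.
values = [param.get_value() for param in all_params]
for p, v in zip(values, loaded_params):
    p = v
after_rebinding = [param.get_value() for param in all_params]

# Approach 2: writing through set_value. The stored values are replaced.
for param, v in zip(all_params, loaded_params):
    param.set_value(v)
after_set_value = [param.get_value() for param in all_params]

print(after_rebinding)   # [[0.0, 0.0], [0.0]]
print(after_set_value)   # [[1.0, 2.0], [3.0]]
```
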

Let me know if that already helps.

Jeffrey

alishakiba commented 9 years ago

Hi Jeffrey,

Thanks for the help. That snippet did not work for me; however, I was able to load the parameters as follows:

model_data = pickle.load(open(dump_path, 'rb'))  # open pickles in binary mode
from models import basic_model as model
LEARNING_RATE_SCHEDULE = model.LEARNING_RATE_SCHEDULE
prefix_train = model.prefix_train if hasattr(model, 'prefix_train') else \
    '/run/shm/train_ds2_crop/'
prefix_test = model.prefix_test if hasattr(model, 'prefix_test') else \
    '/run/shm/test_ds2_crop/'
SEED = model.SEED if hasattr(model, 'SEED') else 11111

id_train, y_train = model.id_train, model.y_train
id_valid, y_valid = model.id_valid, model.y_valid
id_train_oversample = model.id_train_oversample  # trailing comma removed (it made a 1-tuple)
labels_train_oversample = model.labels_train_oversample

sample_coefs = model.sample_coefs if hasattr(model, 'sample_coefs') \
    else [0, 7, 3, 22, 25]

l_out, l_ins = model.build_model()

nn.layers.set_all_param_values(l_out, model_data)

I was also able to load the data into the shared variables using the following approach, but could you please explain what exact input the network expects?

chunk_size = 64
batch_size = 128

output = nn.layers.get_output(l_out, deterministic=True)
input_ndims = [len(nn.layers.get_output_shape(l_in))
               for l_in in l_ins]
xs_shared = [nn.utils.shared_empty(dim=ndim)
             for ndim in input_ndims]

import pandas as pd
OriginalLabels = pd.read_csv(r'../data/trainLabels.csv', sep=',')

import glob
temp1 = np.zeros((chunk_size, 3, 512, 512), dtype='float64')
fileList = sorted(glob.glob(r'/home/ali/Desktop/kaggle_diabetic_retinopathy-master/data/64/*.tiff'))
labels64 = []
for i, f in enumerate(fileList):
    temp1[i,:,:,:] = np.array(Image.open(f)).T / 255.0
    fname = f.split('/')[-1].split('.')[0]
    lbl = OriginalLabels.loc[OriginalLabels['image'] == fname]['level'].values.item(0)
#     print lbl
    labels64.append(lbl)
xs_shared[0].set_value(temp1)
print temp1.shape

temp2 = np.ones((chunk_size, 2), dtype='float64') * 512
xs_shared[1].set_value(temp2)
temp2.shape

idx = T.lscalar('idx')

givens = {}
for l_in, x_shared in zip(l_ins, xs_shared):
    givens[l_in.input_var] = x_shared[idx * batch_size:(idx + 1) * batch_size]

compute_output = theano.function(
    [idx],
    output,
    givens=givens,
    on_unused_input='ignore'
)
print 'Done'

%time predictions = compute_output(0)

However, I get an error about summing two pictures; I think it occurs where the two pictures are merged.

Thanks again.

JeffreyDF commented 9 years ago

Until I have more time, could you maybe try merging your changes with my original notebook? That is, so that it uses the CPU while everything else still works. You will need to change some things about how it "finds" the images to iterate over, etc. But that way it will be easier for me to help you quickly.

I'll try to take a closer look tomorrow.

alishakiba commented 9 years ago

Thanks again. I'll take care of that within a couple of hours.

alishakiba commented 9 years ago

Dear Jeffrey

I have written my own code, which I think should work on a single CPU. The code is at (https://github.com/alishakiba/kaggle_diabetic_retinopathy/blob/master/notebooks/PredictDRD.ipynb). However, there is a problem in block 16: it takes hours and hours of CPU time, and I have not seen it finish.

To be able to create the model, I had to modify some lines in basic_model.py, which are marked in (https://github.com/alishakiba/kaggle_diabetic_retinopathy/commit/e7750f23a1d8d052ce1c897e620e46aaf5b1a3f6).

I have also tested your code on a GeForce GT 730 GPU, but without success, because it has no cudnn support. The same notebook as in (https://github.com/alishakiba/kaggle_diabetic_retinopathy/commit/e7750f23a1d8d052ce1c897e620e46aaf5b1a3f6), with the same code on a GPU-enabled machine, fails in block 8 with a dimension mismatch (where I set the weights of the model!):

ValueError: mismatch: parameter has shape (32, 128, 128) but the value to set has shape (32, 127 ,127)

JeffreyDF commented 9 years ago

I'll just try to make a quick notebook to do it on the CPU, it'll be the easiest, I think. I'm working on it now and will try to finish it today. Otherwise, please remind me if I don't update it before the end of the week.

alishakiba commented 9 years ago

Thanks Jeffrey. By the way, I am learning about neural networks (I've read Nielsen's book and some other material around the web). Which GPU model do you suggest? (I currently have a GT 430 and a GT 730 graphics card, but neither of them supports cudnn.)

JeffreyDF commented 9 years ago

By the way, the error you are getting is most likely because of this change you made.

Support for 'same' convolutions in the normal conv layers was probably added later (and you are probably using an older version of Lasagne).
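
As a rough illustration of why the padding mode matters when loading a dump: with `untie_biases=True` the bias has one entry per output position, so its shape depends on the spatial output size of the layer, and a different padding mode gives a different size. The sketch below uses the usual output-length arithmetic for a convolution. `output_length` is a local helper written for this example, and the sizes are illustrative, not taken from the model.

```python
def output_length(input_length, filter_size, stride=1, pad='valid'):
    """Output length along one spatial axis of a convolution.

    pad is 'valid', 'same' (odd filter sizes), 'full', or an integer
    number of zeros added on each side.
    """
    if pad == 'valid':
        length = input_length - filter_size + 1
    elif pad == 'same':
        length = input_length
    elif pad == 'full':
        length = input_length + filter_size - 1
    else:
        length = input_length + 2 * pad - filter_size + 1
    # Striding keeps every stride-th position (ceiling division).
    return (length + stride - 1) // stride


num_filters = 32  # illustrative value
bias_shapes = {}
for pad in ('same', 'valid'):
    n = output_length(127, 3, stride=1, pad=pad)
    bias_shapes[pad] = (num_filters, n, n)
    print("%s -> untied bias shape %r" % (pad, bias_shapes[pad]))
```

If the network is rebuilt with a padding mode other than the one used for training, the untied bias shapes no longer match the dumped values, and setting them fails with a shape mismatch.
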

JeffreyDF commented 9 years ago

> Thanks Jeffrey. By the way, I am learning about neural networks (I've read Nielsen's book and some other material around the web). Which GPU model do you suggest? (I currently have a GT 430 and a GT 730 graphics card, but neither of them supports cudnn.)

It depends on your budget. The GTX 980 (normal or Ti) is very good for the money (about 350 pounds for the non-Ti version). In any case, try to get a Maxwell GPU since they are the most recent ones and are generally much faster than the older generation. But I can't confidently say much about the less expensive ones (I have heard about the 970 having some issues when you try to use a lot of video memory).

A very good alternative is to use the GPU instances on Amazon AWS. It is roughly 2-3 times slower than a 980 but still pretty good (and supports cudnn).

alishakiba commented 9 years ago

> By the way, the error you are getting is most likely because of this change you made.
>
> Support for 'same' convolutions in the normal conv layers was probably added later (and you are probably using an older version of Lasagne).

Thanks. I have updated Lasagne to the latest version and uncommented the line. The NotImplemented error is gone, but now there is another one :(((

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-598fada8a14d> in <module>()
     11 sample_coefs = model.sample_coefs if hasattr(model, 'sample_coefs')     else [0, 7, 3, 22, 25]
     12 
---> 13 l_out, l_ins = model.build_model()

/home/ali/Desktop/DRD/fifth/models/basic_model.py in build_model()
    143                          nonlinearity=LeakyRectify(leakiness),
    144                          W=nn.init.Orthogonal(1.0), b=nn.init.Constant(0.1),
--> 145                          untie_biases=True)
    146     layers.append(l_conv)
    147 

/opt/anaconda/lib/python2.7/site-packages/Lasagne-0.2.dev1-py2.7.egg/lasagne/layers/conv.pyc in __init__(self, incoming, num_filters, filter_size, stride, pad, untie_biases, W, b, nonlinearity, convolution, **kwargs)
    389                  nonlinearity=nonlinearities.rectify,
    390                  convolution=T.nnet.conv2d, **kwargs):
--> 391         super(Conv2DLayer, self).__init__(incoming, **kwargs)
    392         if nonlinearity is None:
    393             self.nonlinearity = nonlinearities.identity

TypeError: __init__() got an unexpected keyword argument 'border_mode'

I am using the latest version of Theano and Lasagne.

> It depends on your budget. The GTX 980 (normal or Ti) is very good for the money (about 350 pounds for the non-Ti version). In any case, try to get a Maxwell GPU since they are the most recent ones and are generally much faster than the older generation. But I can't confidently say much about the less expensive ones (I have heard about the 970 having some issues when you try to use a lot of video memory).
>
> A very good alternative is to use the GPU instances on Amazon AWS. It is roughly 2-3 times slower than a 980 but still pretty good (and supports cudnn).

Thank you. I will try to ask for a GTX 980 :))).

JeffreyDF commented 9 years ago

Yes, the border_mode argument is gone. You need to replace it with:

pad='same'
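
If the layer arguments are collected in a dictionary, the rename can be done in one place. This is only a sketch: `modernise_conv_kwargs` is a hypothetical helper written for this example, not a Lasagne function, and it assumes the value (here 'same') carries over unchanged, which is worth checking against the documentation of the installed Lasagne version. The `num_filters` value is illustrative.

```python
def modernise_conv_kwargs(kwargs):
    """Return a copy of kwargs with 'border_mode' renamed to 'pad'."""
    new_kwargs = dict(kwargs)
    if 'border_mode' in new_kwargs:
        if 'pad' in new_kwargs:
            raise ValueError("both 'border_mode' and 'pad' were given")
        new_kwargs['pad'] = new_kwargs.pop('border_mode')
    return new_kwargs


# Arguments as an older model file might pass them (illustrative values).
old_kwargs = {'num_filters': 32, 'border_mode': 'same', 'untie_biases': True}
new_kwargs = modernise_conv_kwargs(old_kwargs)

print(sorted(new_kwargs.items()))
# [('num_filters', 32), ('pad', 'same'), ('untie_biases', True)]
```

The original dictionary is left unmodified, so the same model file can still be inspected with its old arguments.
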

I'm trying this now, but either the compilation is taking ages or something went wrong. I'll check later.

JeffreyDF commented 9 years ago

It is stuck on compiling the compute_output function for me (when using the normal non-cudnn Conv2DLayers). I'll try to have another look this weekend. Please let me know if you got it to work in the meantime.

alishakiba commented 9 years ago

It's the same for me. I ran it overnight, from 11:30pm to 06:00am, and it had not finished :((. Would you mind explaining a little bit about the input to the network? Maybe I can then implement the network with a GPU package that does not use cudnn.

JeffreyDF commented 9 years ago

It might be some bug with Theano. Compiling the graph for the first time can take a while (like 10 minutes, maybe), but it shouldn't take much longer. I've tested it with the cudnn layers and they work fine.

Which input do you mean? The input images to the network as provided by the generators?

The generators can do a lot of transformations during training, see here. You can remove all that if you just want to test or get rid of all the extra code this brings with it. Just do the resizing and normalising to get somewhat decent results.
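
A minimal sketch of the normalising step, assuming the image has already been resized and loaded as a (height, width, 3) uint8 array, and assuming that scaling to [0, 1] is the wanted normalisation (the division by 255 is taken from the notebook earlier in this thread, not from the generators). The array below is a hypothetical stand-in for `np.array(Image.open(f))`. Note that `.T` on such an array also swaps rows and columns, unlike `transpose(2, 0, 1)`; which layout matches training should be checked against the generators.

```python
import numpy as np

# Hypothetical stand-in for a loaded image: height 4, width 6, RGB.
img = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)

# Scale pixel values from [0, 255] to [0, 1].
scaled = img.astype(np.float32) / 255.0

# Channel-first layout, keeping rows and columns in place.
chw = scaled.transpose(2, 0, 1)   # (channels, height, width)

# What `.T` does instead: it reverses all axes.
cwh = scaled.T                    # (channels, width, height)

print(chw.shape)  # (3, 4, 6)
print(cwh.shape)  # (3, 6, 4)
```

For square images the two results have the same shape, so a mix-up between them would not raise an error; the image would simply be fed in transposed.
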

If you plan on using some other package to try it, keep in mind that parameters from my dump might be saved in another "format" than the one another package uses.
