sveitser / kaggle_diabetic

2nd place solution for the Kaggle Diabetic Retinopathy Detection Challenge
MIT License

Error when using CPU mode #3

Closed lincoln2010 closed 8 years ago

lincoln2010 commented 8 years ago

When using CPU mode, an error occurs in layers.py, line 81, in __init__, at del self.mode: AttributeError: mode

sveitser commented 8 years ago

I'd expect it to take way too much time to train these networks on a CPU. But I can check if I can get it to run if this is of value to you.

lincoln2010 commented 8 years ago

Thank you very much! I am now trying to fix it by commenting out that line, but I still get errors: lasagne/layers/conv.py, raise NotImplementedError: Strided convolution with border_mode 'same' is not supported by this layer yet. Maybe there are some differences between the GPU and CPU modes : ) Testing it might not take much time; if it runs successfully, the results should be similar.

alishakiba commented 8 years ago

@sveitser Thanks. In fact, I have reduced the train and test data to a stratified sample of 500 images each. This is my first experience with convolutional neural networks and I am trying to broaden my knowledge by studying your code. Thanks again.
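
For readers who want to build a similar reduced dataset, here is a minimal sketch of stratified sampling. It is not the procedure used above, which was not posted. The `labels` dict layout (image name mapped to grade) and the function name `stratified_sample` are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(labels, n_total, seed=0):
    """Pick about n_total image names while keeping class proportions.

    labels: dict mapping image name -> class label (hypothetical layout).
    """
    if not labels:
        return []

    rng = random.Random(seed)

    # Group image names by class; sorting keeps the result reproducible.
    by_class = defaultdict(list)
    for name, label in sorted(labels.items()):
        by_class[label].append(name)

    fraction = n_total / float(len(labels))
    sample = []
    for label in sorted(by_class):
        names = by_class[label]
        # Keep at least one image per class so rare grades are not lost.
        k = max(1, int(round(fraction * len(names))))
        sample.extend(rng.sample(names, min(k, len(names))))
    return sample
```

Because of rounding and the one-image-per-class floor, the result can differ slightly from `n_total` when the classes are very unbalanced.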

sveitser commented 8 years ago

The code in branch https://github.com/sveitser/kaggle_diabetic/tree/deterministic now runs on the CPU for me. Feel free to try it out. Note that the lasagne and nolearn version requirements are different than in the master branch (see https://github.com/sveitser/kaggle_diabetic/blob/deterministic/requirements.txt).

FYI I didn't get through a single epoch with 128 pixel images in an hour so I'm not sure how useful this is.

lincoln2010 commented 8 years ago

@sveitser Thank you very much! I'll try it, and if it works I will send a message. PS: Really amazed by your work!

alishakiba commented 8 years ago

@sveitser Thank you very much. However, I'm afraid I still cannot run this. I've uninstalled the theano, lasagne and nolearn packages and installed the new ones. However, while training, I encounter this error:

train_nn.py --cnf configs/c_128_5x5_32.py
using CPU
{'aug_params': {'allow_stretch': True,
                'do_flip': True,
                'rotation_range': (0, 360),
                'shear_range': (0, 0),
                'translation_range': (-40, 40),
                'zoom_range': (0.8695652173913044, 1.15)},
 'balance_ratio': 0.975,
 'balance_weights': array([  1.36094537,  14.3782235 ,   6.63756614,  40.23596793,  49.61299435]),
 'batch_size_test': 128,
 'batch_size_train': 128,
 'final_balance_weights': array([ 1.,  2.,  2.,  2.,  2.]),
 'h': 112,
 'name': 'c_128_5x5_32',
 'schedule': {0: 0.003, 150: 0.0003, 201: 'stop'},
 'sigma': 0.5,
 'test_dir': 'data/test_tiny',
 'train_dir': 'data/train_tiny',
 'w': 112,
 'weight_decay': 0.0005}
/home/ali/DiabeticRethinopathy/diabeticrethinopathy/dr2/src/nolearn-master/nolearn/lasagne/base.py:185: UserWarning: The 'train_test_split' method has been deprecated, please use the 'train_split' parameter instead.
  warn("The 'train_test_split' method has been deprecated, please "
/home/ali/DiabeticRethinopathy/diabeticrethinopathy/dr2/src/lasagne/lasagne/init.py:87: UserWarning: The uniform initializer no longer uses Glorot et al.'s approach to determine the bounds, but defaults to the range (-0.01, 0.01) instead. Please use the new GlorotUniform initializer to get the old behavior. GlorotUniform is now the default for all layers.
  warnings.warn("The uniform initializer no longer uses Glorot et al.'s "
/home/ali/DiabeticRethinopathy/diabeticrethinopathy/dr2/src/lasagne/lasagne/layers/helper.py:69: UserWarning: get_all_layers() has been changed to return layers in topological order. The former implementation is still available as get_all_layers_old(), but will be removed before the first release of Lasagne. To ignore this warning, use `warnings.filterwarnings('ignore', '.*topo.*')`.
  warnings.warn("get_all_layers() has been changed to return layers in "
Traceback (most recent call last):
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/train_nn.py", line 41, in <module>
    main()
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/dr2/local/lib/python2.7/site-packages/click/core.py", line 610, in __call__
    return self.main(*args, **kwargs)
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/dr2/local/lib/python2.7/site-packages/click/core.py", line 590, in main
    rv = self.invoke(ctx)
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/dr2/local/lib/python2.7/site-packages/click/core.py", line 782, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/dr2/local/lib/python2.7/site-packages/click/core.py", line 416, in invoke
    return callback(*args, **kwargs)
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/train_nn.py", line 32, in main
    net.load_params_from(weights_from)
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/dr2/src/nolearn-master/nolearn/lasagne/base.py", line 552, in load_params_from
    self.initialize()
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/nn.py", line 144, in initialize
    self.y_tensor_type,
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/nn.py", line 157, in _create_iter_funcs
    layers, target=y_batch, **objective_kw)
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/nn.py", line 63, in objective
    output_layer, deterministic=deterministic, **get_output_kw)
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/dr2/src/lasagne/lasagne/layers/helper.py", line 228, in get_output
    all_outputs[layer] = layer.get_output_for(layer_inputs, **kwargs)
  File "/home/ali/DiabeticRethinopathy/diabeticrethinopathy/dr2/src/lasagne/lasagne/layers/pool.py", line 235, in get_output_for
    mode=self.mode,
TypeError: max_pool_2d() got an unexpected keyword argument 'mode'

alishakiba commented 8 years ago

@sveitser It seems to be solved. I have no idea what happened, but it seems my Theano was not updated :( . I probably forgot to include the --upgrade switch in pip. It is working now. Sorry for bothering you and taking your time. Wishing you the best.
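
For anyone hitting the same TypeError, a quick way to confirm a stale install is to print the installed versions and check whether the pooling function accepts the `mode` keyword. The following is a minimal sketch, assuming Python 3.8+ for `importlib.metadata` (the environment in this thread was Python 2.7, where `pip freeze` gives the same version information). The helper names `installed_versions` and `accepts_keyword` are made up for illustration.

```python
import inspect
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def accepts_keyword(func, keyword):
    """Return True if func can be called with the given keyword argument."""
    by_name = (inspect.Parameter.POSITIONAL_OR_KEYWORD,
               inspect.Parameter.KEYWORD_ONLY)
    for param in inspect.signature(func).parameters.values():
        # A **kwargs parameter accepts any keyword.
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            return True
        if param.name == keyword and param.kind in by_name:
            return True
    return False


if __name__ == "__main__":
    print(installed_versions(["Theano", "Lasagne", "nolearn"]))
```

Passing the `max_pool_2d` function from your own Theano install to `accepts_keyword(..., "mode")` should return False on a version that is too old for this branch.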

lincoln2010 commented 8 years ago

@alishakiba Well, it seems that there are still some errors. Could you tell me how you got it working?

/DR/layers.py:101: UserWarning: max_pool_2d() will have the parameter ignore_border default value changed to True (currently False). To have consistent behavior with all Theano version, explicitly add the parameter ignore_border=True. (this is also faster than ignore_border=False)
  mode='average_inc_pad')
couldn't load weights starting from scratch
fitting ...
Traceback (most recent call last):
  File "train_nn.py", line 41, in <module>
    main()
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 610, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 590, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 782, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 416, in invoke
    return callback(*args, **kwargs)
  File "train_nn.py", line 38, in main
    net.fit(files, labels)
  File "/home/lincoln/downloads/kaggle_diabetic-deterministic/src/nolearn-master/nolearn/lasagne/base.py", line 416, in fit
    self.train_loop(X, y)
  File "/home/lincoln/desktop/DR/nn.py", line 219, in train_loop
    X, y, self.eval_size)
  File "/home/lincoln/desktop/DR/nn.py", line 126, in train_test_split
    X, y, test_size=eval_size)
  File "/home/lincoln/desktop/DR/data.py", line 292, in split
    train, test = split_indices(files, labels, test_size, random_state)
  File "/home/lincoln/desktop/DR/data.py", line 280, in split_indices
    labels = get_labels(names, per_patient=True)
  File "/home/lincoln/desktop/DR/data.py", line 222, in get_labels
    return np.vstack([labels[left], labels[~left]]).T
  File "/usr/local/lib/python2.7/dist-packages/numpy/core/shape_base.py", line 228, in vstack
    return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly

alishakiba commented 8 years ago

@lincoln2010 Sorry for the late response. I set the system up to run the model; the 5x5 one only printed warnings and training was going well, although the kappa metric was still 0 at round 100. I am out of town for a few days, so I have not checked the progress since; the same errors, or others, may have appeared. I'll be back in the lab the day after tomorrow and will check then.

hiimivantang commented 8 years ago

@alishakiba Hi Ali, I have the same issue as well:

Traceback (most recent call last):
  File "train_nn.py", line 41, in <module>
    main()
  File "/home/ivantang/VirtualEnvs/convnet/lib/python2.7/site-packages/click/core.py", line 610, in __call__
    return self.main(*args, **kwargs)
  File "/home/ivantang/VirtualEnvs/convnet/lib/python2.7/site-packages/click/core.py", line 590, in main
    rv = self.invoke(ctx)
  File "/home/ivantang/VirtualEnvs/convnet/lib/python2.7/site-packages/click/core.py", line 782, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ivantang/VirtualEnvs/convnet/lib/python2.7/site-packages/click/core.py", line 416, in invoke
    return callback(*args, **kwargs)
  File "train_nn.py", line 32, in main
    net.load_params_from(weights_from)
  File "/home/ivantang/VirtualEnvs/convnet/src/nolearn-master/nolearn/lasagne/base.py", line 552, in load_params_from
    self.initialize()
  File "/home/ivantang/kaggle_diabetic-deterministic/nn.py", line 144, in initialize
    self.y_tensor_type,
  File "/home/ivantang/kaggle_diabetic-deterministic/nn.py", line 157, in _create_iter_funcs
    layers, target=y_batch, **objective_kw)
  File "/home/ivantang/kaggle_diabetic-deterministic/nn.py", line 63, in objective
    output_layer, deterministic=deterministic, **get_output_kw)
  File "/home/ivantang/VirtualEnvs/convnet/src/lasagne/lasagne/layers/helper.py", line 228, in get_output
    all_outputs[layer] = layer.get_output_for(layer_inputs, **kwargs)
  File "/home/ivantang/VirtualEnvs/convnet/src/lasagne/lasagne/layers/pool.py", line 235, in get_output_for
    mode=self.mode,
TypeError: max_pool_2d() got an unexpected keyword argument 'mode'

Even though I'm using the same versions of the dependencies as stated in this branch's requirements.txt, I'm still having the issue.

Do you mind sharing which versions of Theano, Lasagne, nolearn and pylearn2 you are using?

Thanks!

lincoln2010 commented 8 years ago

@alishakiba Thank you very much! I have found the problem. Some images must have gone missing during the download. I wrote a script to clean the data by eliminating the images for which only the left or only the right eye is present. Sorry for bothering you!
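
The cleaning script itself was not posted. As a rough idea of what such a filter can look like, here is a minimal sketch. It assumes file names of the form `<patient id>_left.<ext>` and `<patient id>_right.<ext>`, which is an assumption about the data layout; the function name `split_complete_pairs` is hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming scheme: "<patient id>_<left|right>.<extension>"
NAME_PATTERN = re.compile(r"^(?P<patient>.+)_(?P<eye>left|right)\.[^.]+$")


def split_complete_pairs(filenames):
    """Return (kept, dropped); kept holds files whose patient has both eyes."""
    # First pass: record which eyes are present for each patient.
    eyes = defaultdict(set)
    for name in filenames:
        match = NAME_PATTERN.match(name)
        if match:
            eyes[match.group("patient")].add(match.group("eye"))

    # Second pass: keep a file only if its patient has a left and a right image.
    kept, dropped = [], []
    for name in filenames:
        match = NAME_PATTERN.match(name)
        if match and eyes[match.group("patient")] == {"left", "right"}:
            kept.append(name)
        else:
            dropped.append(name)
    return kept, dropped
```

Called on the output of `os.listdir` for the training directory, this yields the list of unpaired files to move or delete before training. Files that do not match the assumed pattern also end up in `dropped`, so it is worth inspecting that list before removing anything.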

sveitser commented 8 years ago

Great. Closing.