kahst / BirdCLEF2017

Source code of the TUCMI submission to BirdCLEF2017
MIT License

Error in training on own data #5

Open divsidhu-26 opened 6 years ago

divsidhu-26 commented 6 years ago

I got the following error when trying to train with my own data. I followed the instructions in README.md. Can you explain what is happening?

Traceback (most recent call last):
  File "birdCLEF_train.py", line 791, in <module>
    loss = train_net(image_batch, target_batch, lr)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
MemoryError:
Apply node that caused the error: Elemwise{sqr,no_inplace}(Elemwise{Sub}[(0, 0)].0)
Toposort index: 156
Inputs types: [TensorType(float64, 4D)]
Inputs shapes: [(128, 128, 64, 128)]
Inputs strides: [(8388608, 65536, 1024, 8)]
Inputs values: ['not shown']
Outputs clients: [[Sum{axis=[0, 2, 3], acc_dtype=float64}(Elemwise{sqr,no_inplace}.0)]]

kahst commented 6 years ago

This is a memory error, typically raised when the GPU runs out of memory. There are basically two things you can do: reduce the batch size or reduce the network complexity (fewer filters). You can also reduce the image resolution, but that requires more adjustments and might not be as applicable as the other two strategies.
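To get a feel for how much each of these options helps, here is a rough back-of-the-envelope sketch that computes the size of the single tensor named in the traceback. It assumes the shape (128, 128, 64, 128) is laid out as (batch, filters, height, width), which the traceback does not state explicitly, and the helper `tensor_bytes` is a hypothetical name used only for illustration, not part of the repository. It counts one intermediate tensor only, so the real memory use of the network will be considerably higher.

```python
# Bytes per element for the dtypes of interest; the traceback reports float64.
ITEMSIZE = {"float64": 8, "float32": 4}

def tensor_bytes(shape, dtype="float64"):
    """Return the number of bytes a dense tensor of this shape and dtype occupies."""
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * ITEMSIZE[dtype]

GIB = 1024 ** 3

# Shape taken from the traceback; the (batch, filters, height, width)
# interpretation is an assumption.
failing = (128, 128, 64, 128)

full = tensor_bytes(failing)                      # tensor as reported
half_batch = tensor_bytes((64, 128, 64, 128))     # batch size 128 -> 64
half_filters = tensor_bytes((128, 64, 64, 128))   # filters 128 -> 64
half_res = tensor_bytes((128, 128, 32, 64))       # height and width halved

for name, size in [("as reported", full),
                   ("half batch size", half_batch),
                   ("half filters", half_filters),
                   ("half resolution", half_res)]:
    print("%-16s %.2f GiB" % (name, size / float(GIB)))
```

Under these assumptions the reported tensor alone takes 1 GiB. Halving the batch size or the number of filters halves that, and halving both image dimensions cuts it to a quarter.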