preddy5 / segnet

A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
http://preddy5.github.io/2016/03/08/segnet-post.html

Error allocating memory #9

Open aneesh3108 opened 7 years ago

aneesh3108 commented 7 years ago

Hello,

When I run this code, it gives me the following error:

```
MemoryError: Error allocating 442368000 bytes of device memory (out of memory).
Apply node that caused the error: GpuElemwise{sub,no_inplace}(GpuElemwise{add,no_inplace}.0, GpuElemwise{Composite{(((i0 / i1) / i2) / i3)},no_inplace}.0)
Toposort index: 366
Inputs types: [CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, (True, True, True, False))]
Inputs shapes: [(10, 64, 360, 480), (1, 1, 1, 480)]
Inputs strides: [(11059200, 172800, 480, 1), (0, 0, 0, 1)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[GpuElemwise{sqr,no_inplace}(GpuElemwise{sub,no_inplace}.0), GpuElemwise{mul,no_inplace}(CudaNdarrayConstant{[[[[ 2.]]]]}, GpuElemwise{Composite{(((i0 / i1) / i2) / i3)},no_inplace}.0, GpuElemwise{sub,no_inplace}.0), GpuElemwise{mul,no_inplace}(GpuElemwise{Composite{(((i0 / i1) / i2) / i3)},no_inplace}.0, GpuElemwise{sub,no_inplace}.0)]]
```
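For what it's worth, the failing allocation is exactly the size of one float32 activation tensor with the shape reported in the traceback, so a handful of such intermediates per batch adds up fast. A minimal sketch of the arithmetic (values taken from the traceback, nothing else assumed):

```python
# Back-of-the-envelope check: one float32 tensor of shape (10, 64, 360, 480)
batch, channels, height, width = 10, 64, 360, 480
bytes_per_float32 = 4
tensor_bytes = batch * channels * height * width * bytes_per_float32
print(tensor_bytes)  # 442368000 bytes, i.e. ~422 MiB for a single intermediate tensor
```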

Tried it on a 660 Ti, then moved up to a 980 and a 1080 - the error persists on all of them.

Any solutions???

Also, does this warning have anything to do with it?

```
UserWarning: Model inputs must come from a Keras Input layer, they cannot be the output of a previous non-Input layer. Here, a tensor specified as input to "sequential_11_model" was not an Input tensor, it was generated by layer layer_10. Note that input tensors are instantiated via tensor = Input(shape). The tensor that caused the issue was: layer_input_10
  str(x.name))
```
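For context, the warning is about how the model's input tensor is created rather than about memory. A minimal sketch of the pattern Keras expects, assuming the Keras 1.x functional API of that era (layer names and shapes here are placeholders, not the repo's code):

```python
# Hypothetical illustration of what the warning refers to: model inputs should
# be created with Input(shape=...), not taken from the output of another layer.
from keras.layers import Input, Convolution2D
from keras.models import Model

inp = Input(shape=(3, 360, 480))                      # proper Input tensor (Theano dim ordering assumed)
x = Convolution2D(64, 3, 3, border_mode='same')(inp)  # downstream layer consumes it
model = Model(input=inp, output=x)                    # Keras 1.x input/output kwargs
```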

preddy5 commented 7 years ago

@aneesh3108 Decrease the batch size, and let me know if it works.
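A minimal sketch of where the batch size is set, with a hypothetical stand-in model and random data rather than the repo's own training script (Keras 1.x, Theano dim ordering assumed):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Convolution2D

# Placeholder model and data, only to show where batch_size is passed.
model = Sequential([Convolution2D(64, 3, 3, border_mode='same',
                                  input_shape=(3, 360, 480))])
model.compile(optimizer='adadelta', loss='mse')

x = np.random.rand(20, 3, 360, 480).astype('float32')
y = np.random.rand(20, 64, 360, 480).astype('float32')

# Lowering batch_size shrinks every per-batch activation proportionally,
# e.g. the (10, 64, 360, 480) tensor from the traceback becomes (4, 64, 360, 480).
model.fit(x, y, batch_size=4, nb_epoch=1)
```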

aneesh3108 commented 7 years ago

@pradyu1993 Decreased the batch size to 10 - and it barely runs fast enough (quite surprising, though!).

Now I'm facing a new problem: the loss shows up as 'nan'.

There are a couple of suggestions out there about why this happens - but if you have any context on the cause, do let me know.

(Update: Currently solved by commenting out all batch normalization layers)
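For reference, a minimal sketch of that workaround, assuming a Keras 1.x Sequential encoder block along the lines of the repo's (the layer arguments are placeholders):

```python
from keras.models import Sequential
from keras.layers import Convolution2D, Activation
# from keras.layers.normalization import BatchNormalization

model = Sequential()
model.add(Convolution2D(64, 3, 3, border_mode='same', input_shape=(3, 360, 480)))
# model.add(BatchNormalization())   # commented out: this is the layer removed to avoid the NaN loss
model.add(Activation('relu'))
```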

New question: my loss starts at 14.something (with the exact same code and config as yours). Did you do any tweaking? [screenshot attached]
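As a rough sanity check on that starting value (my own hedged arithmetic, assuming per-pixel categorical cross-entropy and, for example, the 12-class CamVid labelling commonly used with SegNet): a freshly initialised softmax that spreads probability uniformly gives roughly ln(num_classes) per pixel, so a much larger starting loss may be worth checking against the weight initialisation and input scaling.

```python
# Expected initial per-pixel cross-entropy for a uniform softmax over C classes
import math
num_classes = 12          # assumption: 12-class CamVid-style setup
print(math.log(num_classes))  # ~2.48
```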

Thanks,

Aneesh