awentzonline / image-analogies

Generate image analogies using neural matching and blending.
MIT License
3.52k stars 282 forks

tensorflow termination; what(): std::bad_alloc #28

Open dadatawajue opened 8 years ago

dadatawajue commented 8 years ago

Hi, I don't have an NVIDIA GPU, so I've been trying the TensorFlow backend with --mrf-w=0 to speed things up (Theano works but is really slow). However, I get the error below. I tested many different images, all of which worked with the Theano backend. Any ideas how to fix it?

xxx:~/Code/python/neural-image-analogies$ make_image_analogy.py images/1.jpg images/1.jpg images/2.jpg out/arch --mrf-w=0
Using TensorFlow backend.
Using PatchMatch model
Scale factor 0.25 "A" shape (1, 3, 603, 653) "B" shape (1, 3, 300, 225)
Building loss...
Precomputing static features...
Building and combining losses...
Start of iteration 0 x 0
Current loss value: 62929842176.0
Image saved as out/arch_at_iteration_0_0.png
Iteration completed in 1359.27 seconds
Start of iteration 0 x 1
Current loss value: 59368124416.0
Image saved as out/arch_at_iteration_0_1.png
Iteration completed in 1354.37 seconds
Start of iteration 0 x 2
Current loss value: 58041049088.0
Image saved as out/arch_at_iteration_0_2.png
Iteration completed in 1315.46 seconds
Start of iteration 0 x 3
Current loss value: 57320632320.0
Image saved as out/arch_at_iteration_0_3.png
Iteration completed in 1324.93 seconds
Start of iteration 0 x 4
Current loss value: 56854339584.0
Image saved as out/arch_at_iteration_0_4.png
Iteration completed in 990.21 seconds
/home/xxx/Code/python/neural-image-analogies/venv/local/lib/python2.7/site-packages/scipy/ndimage/interpolation.py:573: UserWarning: From scipy 0.13.0, the output shape of zoom() is calculated with round() instead of int() - for these inputs the size of the returned array has changed.
  "the returned array has changed.", UserWarning)
Scale factor 0.625 "A" shape (1, 3, 1508, 1633) "B" shape (1, 3, 751, 563)
Building loss...
Precomputing static features...
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

sdierauf commented 8 years ago

That looks like it's running out of memory when trying to initialize the larger image. Try scaling down your source images by 50% and see if it works then. How much memory does your system have?
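To get a rough feel for why the second scale step is so much heavier, here is a minimal sketch that estimates tensor sizes from the "A" shapes in the log above. It assumes 32-bit floats and, purely for illustration, a 64-channel feature map at input resolution (as in the first layer of VGG-style networks); the helper names `tensor_mib`, `feature_map_shape`, and `halve` are hypothetical and not part of image-analogies.

```python
from math import prod

BYTES_PER_FLOAT32 = 4  # assumption: tensors are stored as 32-bit floats


def tensor_mib(shape, bytes_per_value=BYTES_PER_FLOAT32):
    """Return the size in MiB of a dense tensor with the given shape."""
    return prod(shape) * bytes_per_value / 2**20


def feature_map_shape(input_shape, channels=64):
    """Hypothetical feature map: same height/width, `channels` channels."""
    n, _, h, w = input_shape
    return (n, channels, h, w)


def halve(shape):
    """Shape after scaling height and width down by 50%."""
    n, c, h, w = shape
    return (n, c, h // 2, w // 2)


# "A" shapes reported in the log at scale factors 0.25 and 0.625.
small_a = (1, 3, 603, 653)
large_a = (1, 3, 1508, 1633)

for label, shape in [("scale 0.25", small_a),
                     ("scale 0.625", large_a),
                     ("scale 0.625, source halved", halve(large_a))]:
    print(f"{label}: input {tensor_mib(shape):.1f} MiB, "
          f"one 64-channel map {tensor_mib(feature_map_shape(shape)):.1f} MiB")
```

Halving both dimensions cuts the pixel count, and so every per-pixel buffer, to roughly a quarter. The real peak usage depends on how many such buffers the backend keeps alive at once, which this sketch does not model.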

chenyuqing commented 7 years ago

I have the same problem, and my RAM is 5.55 GB.

timchan@ubuntu:~/workspaces/dl/tf$ python3 full_code.py
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

fjcamillo commented 7 years ago

Having the same problem with (100, 512, 512, 3). I will try scaling the images down, but are there any other workarounds?
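One generic workaround, sketched here under assumptions, is to feed the data in smaller batches instead of all 100 images at once. The snippet assumes float32 values and only illustrates the arithmetic and the index slicing; `batch_bytes` and `iter_batches` are hypothetical helpers, and whether batching is enough depends on where the allocation actually fails.

```python
def batch_bytes(shape, bytes_per_value=4):
    """Bytes needed for one dense tensor (assumes 4-byte float32 values)."""
    total = bytes_per_value
    for dim in shape:
        total *= dim
    return total


def iter_batches(n_items, batch_size):
    """Yield (start, stop) index pairs covering n_items in chunks."""
    for start in range(0, n_items, batch_size):
        yield start, min(start + batch_size, n_items)


full_shape = (100, 512, 512, 3)
print("full batch:", batch_bytes(full_shape) / 2**20, "MiB")

# Process the same 100 images in chunks of 10 instead.
for start, stop in iter_batches(full_shape[0], 10):
    chunk_shape = (stop - start,) + full_shape[1:]
    print(f"images[{start}:{stop}] ->",
          batch_bytes(chunk_shape) / 2**20, "MiB")
```

This counts the input tensor only; intermediate activations scale with the batch size in the same way, so a smaller batch reduces those as well.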

AyushKaul commented 7 years ago

Try fewer hidden layers if you are using a fully connected NN.
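As a rough illustration of why fully connected layers get expensive on image-sized inputs, the sketch below counts weights and biases for a stack of dense layers. The layer widths are made up for the example, the input size is taken from the (100, 512, 512, 3) shape mentioned above, and `dense_param_count` is a hypothetical helper, not a TensorFlow function.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for consecutive fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


flat_input = 512 * 512 * 3  # one flattened 512x512 RGB image

# Hypothetical architectures: input -> hidden layer(s) -> 10 outputs.
candidates = {
    "two hidden layers of 1024": [flat_input, 1024, 1024, 10],
    "one hidden layer of 1024": [flat_input, 1024, 10],
    "one hidden layer of 128": [flat_input, 128, 10],
}

for name, sizes in candidates.items():
    params = dense_param_count(sizes)
    # Assumes 4 bytes per parameter (float32), weights only, no optimizer state.
    print(f"{name}: {params:,} parameters, {params * 4 / 2**30:.2f} GiB")
```

In this made-up setup the first layer dominates the total, so narrowing the first hidden layer (or shrinking the input) saves far more memory than dropping a later layer.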

tamizharasank commented 6 years ago

same issue

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

a7month commented 5 years ago

Same issue when using tensorflow-jni in a Java application.

The TensorFlow version is 1.2. The Java crash log shows that the last stack frame is in Java_org_tensorflow_TensorFlow_registeredOpList:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f84bdcac7c9, pid=4315, tid=140204879898368
#
# JRE version: Java(TM) SE Runtime Environment (8.0_74-b02) (build 1.8.0_74-b02)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.74-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libtensorflow_jni4459560773440445764.so+0x1d017c9]

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000000

Registers:
RAX=0x0000000000000000, RBX=0x00007f8277200000, RCX=0x00007f8277200001, RDX=0x000000000000008f
RSP=0x00007f83fe0f8ca0, RBP=0x00007f83fe0f8ca0, RSI=0x00007f8277200000, RDI=0x0000000000000000
R8 =0x00000000000003b5, R9 =0x00007f83f9800000, R10=0x00007f83d3a52160, R11=0x00007f83f980b7c8
R12=0x000000000281ee60, R13=0x00007f83fe0f9090, R14=0x00007f83fe0f8f50, R15=0x00007f852c93d640
RIP=0x00007f84bdcac7c9, EFLAGS=0x0000000000010206, CSGSFS=0x000000000000e033, ERR=0x0000000000000006
  TRAPNO=0x000000000000000e

Instructions: (pc=0x00007f84bdcac7c9)
0x00007f84bdcac7a9:   89 45 f0 eb 1d 48 8b 45 f8 48 8d 50 01 48 89 55
0x00007f84bdcac7b9:   f8 48 8b 55 f0 48 8d 4a 01 48 89 4d f0 0f b6 12
0x00007f84bdcac7c9:   88 10 48 8b 45 d8 48 8d 50 ff 48 89 55 d8 48 85
0x00007f84bdcac7d9:   c0 75 d2 48 8b 45 e8 5d c3 66 2e 0f 1f 84 00 00

Register to memory mapping:

RAX=0x0000000000000000 is an unknown value
RBX=0x00007f8277200000 is an unknown value
RCX=0x00007f8277200001 is an unknown value
RDX=0x000000000000008f is an unknown value
RSP=0x00007f83fe0f8ca0 is pointing into the stack for thread: 0x00007f84ca75a000
RBP=0x00007f83fe0f8ca0 is pointing into the stack for thread: 0x00007f84ca75a000
RSI=0x00007f8277200000 is an unknown value
RDI=0x0000000000000000 is an unknown value
R8 =0x00000000000003b5 is an unknown value
R9 =0x00007f83f9800000 is an unknown value
R10=0x00007f83d3a52160 is an unknown value
R11=0x00007f83f980b7c8 is an unknown value
R12=0x000000000281ee60 is an unknown value
R13=0x00007f83fe0f9090 is pointing into the stack for thread: 0x00007f84ca75a000
R14=0x00007f83fe0f8f50 is pointing into the stack for thread: 0x00007f84ca75a000
R15=0x00007f852c93d640: pthread_key_create+0 in /lib64/libpthread.so.0 at 0x00007f852c931000

Stack: [0x00007f83fdffe000,0x00007f83fe0ff000],  sp=0x00007f83fe0f8ca0,  free space=1003k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libtensorflow_jni4459560773440445764.so+0x1d017c9]
C  [libtensorflow_jni4459560773440445764.so+0x1a0c46e]
C  [libtensorflow_jni4459560773440445764.so+0x18bc6da]
C  [libtensorflow_jni4459560773440445764.so+0x18910bb]
C  [libtensorflow_jni4459560773440445764.so+0x18b9ae9]
C  [libtensorflow_jni4459560773440445764.so+0x18842ef]
C  [libtensorflow_jni4459560773440445764.so+0x1885143]
C  [libtensorflow_jni4459560773440445764.so+0x20bd47]
C  [libtensorflow_jni4459560773440445764.so+0x20bf52]
C  [libtensorflow_jni4459560773440445764.so+0x201783]  Java_org_tensorflow_Session_run+0x3f3
j  org.tensorflow.Session.run(J[B[J[J[I[J[I[JZ[J)[B+0
j  org.tensorflow.Session.access$100(J[B[J[J[I[J[I[JZ[J)[B+17
j  org.tensorflow.Session$Runner.runHelper(Z)Lorg/tensorflow/Session$Run;+336
j  org.tensorflow.Session$Runner.run()Ljava/util/List;+2
objdump -C -d --start-address=0x1a0c46e libtensorflow_jni4459560773440445764.so | egrep '>:$' -m 1
0000000001a0c46e <Java_org_tensorflow_TensorFlow_registeredOpList+0x18094ee>: