IBM / MAX-Image-Resolution-Enhancer

Upscale an image by a factor of 4, while generating photo-realistic details.
https://developer.ibm.com/exchanges/models/all/max-image-resolution-enhancer/
Apache License 2.0

Terminate called after throwing an instance of 'std::bad_alloc' #28

Closed: nghiaht closed this issue 4 years ago

nghiaht commented 4 years ago

Hi, I'm using Docker to build the repo and expose it on port 5000. I used samples/test_examples/low_resolution/astronaut.png as a test image and POSTed it to /model/predict.
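For reference, the request was along these lines (a sketch following the standard MAX model API; the output filename is just an example):

# POST the sample image to the prediction endpoint and save the upscaled result
curl -X POST \
  -F "image=@samples/test_examples/low_resolution/astronaut.png" \
  http://localhost:5000/model/predict \
  -o astronaut_4x.png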

The Docker container then stopped, showing the following logs:

2020-04-22 10:10:28.347782: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
2020-04-22 10:10:30.802358: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
2020-04-22 10:10:31.210874: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
2020-04-22 10:10:31.210834: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

My memory info:

free -m
              total        used        free      shared  buff/cache   available
Mem:           2000         301        1228           0         470        1533
Swap:          1952         165        1787

I searched Google for suggestions: setting TensorFlow's logging level, which I tried with no better results (as shown below), and adjusting the batch size (but I don't know whether the batch size can be adjusted in max-image-resolution-enhancer).
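For what it's worth, this is roughly how I set the logging level (a sketch; the codait/max-image-resolution-enhancer image name is assumed from Docker Hub). It only hides the allocator warnings and does not reduce memory use:

# TF_CPP_MIN_LOG_LEVEL=2 suppresses INFO and WARNING output from TensorFlow's C++ core;
# it does not change how much memory the model allocates
docker run -it -p 5000:5000 -e TF_CPP_MIN_LOG_LEVEL=2 codait/max-image-resolution-enhancer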

Please give me suggestions. Thanks for your help!

xuhdev commented 4 years ago

Looks like you have used up all your memory: Allocation of 450785280 exceeds 10% of system memory. Perhaps @feihugis knows how to reduce memory usage in TensorFlow?

feihugis commented 4 years ago

@nghiaht As said here, 2GB of memory will not be big enough for this model. Could you try increasing the Docker memory limit, as described here?
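A minimal sketch of raising the limit when starting the container (the codait/max-image-resolution-enhancer image name is assumed; on Docker Desktop, the VM-wide limit is raised under Preferences > Resources instead):

# Allow the container up to 4 GB of RAM (setting --memory-swap to the same value disables swap)
docker run -it -p 5000:5000 -m 4g --memory-swap 4g codait/max-image-resolution-enhancer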

nghiaht commented 4 years ago

> @nghiaht As said here, 2GB of memory will not be big enough for this model. Could you try increasing the Docker memory limit, as described here?

Thanks for your reply. I knew that "by default, a container has no resource constraints and can use as much of a given resource as the host's kernel scheduler allows." I did increase the memory limit from 2GB to 4GB, and the endpoint could be used with fewer crashes than at 2GB.

I also reverted to 2GB and tried adjusting TensorFlow's settings:

max-image-resolution-enhancer/core/srgan_controller.py - Line 63

# TF 1.x session configuration
config = tf.ConfigProto()
# Let the GPU allocator grow as needed instead of grabbing all memory up front
config.gpu_options.allow_growth = True
# Limit TensorFlow's thread pools to reduce peak memory, at the cost of speed
config.inter_op_parallelism_threads = 1
config.intra_op_parallelism_threads = 1

The defaults were 4 and 5. I just googled around and adjusted them; luckily, the endpoint now works and can resize images without crashing, though slowly.

To summarize:

- With the default 2GB limit, the model crashes with std::bad_alloc on the sample image.
- Increasing the Docker memory limit from 2GB to 4GB makes the endpoint usable, with fewer crashes.
- Staying at 2GB, setting inter_op_parallelism_threads and intra_op_parallelism_threads to 1 avoids the crash, but predictions are slow.

feihugis commented 4 years ago

@nghiaht Thanks for letting us know your solution. Glad the issue is resolved.