lukalabs / cakechat

CakeChat: Emotional Generative Dialog System
Apache License 2.0

Tensorflow GPU issue #66

Closed 4R7I5T closed 5 years ago

4R7I5T commented 5 years ago

CLIENT:

root@c6bf55d8c25f:~/cakechat# python tools/test_api.py -f localhost -p 8080 -c "hi!" -c "hi, how are you?" -c "good!" -e "joy"

Output:

Using TensorFlow backend.
{'message': 'Can\'t process request: OOM when allocating tensor with shape[10,39,50000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc\n\t [[{{node decoder_model/softmax_with_temperature/Softmax}} = Softmax[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_model/softmax_with_temperature/sub)]]\nHint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.\n\n\t [[{{node decoder_model/softmax_with_temperature/Softmax/_203}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_715_decoder_model/softmax_with_temperature/Softmax", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]\nHint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.\n'}

SERVER output:

2019-08-16 12:03:08.905231: W tensorflow/core/common_runtime/bfc_allocator.cc:271] *************************************************************************************************xxx
2019-08-16 12:03:08.905317: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at softmax_op_gpu.cu.cc:158 : Resource exhausted: OOM when allocating tensor with shape[10,39,50000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[16.08.2019 12:03:08.906][ERROR][1][cakechat.api.v1.server][5] Can't process request: OOM when allocating tensor with shape[10,39,50000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[{{node decoder_model/softmax_with_temperature/Softmax}} = Softmax[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_model/softmax_with_temperature/sub)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[{{node decoder_model/softmax_with_temperature/Softmax/_203}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_715_decoder_model/softmax_with_temperature/Softmax", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

127.0.0.1 - - [16/Aug/2019 12:03:08] "POST /cakechat_api/v1/actions/get_response HTTP/1.1" 500 -
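For context, the tensor the allocator fails on is modest in isolation. Assuming float32 and the plausible (but not confirmed) mapping of the dimensions to (candidate responses × decoder steps × vocabulary size), the failed allocation is only about 74 MiB, which suggests the GPU's memory pool was already exhausted before this step:

```python
# Size of the tensor TensorFlow failed to allocate: shape [10, 39, 50000], float32.
# The mapping of dimensions below is an assumption based on cakechat's design:
#   10    - candidate responses sampled per request
#   39    - output sequence length (decoder steps)
#   50000 - vocabulary size for the softmax
candidates, seq_len, vocab = 10, 39, 50_000
bytes_per_float32 = 4

tensor_bytes = candidates * seq_len * vocab * bytes_per_float32
print(f"{tensor_bytes / 2**20:.1f} MiB")  # prints 74.4 MiB
```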
nicolas-ivanov commented 5 years ago

OOM stands for "Out of memory". What is the size of RAM on your GPU?

To decrease memory usage, you can lower the OUTPUT_SEQUENCE_LENGTH and SAMPLES_NUM_FOR_RERANKING parameters in the config file.
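A minimal sketch of what that config edit could look like. The parameter names come from the maintainer's comment; the file location (cakechat/config.py) and the concrete numbers below are illustrative assumptions, so check the defaults in your copy:

```python
# cakechat/config.py (sketch -- values below are assumptions, not the defaults)

# Maximum number of tokens the decoder generates per response.
# Lowering this shrinks the middle dimension of the [batch, seq_len, vocab]
# softmax tensor that triggered the OOM above.
OUTPUT_SEQUENCE_LENGTH = 16

# Number of candidate responses sampled and re-ranked per request.
# Lowering this shrinks the batch dimension of the decoder tensors.
SAMPLES_NUM_FOR_RERANKING = 5
```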

4R7I5T commented 5 years ago

This device has 40 GB of system RAM and a GPU with 24 GB of GDDR5.
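Since the GPU itself has ample memory, another common mitigation for the TF 1.x versions these logs indicate (not something suggested in this thread, so treat it as an assumption about the setup) is to have TensorFlow allocate GPU memory on demand instead of reserving the whole card at startup:

```python
# Sketch: TF 1.x session configuration so the process grows its GPU
# allocation as needed rather than grabbing all memory up front.
import tensorflow as tf
from keras import backend as K  # cakechat runs Keras on the TF backend

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate on demand
# Alternatively, cap the fraction of GPU memory TF may claim:
# config.gpu_options.per_process_gpu_memory_fraction = 0.8
K.set_session(tf.Session(config=config))
```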

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

nicolas-ivanov commented 5 years ago

@4R7I5T Did you try reducing the parameter values, as suggested above?


bl1mp commented 5 years ago

Make sure you also turn content caching off in your OS and keep memory usage well contained. If you are on macOS, you might want to look at using Metal to manage GPU memory. From Apple's MTLStorageMode documentation:

case managed
The CPU and GPU may maintain separate copies of the resource, and any changes must be explicitly synchronized. In macOS, this is the default storage mode for MTLTexture objects. In iOS and tvOS, the managed storage mode is not available. You explicitly decide when to synchronize changes between the CPU and GPU. If you use the CPU to change the contents of a resource, you must use one or more of the methods provided by the MTLBuffer or MTLTexture protocols to copy the changes to the GPU. If you use the GPU to change the contents of a resource, you must encode a blit pass to copy the changes to the CPU. See the MTLBlitCommandEncoder protocol.

case shared
The resource is stored in system memory and is accessible to both the CPU and the GPU.

case private
The resource can be accessed only by the GPU.

case memoryless
The resource's contents can be accessed only by the GPU and only exist temporarily during a render pass.
