peterlee0127 / tensorflow-nvJetson

TensorFlow for NVIDIA Jetson; also includes the patch and script for building.
https://tfjetson.peterlee.app
205 stars, 61 forks

The GPU freeMemory shown here is too low #25

Closed haiyang-tju closed 6 years ago

haiyang-tju commented 6 years ago

I successfully compiled TF-1.9.0 by following your steps, and the verification passed. Thanks for your code.

However, when I run the TensorRT-optimized graph, very little freeMemory is left and I often get errors at runtime. The same low value also appears in your test_tftrt.py output log:

2018-04-02 11:25:20.027849: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.67GiB freeMemory: 1.83GiB

Does TensorRT reserve the full maximum workspace size up front, so that this memory is no longer available to other programs?

I am running only the VGG-16 model on the TX2 with TensorRT, with max_workspace_size_bytes=4096 << 20. While the model is running, it fails with "Cuda Error in execute: 9". Here is my output log:

Instructions for updating: Use the retry module or similar alternatives.
2018-07-20 08:45:50.186786: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:865] ARM64 does not support NUMA - returning NUMA node zero
2018-07-20 08:45:50.186957: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.67GiB freeMemory: 2.55GiB
2018-07-20 08:45:50.187010: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-07-20 08:45:50.187092: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-20 08:45:50.187131: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2018-07-20 08:45:50.187159: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2018-07-20 08:45:50.187293: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2316 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2018-07-20 08:47:14.036891: E tensorflow/contrib/tensorrt/log/trt_logger.cc:38] DefaultLogger cudnnConvolutionLayer.cpp (254) - Cuda Error in execute: 9

Do you know what is going on here? Thanks a lot.
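
For readers comparing the numbers in this post: the sketch below only does the arithmetic on the values quoted above (max_workspace_size_bytes=4096 << 20 and the freeMemory field of the log). It is not a confirmed diagnosis of the error; whether the workspace size is the actual cause is an assumption. The helper name parse_gib is hypothetical and exists only for this illustration.

```python
import re

GIB = 1 << 30  # bytes in one GiB

# Value quoted in the post.
max_workspace_size_bytes = 4096 << 20

# Excerpt of the gpu_device.cc log line quoted in the post.
log_line = "totalMemory: 7.67GiB freeMemory: 2.55GiB"


def parse_gib(field, text):
    """Return the value of a 'field: <number>GiB' entry in text, in bytes.

    Hypothetical helper for this illustration only.
    """
    match = re.search(r"%s:\s*([0-9.]+)GiB" % re.escape(field), text)
    if match is None:
        raise ValueError("field %r not found" % field)
    return int(float(match.group(1)) * GIB)


free_bytes = parse_gib("freeMemory", log_line)
workspace_gib = max_workspace_size_bytes / GIB

print("requested workspace: %.2f GiB" % workspace_gib)
print("reported free memory: %.2f GiB" % (free_bytes / GIB))
print("workspace exceeds free memory:", max_workspace_size_bytes > free_bytes)
```

With these inputs the requested workspace (4 GiB) is larger than the free memory reported at startup (2.55 GiB), so trying a smaller workspace value may be worth a test.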

peterlee0127 commented 6 years ago

Have you tried enabling GPU memory growth (allow_growth in the session's gpu_options)? Because the NVIDIA Jetson shares memory between the CPU and GPU, it can help to reboot and disable the desktop environment. I have a similar problem too. Try rebooting the system to free memory held by other processes.
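
Since the Jetson shares one memory pool between CPU and GPU, a quick way to see how much is left before starting TensorFlow is to read /proc/meminfo. The following is a minimal sketch using only the Python standard library. Treating MemAvailable as a rough proxy for what the GPU can obtain is an assumption, and the function names are hypothetical.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value}.

    Values are the leading integer on each line (kB for most fields).
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a 'name: value' shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_mib(info):
    """Return MemAvailable in MiB, falling back to MemFree if it is absent."""
    kib = info.get("MemAvailable", info.get("MemFree", 0))
    return kib / 1024.0


# Only runs on Linux systems that expose /proc/meminfo.
if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print("available: %.0f MiB" % available_mib(parse_meminfo(f.read())))
```

Running this before and after stopping the desktop environment gives a rough idea of how much memory that step frees.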

haiyang-tju commented 6 years ago

Yes, I have set it to True, and rebooting does not help. While the script is running, it fills up the memory and the system becomes unresponsive. I then rewrote it with the TensorRT C++ API, and that works well. I don’t understand why.

alejandroandreu commented 6 years ago

Hi,

I used to build TF with TensorRT support enabled, and performance was awful. Removing it made my program run much faster, roughly 3x.

Not sure if you can afford to do this, but it might be worth a shot.

peterlee0127 commented 6 years ago

Cool. It seems we also need a build without TensorRT. In fact, TensorRT support in TensorFlow on the Jetson is limited; only a few features work.

asinha94 commented 6 years ago

@haiyang-tju Hi, I've been trying to write a C++ program to perform the TensorRT optimization as well, because the Python script seemed to increase memory usage on the Jetson. Could you share that piece of code with me? I'm not too familiar with the C++ API. Thanks!

haiyang-tju commented 6 years ago

@asinha94 You can try the sample code of TensorRT-4.0.0.3 here. You may need to log in with an NVIDIA account. The sample code is in this path:

***\TensorRT-4.0.0.3\targets\x86_64-linux-gnu\samples

If you cannot find it, leave your email address and I will send it to you.