sacmehta / ESPNet

ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
https://sacmehta.github.io/ESPNet/
MIT License

How to test data on Jetson TX2? #54

Closed JingliangGao closed 5 years ago

JingliangGao commented 5 years ago

Hi, in your paper you mentioned that you successfully measured memory efficiency and sensitivity to GPU frequency on a Jetson TX2. I'm quite new to the TX2, and I wonder how you did that? BTW, for power consumption, did you externally connect a power meter to the TX2 or just use some software commands to measure it?

JingliangGao commented 5 years ago

@sacmehta

sacmehta commented 5 years ago

Hi,

To measure sensitivity, you fix the frequency and then observe the change in inference time. For example, if I am interested in sensitivity to GPU frequency, I choose a set of GPU frequencies between the minimum and maximum values supported by my hardware. Say we are interested in GPU_FREQ = [300, 500]. I then fix these frequencies one by one and measure the inference time at each of them. The ratio of the percentage change in execution time to the percentage change in GPU frequency is your sensitivity.
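For concreteness, a minimal sketch of that ratio (the numbers are placeholders, not measurements from the paper or this thread):

```python
def sensitivity(time_at_f1, time_at_f2, freq1, freq2):
    """Ratio of %-change in execution time to %-change in GPU frequency."""
    pct_time = abs(time_at_f1 - time_at_f2) / time_at_f1
    pct_freq = abs(freq2 - freq1) / freq1
    return pct_time / pct_freq

# Placeholder numbers: 0.125 s/frame at 300 MHz vs 0.080 s/frame at 500 MHz.
print(sensitivity(0.125, 0.080, 300, 500))  # ~0.54, i.e. about 54% sensitivity
```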

Every piece of hardware exposes various performance counters that provide information about its different components, including CPU usage, memory usage, and power consumption. You need to read those counters to extract the required information. For more TX2-related details, please post on NVIDIA's forum.

JingliangGao commented 5 years ago

Firstly, thanks for your prompt reply.

1. From what you said above, you can choose a set of GPU frequencies between the min and max values. As far as I know, the Jetson TX2 has 5 modes of operation corresponding to 3 GPU frequencies (870 MHz, 1147 MHz, 1331 MHz). Did you mean we can use commands to set the GPU frequency to any value between [870, 1331] MHz? Which command?

2. We all know that the Jetson TX2 does not support nvidia-smi for watching CPU/GPU utilization and power consumption directly; instead I use htop and sudo ~/tegrastats, but I still cannot observe the data we need. How did you solve this problem?

3. What bothers me most is how to measure power consumption (watts). I have tried applications like powertop, powerstat, and s-tui, but all of them only work on a device that is discharging. Which application or commands did you use?

I was wondering if you could provide some more details. @sacmehta

sacmehta commented 5 years ago

1. Use nvpmodel to set the GPU frequency (a sketch of the full frequency sweep follows below):

```
sudo nvpmodel -m [mode]
```

2. We measured power when the device is in discharge mode, i.e., when it is connected to a battery. Please ensure that the battery is fully charged before each experiment. PowerTOP and similar utilities are good for Intel-based processors, but I am not sure about ARM-based processors. We read the performance counters directly from the TX2. If you need more help with that, please check or post on NVIDIA's forums.
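A rough sketch of the frequency sweep referenced in point 1, assuming nvpmodel mode numbers that map to different GPU frequency caps on your board; the script name is only a placeholder for your own timing entry point:

```python
import subprocess
import time

# Hypothetical mode numbers; run `sudo nvpmodel -q` to see which power modes
# (and which GPU frequency caps) your JetPack version actually exposes.
MODES_TO_TEST = [0, 1]

for mode in MODES_TO_TEST:
    # Fix the power mode, and hence the GPU frequency cap, for this run.
    subprocess.run(["sudo", "nvpmodel", "-m", str(mode)], check=True)
    time.sleep(5)  # give the clocks a moment to settle

    start = time.time()
    # Placeholder inference/timing script; for a fair sensitivity number, time
    # only the forward passes inside it (this wall-clock figure also includes
    # model loading and data I/O).
    subprocess.run(["python3", "eval_forwardTime.py"], check=True)
    print("mode=%d  wall-clock=%.3f s" % (mode, time.time() - start))
```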
JingliangGao commented 5 years ago

Thanks again, I believe your advice is very helpful.

1. Actually, I was confused by the definition of sensitivity you gave, i.e., the ratio of the percentage change in execution time to the percentage change in GPU frequency. In your original paper, ESPNet's inference speed is roughly 8 FPS (0.125 s) and 9 FPS (0.111 s) at two different GPU frequencies, 1134 MHz and 1300 MHz. Computing the sensitivity from those numbers gives [(0.125 - 0.111)/0.125] / [|1134 - 1300|/1134] = 76.6%, which is well below the 95% reported in the paper.

2. It is difficult for us to use htop or the sudo ~/tegrastats command to watch total CPU and GPU utilization. Do you have any other ideas?

3. For memory efficiency, I am still curious how to measure Global Load, Shared Memory, etc. With some Linux commands, or in another way?

I really appreciate the advice you gave above and hope to hear from you soon. :)

sacmehta commented 5 years ago

1. The difference in the numbers is because you did not use the correct timing information, but the procedure is correct.

2. You need to create a bash script that first starts your program and then launches tegrastats (see the sketch after this list).

3. You need to use NVIDIA's profiler for the memory-efficiency statistics.
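A minimal Python sketch of that idea (the tegrastats invocation, script name, and log path are assumptions; run the launcher itself with sudo so the root-owned tegrastats process can be stopped at the end):

```python
import subprocess

# Start the workload first, then launch tegrastats so its periodic samples of
# CPU/GPU utilization, memory use, and power counters cover the inference run.
workload = subprocess.Popen(["python3", "eval_forwardTime.py"])

with open("tegrastats.log", "w") as log:
    stats = subprocess.Popen(["sudo", "tegrastats"], stdout=log)
    workload.wait()      # block until inference finishes
    stats.terminate()    # stop sampling once the workload is done
```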

JingliangGao commented 5 years ago

Hi, I want to measure the memory efficiency statistics shown below.

[screenshot: memory efficiency statistics table]

As you mentioned above, I used the command `nvprof ./run.sh --csv --metrics warp_execution_efficiency`. The command "python3 eval_forwardTime.py" is already inside run.sh.

Unexpectedly, nothing gets profiled and nvprof warns that "No CUDA application was profiled". The reason might be that run.sh or eval_forwardTime.py is not a CUDA file, but I wrote the script based on PyTorch and it does use CUDA. In other words, do I need to write a CUDA file to measure memory efficiency statistics? How do you measure Global Load, Global Store, Shared Memory, and warp_execution_efficiency? Which command?

I was just wondering if you could provide some more details. @sacmehta

sacmehta commented 5 years ago

You need to do "system-wide profiling" instead of application-specific profiling to generate the trace on the TX2. Once the trace is generated, you need to analyze it.
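A hedged sketch of one way to drive the profiler from Python; the nvprof flags and metric names below come from NVIDIA's general nvprof documentation rather than from this repository, so please verify them against your JetPack/CUDA version:

```python
import subprocess

# Metrics roughly matching the statistics discussed earlier: global load/store
# efficiency, shared-memory efficiency, and warp execution efficiency.
metrics = "gld_efficiency,gst_efficiency,shared_efficiency,warp_execution_efficiency"

subprocess.run([
    "sudo", "nvprof",
    "--profile-child-processes",             # follow the CUDA work in the child python process
    "--metrics", metrics,
    "--csv", "--log-file", "nvprof_%p.csv",  # %p is replaced by the process id
    "python3", "eval_forwardTime.py",
], check=True)

# nvprof's --profile-all-processes option is closer to the "system-wide
# profiling" suggested above: it attaches to every CUDA process started while
# it runs, and the resulting trace is analyzed afterwards.
```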

I understand that you might be feeling annoyed, but please understand that your questions are beyond the scope of this repository and I won't be able to answer them (for obvious reasons). Please use the NVIDIA forums for all hardware- and tracing-related questions. They have specialized teams that can help you resolve your issues.

If you have any questions related to our model, I would be happy to help you.

Good Luck!

sacmehta commented 5 years ago

I am closing this issue because it is beyond the scope of this repository.