koide3 / monocular_person_following

Monocular camera-based person tracking and identification ROS framework for person following robots

Question about webcam demo when using GPU #15

Closed (KoreanPro closed this issue 2 years ago)

KoreanPro commented 2 years ago

Hello. Thank you for sharing your hard work.

I tried your quick test with a USB camera (https://github.com/koide3/monocular_person_following/wiki/Quick-test-with-USB-cam) and have two questions.

I tested under the following conditions:

Ubuntu 20.04 / NVIDIA driver 460 / CUDA 11.2 / cuDNN 8.1.0 / GeForce RTX 3060

I ran the following commands.


When I run the last line:

sudo docker run -it --rm \
    --net host \
    koide3/monocular_person_following:noetic \
    roslaunch monocular_person_following jetson_person_following.launch camera_name:=/top_front_camera/qhd allow_growth:=true

Without "--gpus all", it worked well (the OpenPose output showed up and person tracking also worked).


However, when I run it with "--gpus all", I have two issues.

  1. The webcam image appears only after about 30 minutes. (Is this because of the RTX 3060? Would it be faster with a better graphics card?)
  2. The image showed up after 30 minutes, but it seemed that OpenPose was not being used (image below).


At this point, I suspect that my graphics card's low specification is the problem, but I was wondering whether there is something else I have missed.

Thank you again for sharing your hard work and hope to hear from you.

koide3 commented 2 years ago

I guess there is a problem with GPU handling in Docker. Can you check whether your GPU is visible inside the container with the following command?

docker run --gpus all -it --rm --net host koide3/monocular_person_following:noetic nvidia-smi

Wed Oct  6 04:31:37 2021       
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  GeForce RTX 208...  Off  | 00000000:01:00.0  On |                  N/A |
| 39%   52C    P0    67W / 250W |   2022MiB / 10985MiB |      4%      Default |
|                               |                      |                  N/A |

| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
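
When reading output like the above, note that the "CUDA Version" field in the nvidia-smi header reports the newest CUDA version the host driver supports, not the CUDA toolkit installed inside the image. The following is a minimal sketch for pulling that field out of the output so it can be compared with what the container actually ships; `smi_cuda_version` is a hypothetical helper name, not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): read nvidia-smi's table
# output on stdin and print the "CUDA Version" value from its header line.
smi_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Example use (needs Docker with GPU support on the host):
#   docker run --gpus all --rm --net host \
#       koide3/monocular_person_following:noetic nvidia-smi | smi_cuda_version
```

If the command prints nothing, the header line was not found, which would suggest the GPU is not visible inside the container.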

KoreanPro commented 2 years ago

Hi. Thank you for your reply. This is what I get when I run the command.


To give more information, this is what is printed when I run the command with "--gpus all":

devk@devk:~$ sudo docker run -it --rm --net host --gpus all koide3/monocular_person_following:noetic roslaunch monocular_person_following jetson_person_following.launch camera_name:=/top_front_camera/qhd allow_growth:=true
sourcing /opt/ros/noetic/setup.bash
ROS_ROOT /opt/ros/noetic/share/ros
ROS_DISTRO noetic
... logging to /root/.ros/log/bfd66652-2647-11ec-bf3a-2dc0d3b3fd90/roslaunch-devk-1.log
Checking log directory for disk usage. This may take a while.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://devk:33259/



NODES
  /
    compress_visualize (image_transport/republish)
    monocular_people_tracking (monocular_people_tracking/monocular_people_tracking_node)
    monocular_person_following (monocular_person_following/monocular_person_following_node)
    pose_estimator (tfpose_ros/broadcaster_ros.py)
    simple_gesture_recognition (monocular_person_following/simple_gesture_recognition.py)
    throttle_visualize (topic_tools/throttle)
    visualization_node (monocular_person_following/visualization.py)


process[pose_estimator-1]: started with pid [78]
process[monocular_people_tracking-2]: started with pid [79]
process[monocular_person_following-3]: started with pid [80]
process[visualization_node-4]: started with pid [81]
process[compress_visualize-5]: started with pid [82]
process[throttle_visualize-6]: started with pid [83]
process[simple_gesture_recognition-7]: started with pid [84]
[ INFO] [1633497416.454981917]: construct body classifier
[ INFO] [1633497416.455790285]: add cnn10 to channel bank
--- simple_gesture_recognition ---
wait for service done
WARNING:tensorflow: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/mobilenet/mobilenet.py:369: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.

WARNING:tensorflow: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/mobilenet/mobilenet.py:369: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.

WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/scripts/broadcaster_ros.py:92: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

[2021-10-06 05:16:58,071] [TfPoseEstimator] [INFO] loading graph from /root/catkin_ws/src/tf-pose-estimation/models/graph/mobilenet_thin/graph_opt.pb(default size=656x368)
WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/estimator.py:310: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/estimator.py:311: The name tf.GraphDef is deprecated. Please use tf.compat.v1.GraphDef instead.

WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/estimator.py:314: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/estimator.py:316: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2021-10-06 05:16:58.145114: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-10-06 05:16:58.168232: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2496000000 Hz
2021-10-06 05:16:58.168695: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x60823d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-10-06 05:16:58.168717: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-10-06 05:16:58.170299: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-10-06 05:16:58.276273: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-10-06 05:16:58.276970: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7af54f0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-10-06 05:16:58.277043: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 3060, Compute Capability 8.6
2021-10-06 05:16:58.277414: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-10-06 05:16:58.277843: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
name: GeForce RTX 3060 major: 8 minor: 6 memoryClockRate(GHz): 1.777
pciBusID: 0000:01:00.0
2021-10-06 05:16:58.278247: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-10-06 05:16:58.290107: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2021-10-06 05:16:58.295927: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2021-10-06 05:16:58.298071: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2021-10-06 05:16:58.312462: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2021-10-06 05:16:58.321122: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2021-10-06 05:16:58.349487: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-10-06 05:16:58.349702: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-10-06 05:16:58.350217: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-10-06 05:16:58.350601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2021-10-06 05:16:58.351026: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-10-06 05:16:58.352051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-06 05:16:58.352064: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186] 0
2021-10-06 05:16:58.352069: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0: N
2021-10-06 05:16:58.352156: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-10-06 05:16:58.352576: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-10-06 05:16:58.352970: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10361 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3060, pci bus id: 0000:01:00.0, compute capability: 8.6)
WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/estimator.py:327: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/estimator.py:328: The name tf.image.resize_area is deprecated. Please use tf.compat.v1.image.resize_area instead.

WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/tensblur/smoother.py:92: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/estimator.py:337: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/estimator.py:342: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/estimator.py:343: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /root/catkin_ws/src/tf-pose-estimation/tf_pose/estimator.py:345: The name tf.report_uninitialized_variables is deprecated. Please use tf.compat.v1.report_uninitialized_variables instead.

2021-10-06 05:20:57.452051: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-10-06 05:34:33.766450: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: ptxas exited with non-zero error code 65280, output: ptxas fatal : Value 'sm_86' is not defined for option 'gpu-name'

Relying on driver to perform ptx compilation. This message will be only logged once.
2021-10-06 05:34:34.229074: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
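
One plausible reading of this log: TensorFlow inside the image loads the CUDA 10.0 libraries (libcudart.so.10.0), whose ptxas does not know the sm_86 architecture of the RTX 3060 (compute capability 8.6), so kernels are compiled from PTX by the driver at startup. That would be consistent with the long gap between the 05:20:57 and 05:34:33 entries, although the log alone does not prove it. The sketch below extracts the two relevant facts from a saved log; the function names and the log file path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a saved TensorFlow startup log.

# Print the CUDA runtime version TensorFlow loaded ("libcudart.so.X.Y").
tf_cudart_version() {
    sed -n 's/.*libcudart\.so\.\([0-9][0-9.]*\).*/\1/p' "$1" | head -n 1
}

# Print the GPU architecture that ptxas rejected (for example sm_86), if any.
ptxas_rejected_arch() {
    sed -n "s/.*ptxas fatal *: Value '\(sm_[0-9]*\)' is not defined.*/\1/p" "$1" | head -n 1
}

# Summarize both findings for one log file.
diagnose_tf_log() {
    cudart=$(tf_cudart_version "$1")
    arch=$(ptxas_rejected_arch "$1")
    if [ -n "$arch" ]; then
        echo "ptxas from CUDA ${cudart:-unknown} rejected $arch; expect driver-side PTX compilation"
    else
        echo "no ptxas architecture error found"
    fi
}
```

For example, after saving the roslaunch output to a file such as `launch.log` (an assumed name), `diagnose_tf_log launch.log` prints a one-line summary.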

MuhammadShifa commented 9 months ago

Hello @KoreanPro, I am getting the same results as you, but I don't know what to do next. How can I check the output of the model after running the container? I am an ML engineer and new to ROS. If you could help me see the output, that would be great. Thanks in advance.