Is 20+ seconds of inference for one 720P image a normal speed?

TimeOnCloud commented 1 year ago

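As the title says, I am using a 3080 card. When running test.py, fusing a 720P image takes more than 20 seconds per image, and a 480P image takes about 8 seconds. Is this a normal speed?

Apart from the Python version, which is 3.6.9, the rest of the environment is identical.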

Here is the output when fusing 640×480 images:

(wl) tc@tc-Super-Server:/home/wl/DIVFusion-main$ CUDA_VISIBLE_DEVICES=0 python test.py
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From test.py:191: The name tf.train.NewCheckpointReader is deprecated. Please use tf.compat.v1.train.NewCheckpointReader instead.

WARNING:tensorflow:From test.py:533: The name tf.InteractiveSession is deprecated. Please use tf.compat.v1.InteractiveSession instead.

2023-02-17 17:43:42.885650: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2023-02-17 17:43:42.889101: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2023-02-17 17:43:42.972511: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-17 17:43:42.972871: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x556513477f40 executing computations on platform CUDA. Devices:
2023-02-17 17:43:42.972883: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): NVIDIA GeForce RTX 3080, Compute Capability 8.6
2023-02-17 17:43:42.974427: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3099995000 Hz
2023-02-17 17:43:42.974723: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x556510d72490 executing computations on platform Host. Devices:
2023-02-17 17:43:42.974734: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
2023-02-17 17:43:42.974824: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-17 17:43:42.975163: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: NVIDIA GeForce RTX 3080 major: 8 minor: 6 memoryClockRate(GHz): 1.71
pciBusID: 0000:01:00.0
2023-02-17 17:43:42.975231: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/cv2/../../lib:
2023-02-17 17:43:42.975271: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/cv2/../../lib:
2023-02-17 17:43:42.975305: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/cv2/../../lib:
2023-02-17 17:43:42.975343: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/cv2/../../lib:
2023-02-17 17:43:42.975375: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/cv2/../../lib:
2023-02-17 17:43:42.975411: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/cv2/../../lib:
2023-02-17 17:43:42.975443: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/tc/anaconda3/envs/wl/lib/python3.6/site-packages/cv2/../../lib:
2023-02-17 17:43:42.975467: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
2023-02-17 17:43:42.975478: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2023-02-17 17:43:42.975483: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2023-02-17 17:43:42.975488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
WARNING:tensorflow:From test.py:534: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From test.py:546: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From test.py:198: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

WARNING:tensorflow: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
https://github.com/tensorflow/addons
https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From test.py:66: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
WARNING:tensorflow:From test.py:564: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.

2023-02-17 17:43:43.764500: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
[*] Initialize model successfully...
Testing [0] success,Testing time is [8.578033]
Testing [1] success,Testing time is [17.136145]

Xinyu-Xiang commented 1 year ago

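In my own tests, this method should in principle be slower than some other fusion methods, but fusing an image of roughly 1080*700*3 takes about 1 second, so it should not take anywhere near this long. I am using a Titan RTX with 24 GB of GPU memory.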

xinyuxiang @.***

Student, Anhui University

Xinyu-Xiang commented 1 year ago

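How much GPU memory do you have on your side? If GPU memory is insufficient, it may run more slowly. It is also possible that the model was never loaded onto the GPU at all; if it is running directly on the CPU, the speed will indeed suffer.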
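One way to tell the two cases apart is to scan the console output for TensorFlow's library-loading messages. The log posted in this issue contains several `Could not dlopen library` lines followed by `Skipping registering GPU devices`, which is what TensorFlow prints when it falls back to the CPU. The following is a minimal sketch, not part of the repository; the function names and the `sample_log` excerpt are illustrative assumptions.

```python
import re

# Matches TensorFlow's dso_loader message for a shared library that failed to load.
DLOPEN_FAILURE = re.compile(r"Could not dlopen library '([^']+)'")


def missing_gpu_libraries(log_text):
    """Return the library names reported as not loadable, in order, without duplicates."""
    seen = []
    for name in DLOPEN_FAILURE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def gpu_was_skipped(log_text):
    """True if the log contains the 'Skipping registering GPU devices' warning."""
    return "Skipping registering GPU devices" in log_text


# Two lines excerpted (and shortened) from the log posted in this issue.
sample_log = (
    "I tensorflow/stream_executor/platform/default/dso_loader.cc:53] "
    "Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: "
    "cannot open shared object file: No such file or directory\n"
    "W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] "
    "Cannot dlopen some GPU libraries. Skipping registering GPU devices...\n"
)

print(missing_gpu_libraries(sample_log))
print(gpu_was_skipped(sample_log))
```

To use it on a real run, capture the output of test.py (TensorFlow writes these messages to stderr) into a file and pass the file contents to both functions.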

TimeOnCloud commented 1 year ago

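I rechecked the GPU memory usage: only a little over 200 MB was in use out of 10018 MiB, and the CPU was fully loaded during the run. So, as you said, the model was probably not loaded onto the GPU.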
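The same check can be scripted by querying `nvidia-smi` while test.py is running in another terminal. The sketch below assumes `nvidia-smi` is on the PATH; the helper names are hypothetical, and the numbers in the usage line are illustrative values in the range reported above, not a measurement.

```python
import subprocess


def parse_memory_line(line):
    """Parse one 'used, total' CSV line (values in MiB, no units) into two ints."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def query_gpu_memory():
    """Return [(used_mib, total_mib), ...], one pair per GPU, or None if the query fails."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total",
        "--format=csv,noheader,nounits",
    ]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
        return [
            parse_memory_line(line)
            for line in result.stdout.splitlines()
            if line.strip()
        ]
    except (OSError, subprocess.CalledProcessError, ValueError):
        # nvidia-smi is missing, exited with an error, or printed a non-numeric field.
        return None


# Illustrative values only: a few hundred MiB used on a 10018 MiB card.
print(parse_memory_line("200, 10018"))
print(query_gpu_memory())
```

If the used figure stays at a few hundred MiB while inference is running and the CPU is saturated, the model is most likely executing on the CPU.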

Xinyu-Xiang commented 1 year ago

Great, sounds good. Once the model loads onto the GPU successfully, the slow speed should be resolved.
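To confirm that the GPU libraries can actually be loaded before rerunning test.py, one option is to try opening each library named in the posted log from the same conda environment. This is a rough sketch that uses `ctypes` as a stand-in for TensorFlow's own loader, which is an assumption; the library list is copied from the `Could not dlopen library` lines in the log, and `can_dlopen` is a hypothetical helper.

```python
import ctypes

# Library names copied from the "Could not dlopen library" lines in the posted log.
REQUIRED_LIBRARIES = [
    "libcudart.so.10.0",
    "libcublas.so.10.0",
    "libcufft.so.10.0",
    "libcurand.so.10.0",
    "libcusolver.so.10.0",
    "libcusparse.so.10.0",
    "libcudnn.so.7",
]


def can_dlopen(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


for name in REQUIRED_LIBRARIES:
    status = "ok" if can_dlopen(name) else "NOT FOUND"
    print("{}: {}".format(name, status))
```

Any library reported as NOT FOUND needs to be installed or added to the loader's search path (for example through `LD_LIBRARY_PATH`) before TensorFlow can register the GPU.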

19958595306 commented 1 year ago

Hello author, I ran into the same problem when running your PIAFusion: training runs entirely on the CPU and the GPU is never used. I don't know how to fix it.

Xinyu-Xiang / DIVFusion

Is 20+ seconds of inference for one 720P image a normal speed? #1