Open MichaelToLearn opened 1 year ago
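In the document https://github.com/lenLRX/Atlas_ACL_E2E_Demo/blob/master/yolov5_model_cvt.md, the section on "some version after YOLOv5 v5.0" says the focus layer must be removed. Does v6.0 need it removed as well? The document has two top-level headings, so I can't tell whether the focus layer also has to be removed for v6.0.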
Not needed for v6.0.
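There are a few issues here.
First: I cut corners and never properly supported H265. Besides the bitstream filter you changed, the DVPP-related API calls probably need changes too, and there are likely other pitfalls.
Second: resolution. Decoding at h: 1440 w: 2560 should be fine, but the hardware encoder probably does not support that resolution; see the official documentation for details. To use software encoding, change "hw_encoder": true to false in the config file.
I suggest first running a simple test with a short H264 1080P video, then testing an H264 1080P camera, and only then modifying the code to fit the resolution and codec you need.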
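As a rough illustration of the software-encoding suggestion, here is a minimal Python sketch that flips "hw_encoder" to false for every stream in a demo config. The "streams" and "hw_encoder" keys come from the config posted in this thread; the helper name `use_software_encoder` is hypothetical and not part of the repository.

```python
import json


def use_software_encoder(config: dict) -> dict:
    """Return a copy of the config with "hw_encoder" set to False for every stream."""
    # Deep copy through a JSON round-trip so the caller's dict is left untouched.
    updated = json.loads(json.dumps(config))
    for stream in updated.get("streams", []):
        stream["hw_encoder"] = False
    return updated


# Trimmed-down example config; only the fields relevant here are shown.
example = {"streams": [{"name": "yolov5_demo_stream1", "hw_encoder": True}]}
print(json.dumps(use_software_encoder(example), indent=2))
```

In practice one would load the JSON file from the config directory, pass it through the helper, and write the result to a new file rather than editing the original in place.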
Thanks for the pointers. I'll work on it and post updates here if I make progress ~
I tried it. With the unmodified original project and an ordinary 1080p H264 mp4 file, it also appears to fail with a Segmentation fault.
1. Video file: video.mp4
{
  "streams": [
    {
      "name": "yolov5_demo_stream1",
      "stream_type": "yolov5_demo",
      "src": "video.mp4",
      "dst": "video_yolov5_v6.mp4",
      "model_height" : 640,
      "model_width" : 640,
      "model_box_num": 25220,
      "model_class_num": 80,
      "hw_encoder": true,
      "enable_neon": false,
      "yolov5_version": "v6",
      "yolov5_model_path": "./model/yolov5s_v6.om"
    }
  ],
  "config": {
    "app_perf": true,
    "perflog_path": "."
  }
}
➜ Atlas_ACL_E2E_Demo ./run.sh config/yolov5_v6_mp4_demo.json
total dev count: 1
[FFMPEGInput::Init] video.mp4 codec name:h264
avcc profile: 100
frame h: 1080 frame w: 1920
ticks_per_frame: 2
framerate.num: 10
framerate.den: 1
ref frame num: 1
has B frame: 0
pix format: 0
codec_tag 828601953
extra_data size: 38
[DvppDecoder::Init] h: 1080 w:1920output_size: 3110400
./run.sh: line 15: 22933 Segmentation fault ./build/acl_demo_app -c $1
HwHiAiUser@davinci-mini:~/original/Atlas_ACL_E2E_Demo$ ./run.sh config/yolov5_v6_demo_issue52.json
total dev count: 1
[FFMPEGInput::Init] test_failed_video.mp4 codec name:h264
avcc profile: 100
frame h: 1080 frame w: 1920
ticks_per_frame: 2
framerate.num: 10
framerate.den: 1
ref frame num: 1
has B frame: 0
pix format: 0
codec_tag 828601953
extra_data size: 38
[DvppDecoder::Init] h: 1080 w:1920output_size: 3110400
YOLOv5 Model Info: ACLModel:./model/yolov5s_v6.om Input Num:1 Input shapes:614400, Output Num:4 Output shapes:8568000, 6528000, 1632000, 408000,
FFMPEGOutput::Init frame send interval: 0.1
[libx264 @ 0xfffee55f66a0] using cpu capabilities: ARMv8 NEON
[libx264 @ 0xfffee55f66a0] profile High, level 4.0
[libx264 @ 0xfffee55f66a0] 264 - core 152 r2854 e9a5903 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=1 deblock=1:0:0 analyse=0x3:0x3 me=dia subme=1 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=4 lookahead_threads=4 sliced_threads=1 slices=4 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=1 keyint=12 keyint_min=1 scenecut=40 intra_refresh=0 rc=crf mbtree=0 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'test_failed_video_out.mp4':
Stream #0:0: Video: h264 (libx264), nv12, 1920x1080, q=-1--1
[FFMPEGOutput::Init] expected frame size 3110400
[DvppEncoder::Init] height:1080 width: 1920 size: 3110400
/usr/local/lib/python3.6/dist-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
[FFMPEGInput::ReceivePacketWithBSF] err string: End of file
End of stream input: stream[0] test_failed_video.mp4
AppProfiler shutdown
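The sizes in this log can be sanity-checked with a little arithmetic. The sketch below assumes NV12 frames (1.5 bytes per pixel), float32 model outputs, three anchors per grid cell and strides of 8, 16 and 32; these are the usual YOLOv5 conventions and are assumptions here, not something the log states. The function names are made up for illustration.

```python
def nv12_frame_bytes(width: int, height: int) -> int:
    """Bytes in one NV12 frame: a full-size Y plane plus a half-size interleaved UV plane."""
    return width * height * 3 // 2


def yolov5_output_bytes(input_size=640, num_classes=80, strides=(8, 16, 32),
                        anchors_per_cell=3, bytes_per_value=4):
    """Return (total box count, [concatenated output bytes, per-head output bytes...])."""
    values_per_box = num_classes + 5  # x, y, w, h, objectness + class scores
    boxes_per_head = [anchors_per_cell * (input_size // s) ** 2 for s in strides]
    head_bytes = [n * values_per_box * bytes_per_value for n in boxes_per_head]
    return sum(boxes_per_head), [sum(head_bytes)] + head_bytes


print(nv12_frame_bytes(1920, 1080))  # decoder/encoder frame size for 1080p
print(nv12_frame_bytes(640, 640))    # model input size, if the input is NV12
print(yolov5_output_bytes())
```

Under these assumptions the results line up with the 3110400, 614400 and 8568000 / 6528000 / 1632000 / 408000 figures printed above, and the box count comes out to 25200. The config posted in this thread lists "model_box_num": 25220; this sketch does not establish which value the application expects.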
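This video runs fine on my side. Please double-check that the om file is at the configured path and that the account running the app has permission to read it. If it still fails, install gdb and see exactly where it crashes.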
Crash location: debugging shows that it crashes in DvppDecoder::Init, at the aclvdecCreateChannel(channel_desc); call.
Whether the om file exists: the om file does exist. I added a function that checks for its existence, and the check passed.
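For reference, an existence-and-permission check of this kind could look like the following Python sketch. The poster's actual check lives in the C++ application and is not shown in the thread, so `check_model_file` is a hypothetical stand-in; the model path is the one from the config.

```python
import os


def check_model_file(path: str) -> list:
    """Return a list of human-readable problems with the model file (empty if none)."""
    problems = []
    if not os.path.isfile(path):
        problems.append(f"{path}: not a regular file (missing, or the path is wrong)")
    elif not os.access(path, os.R_OK):
        # os.access checks against the real uid of the current process.
        problems.append(f"{path}: exists but is not readable by uid {os.getuid()}")
    return problems


print(check_model_file("./model/yolov5s_v6.om") or "model file looks readable")
```

Running it under both root and HwHiAiUser would cover the permission question raised above.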
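CMake build issue? The original CMakeLists.txt does not build on my board. The main problems were:
a. acllib path: changed to the actual path on the board.
b. Python version: the latest card-making documentation now recommends python3.7.5, so everything that referenced python3.6 was replaced with my self-built 3.7.5.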
The changes to CMakeLists.txt are:
include_directories
include_directories(
  /usr/local/Ascend/include # for peripheral_api.h
  ${ACL_PATH}/include # acl headers
  /usr/local/python3.7.5/include/python3.7m/ # for python headers → changed
  /usr/local/python3.7.5/lib/python3.7/site-packages/numpy/core/include/ # for pip installed numpy headers → changed
  ${CMAKE_SOURCE_DIR}/src
  ${CMAKE_SOURCE_DIR}/bytetrack_csrc/include
  ${FREETYPE_INCLUDE_DIRS})
link_directories
link_directories(
  /usr/lib64/
  ${ACL_PATH}/lib64/
  /usr/local/python3.7.5/lib/ # → added
)
deep_sort: I explicitly enabled it here so that the Python-related paths get added:
# Print whether deepsort is enabled
set(BUILD_DEEP_SORT ON)
message(STATUS "BUILD_DEEP_SORT: ${BUILD_DEEP_SORT}")
if (BUILD_DEEP_SORT)
include_directories(/usr/local/python3.7.5/include/)
endif (BUILD_DEEP_SORT)
By the way, all of this was done as root.
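I forgot to ask two important things: what are your hardware and software versions?
Another thing worth analyzing is ACL's own logs. On my environment, for example, they are under paths like /var/log/npu/slog/device-app-5396; check whether there are any errors there.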
I tried it: running as HwHiAiUser also gives a Segmentation fault.
The versions are:
$ npu-smi info
+--------------------------------------------------------------------------------------------------------+
| npu-smi 22.0.4 Version: 22.0.4 |
+-------------------------------+-----------------+------------------------------------------------------+
| NPU Name | Health | Power(W) Temp(C) Hugepages-Usage(page) |
| Chip Device | Bus-Id | AICore(%) Memory-Usage(MB) |
+===============================+=================+======================================================+
| 0 310 | OK | 12.8 37 9 / 970 |
| 0 0 | NA | 0 3654 / 7760 |
+===============================+=================+======================================================
➜ latest cat /home/HwHiAiUser/Ascend/ascend-toolkit/6.0.1/toolkit/version.info
Version=1.84.15.1.310
version_dir=6.0.1
There don't seem to be any app-level logs...
➜ slog pwd
/var/log/npu/slog
➜ slog ls
alternatives.log bootstrap.log debug device-os faillog lastlog operation security sysstat wtmp
apt btmp device-0 dpkg.log journal lost+found run slogd tallylog
Look in the device-0 directory, or the device-os directory.
Mine is an Atlas200DK, which has the same chip as the Atlas200, but the environment itself is slightly different.
I found the log under the debug directory. It was generated by clearing the logs and running again: /var/log/npu/slog/debug/device-app-26824/device-app-26824_20230312144157650.log
[EVENT] PROFILING(26824,acl_demo_app):2023-03-12-14:41:56.793.290 [msprof_callback_impl.cpp:199] >>> (tid:26824) Started to register profiling ctrl callback.
[EVENT] PROFILING(26824,acl_demo_app):2023-03-12-14:41:57.337.635 [msprof_callback_impl.cpp:78] >>> (tid:26824) MsprofCtrlCallback called, type: 255
[EVENT] PROFILING(26824,acl_demo_app):2023-03-12-14:41:57.337.780 [prof_acl_mgr.cpp:1190] >>> (tid:26824) Init profiling for dynamic profiling
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.641.472 [DvppManager.cpp:323][API] [InitStatThread:323] [T208] Successed to create statistic thread(e7ffb25593b0).
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.643.111 [JpegdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create jpegd async work thread(0), thread id = 255085394494384.
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.643.349 [JpegdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create jpegd async work thread(1), thread id = 255085394359216.
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.643.568 [JpegeAsyncManager.cpp:90][API] [Init:90] [T208] Successed to create jpege async thread(0), thread id = 255085360645040.
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.643.784 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(0), thread id = 255085360509872.
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.643.963 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(1), thread id = 255085360374704.
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.644.153 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(2), thread id = 255085360239536.
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.644.344 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(3), thread id = 255085360104368.
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.644.527 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(4), thread id = 255085359969200.
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.644.708 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(5), thread id = 255085359834032.
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.645.102 [VpcAsyncManager.cpp:180][API] [Init:180] [T208] Successed to create vpc async work thread(255085357552560).
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.645.426 [VpcAsyncManager.cpp:180][API] [Init:180] [T208] Successed to create vpc async work thread(255085357417392).
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.645.751 [VpcAsyncManager.cpp:180][API] [Init:180] [T208] Successed to create vpc async work thread(255085357282224).
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.646.067 [VpcAsyncManager.cpp:180][API] [Init:180] [T208] Successed to create vpc async work thread(255085357147056).
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.646.582 [CmdListManager.cpp:142][CMDLIST] [Init:142] [T208] Successed to create cmdlist sync work thread(255085357011888).
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.646.705 [CmdListManager.cpp:142][CMDLIST] [Init:142] [T208] Successed to create cmdlist sync work thread(255085356876720).
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.646.827 [CmdListManager.cpp:142][CMDLIST] [Init:142] [T208] Successed to create cmdlist sync work thread(255085356741552).
[EVENT] DVPP(26824,acl_demo_app):2023-03-12-14:41:57.646.943 [CmdListManager.cpp:142][CMDLIST] [Init:142] [T208] Successed to create cmdlist sync work thread(255085356606384).
Mine should also be a 200 DK; the box says Atlas 200 Developer Kit.
Alternatively, following https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/63RC1alpha001/infacldevg/graphdevg/atlasag_25_0058.html, run export ASCEND_SLOG_PRINT_TO_STDOUT=1 to print the logs directly to the screen and see what they show.
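A small Python wrapper can set this variable for a single run and also report whether the process was killed by a signal. This is only a sketch: the `-c <config>` invocation is taken from the commands shown in this thread, and the helper names are made up for illustration.

```python
import os
import signal
import subprocess


def slog_stdout_env(base=None) -> dict:
    """Copy the environment and ask the Ascend logger to print to stdout."""
    env = dict(os.environ if base is None else base)
    env["ASCEND_SLOG_PRINT_TO_STDOUT"] = "1"
    return env


def describe_exit(returncode: int) -> str:
    """On POSIX, subprocess reports death by signal N as return code -N."""
    if returncode < 0:
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def run_demo(binary: str, config_path: str) -> str:
    """Run the demo with stdout logging enabled and describe how it ended."""
    completed = subprocess.run([binary, "-c", config_path], env=slog_stdout_env())
    return describe_exit(completed.returncode)


# Example (not executed here):
# print(run_demo("./build/acl_demo_app", "config/yolov5_v6_mp4_demo.json"))
```

A segmentation fault would then show up as "killed by SIGSEGV" after the log lines.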
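If yours is a 200DK, you shouldn't be installing the software stack for the 200; I believe they are incompatible. I think you should download the CANN software and drivers directly from the official website and install those.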
➜ cmake-build-debug-remote-host export ASCEND_SLOG_PRINT_TO_STDOUT=1
➜ cmake-build-debug-remote-host ./acl_demo_app -c ../config/yolov5_v6_mp4_demo.json
[EVENT] PROFILING(26944,acl_demo_app):2023-03-12-14:46:15.102.107 [msprof_callback_impl.cpp:199] >>> (tid:26944) Started to register profiling ctrl callback.
[EVENT] ASCENDCL(26944,acl_demo_app):2023-03-12-14:46:15.193.780 [acl_resource_manager.cpp:78]26944 GetRuntimeV2Env: runtime v2 flag : model flag = 1, singleOp flag = 1
[EVENT] PROFILING(26944,acl_demo_app):2023-03-12-14:46:15.630.966 [msprof_callback_impl.cpp:78] >>> (tid:26944) MsprofCtrlCallback called, type: 255
[EVENT] PROFILING(26944,acl_demo_app):2023-03-12-14:46:15.631.104 [prof_acl_mgr.cpp:1190] >>> (tid:26944) Init profiling for dynamic profiling
total dev count: 1
[EVENT] TDT(26944,acl_demo_app):2023-03-12-14:46:15.635.751 [log.cpp:43][ThreadModeManager] enter into open process deviceId[0] rankSize[0],[thread_mode_manager.cpp:34:Open]26950
[EVENT] TDT(26944,acl_demo_app):2023-03-12-14:46:15.636.725 [log.cpp:43]begin load aicpu package path[/home/HwHiAiUser/Ascend/ascend-toolkit/latest/opp/Ascend310RC/aicpu/] file[Ascend310rc-aicpu_syskernels.tar.gz],[thread_mode_manager.cpp:195:HandleAICPUPackage]26950
[EVENT] TDT(26944,acl_demo_app):2023-03-12-14:46:15.636.834 [log.cpp:43]Package checkcode is [45692098],[package_worker.cpp:289:LoadAICPUPackageForThreadMode]26950
[EVENT] TDT(26944,acl_demo_app):2023-03-12-14:46:15.659.450 [log.cpp:43][ThreadModeManager] profiling callback is nullptr, skip set aicpu profiling callback,[thread_mode_manager.cpp:103:SetAICPUProfilingCallback]26950
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.659.734 [aicpusd_lastword.cpp:33][RegLastwordCallback][tid:26950] Reg lastword mark[aicpu sd event mng] key[0].
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.659.834 [aicpusd_interface_process.cpp:448][GetCurrentRunMode][tid:26950] Current aicpu mode is offline (call by api).
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.659.963 [aicpusd_resource_manager.cpp:237][InitBufManager][tid:26950] Aicpu schedule SetBuffCfg successed!
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.660.391 [aicpusd_worker.cpp:58][ThreadPool][tid:26950] ThreadPool
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.720.613 [aicpusd_worker.cpp:347][SetAffinity][tid:26952] aicpu bind tid by self, cpuSetFlag:[], index[1], deviceId[0], res[0].
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.726.154 [aicpusd_worker.cpp:347][SetAffinity][tid:26951] aicpu bind tid by self, cpuSetFlag:[], index[0], deviceId[0], res[0].
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.735.920 [aicpusd_worker.cpp:347][SetAffinity][tid:26955] aicpu bind tid by self, cpuSetFlag:[], index[3], deviceId[0], res[0].
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.746.860 [aicpusd_worker.cpp:347][SetAffinity][tid:26954] aicpu bind tid by self, cpuSetFlag:[], index[2], deviceId[0], res[0].
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.747.247 [aicpusd_cust_so_manager.cpp:72][InitAicpuCustSoManager][tid:26950] cust so dir name is /root/cust_aicpu_0_0_26944/.
[EVENT] TDT(26944,acl_demo_app):2023-03-12-14:46:15.753.649 [log.cpp:43][TsdClient] set profiling callback success.,[client_manager.cpp:158:SetProfilingCallback]26950
[FFMPEGInput::Init] ../video.mp4 codec name:h264
avcc profile: 100
frame h: 1080 frame w: 1920
ticks_per_frame: 2
framerate.num: 10
framerate.den: 1
ref frame num: 1
has B frame: 0
pix format: 0
codec_tag 828601953
extra_data size: 38
[DvppDecoder::Init] h: 1080 w:1920output_size: 3110400
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.931.746 [ae_so_manager.cc:571][CreateSingleSoMgr][tid:26951][AICPU_PROCESSER] Single so manager init failed, soFile is /root/aicpu_kernels/0/aicpu_kernels_device/libdvpp_kernels.so.
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.953.781 [ae_so_manager.cc:193][GetApi][tid:26951][AICPU_PROCESSER] Get api libdvpp_kernels.so from so DvppGetVersion success.
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.955.568 [ae_so_manager.cc:193][GetApi][tid:26955][AICPU_PROCESSER] Get api libdvpp_kernels.so from so DvppCreateVdecChannelV2 success.
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.955.820 [DvppManager.cpp:323][API] [InitStatThread:323] [T208] Successed to create statistic thread(e7ffd00673b0).
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.957.420 [JpegdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create jpegd async work thread(0), thread id = 255085892625328.
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.957.658 [JpegdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create jpegd async work thread(1), thread id = 255085892490160.
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.957.873 [JpegeAsyncManager.cpp:90][API] [Init:90] [T208] Successed to create jpege async thread(0), thread id = 255085607093168.
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.958.088 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(0), thread id = 255085606958000.
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.958.273 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(1), thread id = 255085606822832.
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.958.462 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(2), thread id = 255085606687664.
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.958.649 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(3), thread id = 255085606552496.
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.958.828 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(4), thread id = 255085606417328.
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.959.003 [PngdAsyncManager.cpp:86][API] [Init:86] [T208] Successed to create pngd async thread(5), thread id = 255085606282160.
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.959.398 [VpcAsyncManager.cpp:180][API] [Init:180] [T208] Successed to create vpc async work thread(255085606146992).
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.959.706 [VpcAsyncManager.cpp:180][API] [Init:180] [T208] Successed to create vpc async work thread(255085606011824).
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.960.003 [VpcAsyncManager.cpp:180][API] [Init:180] [T208] Successed to create vpc async work thread(255085605876656).
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.960.306 [VpcAsyncManager.cpp:180][API] [Init:180] [T208] Successed to create vpc async work thread(255085605741488).
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.960.819 [CmdListManager.cpp:142][CMDLIST] [Init:142] [T208] Successed to create cmdlist sync work thread(255085605606320).
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.960.939 [CmdListManager.cpp:142][CMDLIST] [Init:142] [T208] Successed to create cmdlist sync work thread(255085605471152).
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.961.049 [CmdListManager.cpp:142][CMDLIST] [Init:142] [T208] Successed to create cmdlist sync work thread(255085605335984).
[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.961.156 [CmdListManager.cpp:142][CMDLIST] [Init:142] [T208] Successed to create cmdlist sync work thread(255085605200816).
[1] 26944 segmentation fault ./acl_demo_app -c ../config/yolov5_v6_mp4_demo.json
➜ cmake-build-debug-remote-host
Oh, what I installed should be the 200 DK stack; that table was just what I used at the time to compare versions.
Which software version did you install, then? The version I tested corresponds to roughly CANN 5.x; I haven't tested the latest releases.
ascend-toolkit is 6.0.1:
➜ latest cat /home/HwHiAiUser/Ascend/ascend-toolkit/latest/toolkit/version.info
Version=1.84.15.1.310
version_dir=6.0.1
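When collecting version details for a report like this, the key=value layout of version.info is easy to read programmatically. A minimal Python sketch, assuming the file contains only simple `key=value` lines as in the output above (the function name is hypothetical):

```python
def parse_version_info(text: str) -> dict:
    """Parse simple key=value lines; lines without '=' are ignored."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            info[key] = value
    return info


# Sample content copied from the output above.
sample = "Version=1.84.15.1.310\nversion_dir=6.0.1\n"
print(parse_version_info(sample))
```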
I think I spotted an error message:
[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.931.746 [ae_so_manager.cc:571][CreateSingleSoMgr][tid:26951][AICPU_PROCESSER] Single so manager init failed, soFile is /root/aicpu_kernels/0/aicpu_kernels_device/libdvpp_kernels.so.
This file really does not seem to exist:
➜ latest file /root/aicpu_kernels/0/aicpu_kernels_device/libdvpp_kernels.so
/root/aicpu_kernels/0/aicpu_kernels_device/libdvpp_kernels.so: cannot open `/root/aicpu_kernels/0/aicpu_kernels_device/libdvpp_kernels.so' (No such file or directory)
➜ latest cd /root/aicpu_kernels/0/aicpu_kernels_device/
➜ aicpu_kernels_device ls
libaicpu_kernels.so libcpu_kernels.so libpt_kernels.so libtf_kernels.so sand_box version.info
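The same comparison can be scripted. The Python sketch below lists which expected kernel libraries are absent from a directory; the directory and the libdvpp_kernels.so name come from the log message above, while the helper itself is hypothetical.

```python
import os


def missing_kernel_libs(directory: str, required=("libdvpp_kernels.so",)) -> list:
    """Return the required library names that are not present in the directory."""
    present = set(os.listdir(directory)) if os.path.isdir(directory) else set()
    return [name for name in required if name not in present]


# Path taken from the "Single so manager init failed" message.
print(missing_kernel_libs("/root/aicpu_kernels/0/aicpu_kernels_device"))
```

Whether the absence of this file is the actual cause of the crash is not settled in the thread; the check only confirms what the log reports.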
https://github.com/lenLRX/Atlas_ACL_E2E_Demo/blob/master/run.sh#L8 Take a look at this environment variable and work out which path it should point to in your environment.
HwHiAiUser@davinci-mini:~$ cat /usr/local/Ascend/nnrt/set_env.sh
export LD_LIBRARY_PATH=/var/davinci/driver/lib64:/var/davinci/driver/lib64/common:/var/davinci/driver/lib64/driver:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/Ascend/nnrt/latest/lib64:$LD_LIBRARY_PATH
export PYTHONPATH=/usr/local/Ascend/nnrt/latest/python/site-packages:$PYTHONPATH
export ASCEND_AICPU_PATH=/usr/local/Ascend/nnrt/latest
export ASCEND_OPP_PATH=/usr/local/Ascend/nnrt/latest/opp
I set up a combined development and runtime environment following https://www.hiascend.com/document/detail/zh/Atlas200DKDeveloperKit/1013/environment/atlased_04_0017.html, and it seems nnrt was not installed separately:
➜ cmake-build-debug-remote-host cat /usr/local/Ascend/nnrt/set_env.sh
cat: /usr/local/Ascend/nnrt/set_env.sh: No such file or director
However, the toolkit directory does have a set_env.sh:
➜ cmake-build-debug-remote-host cat /home/HwHiAiUser/Ascend/ascend-toolkit/set_env.sh
export LD_LIBRARY_PATH=/var/davinci/driver/lib64:/var/davinci/driver/lib64/common:/var/davinci/driver/lib64/driver:$LD_LIBRARY_PATH
export ASCEND_TOOLKIT_HOME=/home/HwHiAiUser/Ascend/ascend-toolkit/latest
export LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/lib64:${ASCEND_TOOLKIT_HOME}/lib64/plugin/opskernel:${ASCEND_TOOLKIT_HOME}/lib64/plugin/nnengine:$LD_LIBRARY_PATH
export PYTHONPATH=${ASCEND_TOOLKIT_HOME}/python/site-packages:${ASCEND_TOOLKIT_HOME}/opp/built-in/op_impl/ai_core/tbe:$PYTHONPATH
export PATH=${ASCEND_TOOLKIT_HOME}/bin:${ASCEND_TOOLKIT_HOME}/compiler/ccec_compiler/bin:$PATH
export ASCEND_AICPU_PATH=${ASCEND_TOOLKIT_HOME}
export ASCEND_OPP_PATH=${ASCEND_TOOLKIT_HOME}/opp
export TOOLCHAIN_HOME=${ASCEND_TOOLKIT_HOME}/toolkit
export ASCEND_HOME_PATH=${ASCEND_TOOLKIT_HOME}
After grep:
➜ cmake-build-debug-remote-host env | grep ASCEND_AICPU_PATH
ASCEND_AICPU_PATH=/home/HwHiAiUser/Ascend/ascend-toolkit/latest
I'll change run.sh and try... (although I think I had already sourced this set_env.sh on the command line earlier)
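To rule out a mismatch between the interactive shell and what run.sh actually exports, the relevant variables can be checked from inside the process environment. The following Python sketch looks only at ASCEND_AICPU_PATH and ASCEND_OPP_PATH, the two variables that appear in both set_env.sh files above; the function name and the report format are made up for illustration.

```python
import os

# Variables that appear in both set_env.sh files quoted in this thread.
EXPECTED_VARS = ("ASCEND_AICPU_PATH", "ASCEND_OPP_PATH")


def check_ascend_env(env=None) -> dict:
    """Report, per variable, whether it is unset, points nowhere, or looks fine."""
    env = os.environ if env is None else env
    report = {}
    for name in EXPECTED_VARS:
        value = env.get(name)
        if not value:
            report[name] = "unset"
        elif not os.path.isdir(value):
            report[name] = f"set to {value}, but that directory does not exist"
        else:
            report[name] = f"ok ({value})"
    return report


for name, status in check_ascend_env().items():
    print(f"{name}: {status}")
```

A variable that is set and points to an existing directory can still point to the wrong installation, so this only catches the simplest mistakes.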
Still a fault.
The detailed log obtained with export ASCEND_GLOBAL_LOG_LEVEL=1 is:
verbose.log
Check where gdb says it crashes now; I suspect the location has changed.
It still seems to be the same place...
In the detailed log, I found several places that failed.
It looks like something op (operator) related is missing:
[WARNING] GE(28726,acl_demo_app):2023-03-12-15:07:32.009.441 [op_tiling_manager.cc:57]28726 LoadSo:Failed to dlopen /home/HwHiAiUser/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_master/libopmaster.so! errmsg:/home/HwHiAiUser/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_master/libopmaster.so: cannot open shared object file: No such file or directory
Also op related:
[WARNING] GE(28726,acl_demo_app):2023-03-12-15:07:32.215.782 [op_tiling_manager.cc:57]28726 LoadSo:Failed to dlopen /home/HwHiAiUser/Ascend/ascend-toolkit/latest/opp/op_impl/custom/ai_core/tbe/op_master/libopmaster.so! errmsg:/home/HwHiAiUser/Ascend/ascend-toolkit/latest/opp/op_impl/custom/ai_core/tbe/op_master/libopmaster.so: cannot open shared object file: No such file or directory
[WARNING] GE(28726,acl_demo_app):2023-03-12-15:07:32.215.919 [op_tiling_manager.cc:63]28726 LoadSo:Failed to dlopen /home/HwHiAiUser/Ascend/ascend-toolkit/latest/opp/op_impl/custom/ai_core/tbe/op_tiling/liboptiling.so! errmsg:/home/HwHiAiUser/Ascend/ascend-toolkit/latest/opp/op_impl/custom/ai_core/tbe/op_tiling/liboptiling.so: cannot open shared object file: No such file or directory
Plugin not found?
[WARNING] GE(28726,acl_demo_app):2023-03-12-15:07:32.238.520 [plugin_manager.cc:372]28726 Load:Failed to get realpath of /home/HwHiAiUser/Ascend/ascend-toolkit/6.0.1/aarch64-linux/lib64/plugin/engines/runtime, errmsg:No such file or directory
Environment variable?
[WARNING] RUNTIME(28726,acl_demo_app):2023-03-12-15:07:32.431.216 [config.cc:638] 28726 ReadHeterogenousModeFromConfigIni: read ASCEND_LATEST_INSTALL_PATH failed! isHeterogenou=0.
Profiling mode:
[WARNING] TDT(28726,acl_demo_app):2023-03-12-15:07:32.436.833 [log.cpp:35]Get env[AICPU_PROFILING_MODE] failed,[internal_api.cpp:107:GetScheduleEnv]28731
CPU:
[WARNING] CCECPU(28726,acl_demo_app):2023-03-12-15:07:32.742.157 [ae_so_manager.cc:109][CheckSoFile][tid:28734][AICPU_PROCESSER] Format to realpath failed:/root/aicpu_kernels/0/aicpu_kernels_device/libdvpp_kernels.so, path:/root/aicpu_kernels/0/aicpu_kernels_device/libdvpp_kernels.so
[EVENT] CCECPU(28726,acl_demo_app):2023-03-12-15:07:32.742.188 [ae_so_manager.cc:571][CreateSingleSoMgr][tid:28734][AICPU_PROCESSER] Single so manager init failed, soFile is /root/aicpu_kernels/0/aicpu_kernels_device/libdvpp_kernels.so.
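Picking such lines out of a long verbose log by hand is tedious, so a small filter helps. The Python sketch below assumes every log line follows the `[LEVEL] MODULE(pid,process):timestamp ...` shape seen in the excerpts in this thread; the [ERROR] level is an assumption, since only [EVENT] and [WARNING] lines appear here.

```python
import re

# Matches e.g. "[WARNING] GE(28726,acl_demo_app):2023-03-12-15:07:32.009.441 [file.cc:57]..."
LOG_LINE = re.compile(
    r"^\[(?P<level>[A-Z]+)\]\s+(?P<module>\w+)\((?P<pid>\d+),(?P<proc>[^)]*)\):"
    r"(?P<time>\S+)\s+(?P<rest>.*)$"
)


def suspicious_entries(lines):
    """Return (level, module, message) for warnings, errors and anything mentioning 'fail'."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # plain application output, not an Ascend log line
        if match["level"] in ("WARNING", "ERROR") or "fail" in match["rest"].lower():
            hits.append((match["level"], match["module"], match["rest"]))
    return hits


sample = [
    "[EVENT] DVPP(26944,acl_demo_app):2023-03-12-14:46:15.955.820 [DvppManager.cpp:323][API] "
    "[InitStatThread:323] [T208] Successed to create statistic thread(e7ffd00673b0).",
    "[EVENT] CCECPU(26944,acl_demo_app):2023-03-12-14:46:15.931.746 [ae_so_manager.cc:571]"
    "[CreateSingleSoMgr][tid:26951][AICPU_PROCESSER] Single so manager init failed, soFile is "
    "/root/aicpu_kernels/0/aicpu_kernels_device/libdvpp_kernels.so.",
    "total dev count: 1",
]
for level, module, message in suspicious_entries(sample):
    print(level, module, message)
```

Matching on the word "fail" also catches lines such as the "Single so manager init failed" message, which is logged at EVENT level rather than as a warning.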
I think the problem may still be that ASCEND_AICPU_PATH is not set correctly at runtime.
It should be correct, I think...
➜ Atlas_ACL_E2E_Demo ./run.sh config/yolov5_v6_mp4_demo.json
**ASCEND_AICPU_PATH=/home/HwHiAiUser/Ascend/ascend-toolkit/latest**
total dev count: 1
[FFMPEGInput::Init] /home/HwHiAiUser/chenyue/Atlas_ACL_E2E_Demo/video.mp4 codec name:h264
avcc profile: 100
frame h: 1080 frame w: 1920
ticks_per_frame: 2
framerate.num: 10
framerate.den: 1
ref frame num: 1
has B frame: 0
pix format: 0
codec_tag 828601953
extra_data size: 38
[DvppDecoder::Init] h: 1080 w:1920output_size: 3110400
./run.sh: line 18: 30812 Segmentation fault ./build/acl_demo_app -c $
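Have you gotten any other official samples to run on this environment?
I suggest switching to another card and, once it is made, installing only nnrt and trying again.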
I'll redo the whole thing... starting from making the card...
Thanks a lot for helping troubleshoot. The project is very useful and I'm learning a lot from it ~~
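Update: yesterday I tried the official yolov3 sample, and it also ended in a segment fault. I suspect the cause may be the following:
libmedia.so version mismatch: the latest 1.0.13 documentation no longer has an "Install the Media module" chapter, and the link in the 1.0.10 documentation downloads A200dk-npu-driver-21.0.4-ubuntu18.04-aarch64-minirc.tar.gz, which corresponds to 21.0.4, whereas my npu-smi info output shows 22.0.4.
The official site does not seem to offer a media module for 22.0.4. Which versions are you using?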
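The version comparison described here can be written down as a quick check. In the Python sketch below, the filename pattern is inferred from the two package names that appear in this thread, and comparing only the major version is an illustrative choice: whether a 21.x media library is actually incompatible with a 22.x driver is the poster's suspicion, not something the thread confirms.

```python
import re


def driver_version_from_package(filename: str):
    """Extract e.g. '21.0.4' from 'A200dk-npu-driver-21.0.4-ubuntu18.04-aarch64-minirc.tar.gz'."""
    match = re.search(r"npu-driver-(\d+(?:\.\d+)+)-", filename)
    return match.group(1) if match else None


def same_major(version_a: str, version_b: str) -> bool:
    """Compare only the first dotted component of two version strings."""
    return version_a.split(".")[0] == version_b.split(".")[0]


package = "A200dk-npu-driver-21.0.4-ubuntu18.04-aarch64-minirc.tar.gz"
installed = "22.0.4"  # from the npu-smi info output earlier in the thread
package_version = driver_version_from_package(package)
print(package_version, "matches installed major version:", same_major(package_version, installed))
```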
The last version I tried was 5.0.4.alpha003, with the matching driver package A200dk-npu-driver-21.0.3.1-ubuntu18.04-aarch64-minirc.tar.gz.
"Installing the media module" is not a required step (for the code in this repository).
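I see that CMakeLists.txt requires a libmedia_mini.so file: CMakeLists.txt#L133
When I build the official YoloV3 sample without linking this so file, it runs; when I link it, it also gives a Segment fault.
So my conclusion is:
- My npu-smi info shows version 22.0.4, so I probably cannot install the libmedia_mini.so from the 21.0.3 minirc.tar.gz.
But I still have some questions:
- What do the Media module and libmedia_mini.so actually do? Do they only cover the onboard Raspberry Pi-style camera, or do they also cover hardware encoding and decoding?
- Fetching an rtsp stream with ffmpeg shouldn't need libmedia_mini.so, should it?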
libmedia should have been placed on the system by the card-making script when the card was made.
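Thanks for the explanation. I'll try modifying CMakeLists.txt to drop libmedia_mini.so. I'll have to re-make the card first, though, because the previous SD card frequently failed to boot... so I have to re-make it often.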
I suggest buying a better card.
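Are the OP and the repo owner still around? I've hit the same situation: my CANN version is also 6.0.1 with the matching nnrt, and after a successful build it also hits a segmentation fault at runtime. A few months have passed; has the OP found anything new?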
As I explained, this repository has not been adapted for H265; please use H264.
--- Original message --- Sent: Thursday, August 24, 2023, 2:02 PM. Subject: Re: [lenLRX/Atlas_ACL_E2E_Demo] rtsp h265 camera, Segmentation fault at runtime (Issue #52)
That was a quick reply. Do you mean the test video can only be .h264? The video I'm testing with is an .mp4.
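Hello repo owner, to add some detail: I'm not using an RTSP stream but an .mp4 video, and I've updated the config.json accordingly. I've tried every CANN version from 6.0.1 to 6.3RC1.alpha002, CMakeLists has been modified and builds without problems, and the failure is always at the same place. Please help!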
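What I mean is that only the H264 video codec is supported, not H265. mp4 is a container format, and that is supported; I suggest you first check the codec of your video.
If your problem is not caused by H265, please open a new issue with your hardware and software versions. If you can upload the video, that would be best too.
I haven't tested this repository in a long time; if there is a problem, I'll find time to maintain it.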
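One way to check the codec inside an mp4 container is ffprobe, which ships with FFmpeg. The Python sketch below wraps it; it assumes ffprobe is installed and on PATH, and the set of supported codecs reflects only what the maintainer states in this thread. The function names are hypothetical.

```python
import subprocess

# Per the maintainer's comments in this thread, only H264 is supported.
SUPPORTED_CODECS = {"h264"}


def ffprobe_command(path: str) -> list:
    """Build an ffprobe call that prints only the codec name of the first video stream."""
    return ["ffprobe", "-v", "error", "-select_streams", "v:0",
            "-show_entries", "stream=codec_name",
            "-of", "default=noprint_wrappers=1:nokey=1", path]


def video_codec(path: str) -> str:
    """Run ffprobe and return the codec name, e.g. 'h264' or 'hevc'."""
    result = subprocess.run(ffprobe_command(path), capture_output=True, text=True, check=True)
    return result.stdout.strip()


def is_supported(codec: str) -> bool:
    return codec.lower() in SUPPORTED_CODECS


# Example (not executed here, since it needs ffprobe and a real file):
# codec = video_codec("video.mp4"); print(codec, is_supported(codec))
```

ffprobe reports H265 streams under the codec name "hevc", so an .mp4 that prints "hevc" falls into the unsupported case described above.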
OK, I'll check. Thanks for your reply.
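Run:
Error:
Here, config/yolov5_v6_rtsp_demo.json is a copy of config/yolov5_v6_demo.json with only the src field changed; the full content is as follows:
Because this rtsp stream is hevc / h265, I changed h264_mp4toannexb to hevc_mp4toannexb in src/ffmpeg_input.cpp:
The model file ./model/yolov5s_v6.om was also just converted on the same machine by following the tutorial, and everything was done with the root account, so permissions should be fine.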
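The filter swap described here amounts to choosing the Annex B bitstream filter by codec. A minimal Python sketch of that lookup follows; h264_mp4toannexb and hevc_mp4toannexb are real FFmpeg bitstream filter names, but the lookup itself is hypothetical, since the repository's src/ffmpeg_input.cpp is C++ and, per this thread, names the H264 filter directly.

```python
# FFmpeg bitstream filters that convert MP4-style (length-prefixed) streams to Annex B.
BITSTREAM_FILTERS = {
    "h264": "h264_mp4toannexb",
    "hevc": "hevc_mp4toannexb",
}


def bsf_for_codec(codec_name: str) -> str:
    """Return the Annex B bitstream filter name for a codec, or raise ValueError."""
    try:
        return BITSTREAM_FILTERS[codec_name]
    except KeyError:
        raise ValueError(f"no Annex B bitstream filter known for codec {codec_name!r}") from None


print(bsf_for_codec("hevc"))
```

As the maintainer points out elsewhere in the thread, switching the filter alone is not expected to be enough for H265: the DVPP decoder setup would likely need changes as well.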