sair-lab / AirSLAM

🚀 AirVO upgrades to AirSLAM 🚀
GNU General Public License v3.0

Nothing happens after launch #127

Closed silence-moon closed 1 month ago

silence-moon commented 3 months ago

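After the system starts, no image is shown and there is no mapping output. If I change the model path, it reports that the model cannot be found, which suggests the program is running, but I can't tell where it is stuck. [Screenshot: QQ20240816-115209]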

xukuanHIT commented 3 months ago

@silence-moon The output says this configuration file was not found. Please check whether it exists.

silence-moon commented 3 months ago

Sorry, I meant the screenshot below. [Screenshot: QQ20240816-133815]

silence-moon commented 3 months ago

Oh, I see the output now: it says stereo is not supported. What do I need to change?

xukuanHIT commented 3 months ago

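Stereo is supported. On the first run, .engine files are generated from the .onnx files. This step is slow, so you need to wait a few minutes for the model conversion to finish.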

silence-moon commented 3 months ago

Then a lot of output suddenly appeared, and it failed with an error again. [Screenshot]

silence-moon commented 3 months ago

When I launched it again, the output looked like this. [Screenshot]

silence-moon commented 3 months ago

Yes. I configured the launch file following the mapping instructions in the README, and saving_dir exists.

xukuanHIT commented 3 months ago

@silence-moon What is your test environment? Which CUDA and TensorRT versions are you using?

silence-moon commented 3 months ago

My CUDA version is 11.4, and my TensorRT package is TensorRT-8.6.0.12.Linux.x86_64-gnu.cuda-11.8.tar.gz

silence-moon commented 3 months ago

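I decided to convert the engine files directly on the command line first. Some files converted successfully, but others failed. The overall result looks like this: [Screenshot]

The plnet model that failed reports the following error: [Screenshot]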

silence-moon commented 3 months ago

Can it use the models I converted myself? After I converted them and ran roslaunch, the program performed the conversion again internally, although I have not yet found the conversion code in the source. When the conversion runs inside the program, it fails with: [E] [TRT] 3: [runtime.cpp::~Runtime::346] Error Code 3: API Usage Error (Parameter check failed at: runtime/rt/runtime.cpp::~Runtime::346, condition: mEngineCounter.use_count() == 1. Destroying a runtime before destroying deserialized engines created by the runtime leads to undefined behavior.)
Is it because the model was not converted successfully that I later get: [visual_odometry-2] process has died [pid 3339556, exit code -11, and execution cannot continue?

xukuanHIT commented 3 months ago

@silence-moon In theory it can use pre-converted models, but it looks like your plnet model was not converted successfully. Could you try Docker first?
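
Whether AirSLAM picks up an existing engine depends on its own loading logic, which this thread does not show. As a quick pre-flight check, the Python sketch below lists .onnx files that have no .engine file with the same stem in the same directory. The same-stem, same-directory convention is an assumption taken from the trtexec command quoted in this thread (plnet_s0.onnx to plnet_s0.engine); the function name and the other.onnx file are hypothetical.

```python
import tempfile
from pathlib import Path


def onnx_without_engine(model_dir):
    """Return names of .onnx files lacking a same-stem .engine file.

    Assumption: engines are stored beside their ONNX source, same stem.
    """
    model_dir = Path(model_dir)
    return sorted(
        p.name
        for p in model_dir.glob("*.onnx")
        if not p.with_suffix(".engine").exists()
    )


# Demo in a throwaway directory; other.onnx is a hypothetical file name.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("plnet_s0.onnx", "other.onnx", "other.engine"):
        (Path(tmp) / name).touch()
    missing = onnx_without_engine(tmp)

print(missing)
```

An engine file that exists is not necessarily valid: a conversion that was interrupted, or built with a different TensorRT version, can still leave a file behind, so this only catches the case where nothing was written at all.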

silence-moon commented 3 months ago

I downloaded the same TensorRT version as yours, but running it directly still fails. Here is the conversion log:

[08/19/2024-11:52:26] [W] [TRT] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[08/19/2024-11:52:26] [W] [TRT] onnx2trt_utils.cpp:400: One or more weights outside the range of INT32 was clamped
[08/19/2024-11:52:26] [W] [TRT] Tensor DataType is determined at build time for tensors not marked as input or output.
[08/19/2024-11:52:26] [W] [TRT] Tensor DataType is determined at build time for tensors not marked as input or output.
[08/19/2024-11:52:26] [I] Finished parsing network model. Parse time: 0.0948283
[08/19/2024-11:52:26] [W] Dynamic dimensions required for input: input, but no shapes were provided. Automatically overriding shape to: 1x1x1x1
[08/19/2024-11:52:26] [I] [TRT] BuilderFlag::kTF32 is set but hardware does not support TF32. Disabling TF32.
[08/19/2024-11:52:26] [E] Error[4]: [graphShapeAnalyzer.cpp::processCheck::862] Error Code 4: Internal Error (/TopK: length of reduction axis (0) is smaller than K (300))
[08/19/2024-11:52:26] [E] Engine could not be created from network
[08/19/2024-11:52:26] [E] Building engine failed
[08/19/2024-11:52:26] [E] Failed to create engine from model or file.
[08/19/2024-11:52:26] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8601] # /home/silence/software/TensorRT-8.6.1.6/bin/trtexec --onnx=plnet_s0.onnx --saveEngine=plnet_s0.engine

The last line is the conversion command. Could you please take a look? Thanks!
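
In a log like the one above, the first [E] entry is usually the root cause and the later ones are follow-on failures. The Python sketch below splits a flattened trtexec log into entries and returns the first error message. The entry pattern is inferred only from the lines quoted in this thread, and the function names are illustrative, so treat it as a triage aid rather than a general TensorRT log parser.

```python
import re

# Matches entry headers such as "[08/19/2024-11:52:26] [E] ".
ENTRY = re.compile(r"\[(\d{2}/\d{2}/\d{4}-\d{2}:\d{2}:\d{2})\] \[([A-Z])\] ")


def split_entries(log_text):
    """Split a flattened log into (timestamp, severity, message) tuples."""
    matches = list(ENTRY.finditer(log_text))
    entries = []
    for i, m in enumerate(matches):
        # Each message runs until the next entry header (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(log_text)
        entries.append((m.group(1), m.group(2), log_text[m.end():end].strip()))
    return entries


def first_error(log_text):
    """Return the message of the first [E] entry, or None if there is none."""
    for _timestamp, severity, message in split_entries(log_text):
        if severity == "E":
            return message
    return None


# Excerpt of the log quoted in this thread, flattened onto one line.
sample = (
    "[08/19/2024-11:52:26] [I] Finished parsing network model. Parse time: 0.0948283 "
    "[08/19/2024-11:52:26] [W] Dynamic dimensions required for input: input, "
    "but no shapes were provided. Automatically overriding shape to: 1x1x1x1 "
    "[08/19/2024-11:52:26] [E] Error[4]: [graphShapeAnalyzer.cpp::processCheck::862] "
    "Error Code 4: Internal Error (/TopK: length of reduction axis (0) is smaller than K (300)) "
    "[08/19/2024-11:52:26] [E] Engine could not be created from network"
)

print(first_error(sample))
```

For this excerpt the first error is the TopK message, and the warning just before it (the input shape being overridden to 1x1x1x1) is the entry most directly related to it.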

xukuanHIT commented 3 months ago

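We use CUDA 12.1, and I'm not sure whether the CUDA version difference played a part in the failed conversion. Also, is TensorRT 8.6 compatible with your CUDA 11.4 (see this link)? Can you run some of the demos that ship with TensorRT?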

silence-moon commented 3 months ago

I previously converted yolov5 to a TensorRT model and tested it, so running TensorRT itself should not be a problem.

xukuanHIT commented 3 months ago

@yuefanhao Yuefan, could you take a look at this issue?

yuefanhao commented 3 months ago

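@silence-moon @xukuanHIT I read through the issue history. During model conversion I see this error: [08/19/2024-11:52:26] [E] Error[4]: [graphShapeAnalyzer.cpp::processCheck::862] Error Code 4: Internal Error (/TopK: length of reduction axis (0) is smaller than K (300)). Did you specify dynamic shapes when converting the model manually? If the models we provide also fail to run, I would still recommend trying our Docker image, since the failure may be caused by a CUDA version difference.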

silence-moon commented 3 months ago

I did not specify dynamic shapes. The command I used for the conversion was: trtexec --onnx=.onnx --saveEngine=.engine. Which arguments do I need to add?

yuefanhao commented 3 months ago

@silence-moon You can refer to these two pieces of code in plnet.cpp, which specify which inputs are dynamic: https://github.com/sair-lab/AirSLAM/blob/b50b261c9992f68ea9af1f0592fda5d0ac92b458/src/plnet.cpp#L73 https://github.com/sair-lab/AirSLAM/blob/b50b261c9992f68ea9af1f0592fda5d0ac92b458/src/plnet.cpp#L130 For converting a model with dynamic input dimensions using trtexec, the official TensorRT samples show how to specify the dynamic dimensions; see Example 3 at https://github.com/NVIDIA/TensorRT/tree/main/samples/trtexec
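
As a rough illustration of that advice, the Python sketch below assembles a trtexec command line using the --minShapes, --optShapes and --maxShapes options. The input name "input" comes from the warning in the log quoted in this thread. The shape values are placeholders chosen only to show the syntax, not the values AirSLAM uses: the real ranges, and any additional dynamic inputs, should be read from the plnet.cpp lines linked above. The helper names are hypothetical.

```python
def shape_flag(flag, shapes):
    """Format a trtexec shape option, e.g. --minShapes=input:1x1x100x100."""
    specs = ",".join(
        name + ":" + "x".join(str(d) for d in dims)
        for name, dims in shapes.items()
    )
    return f"--{flag}={specs}"


def trtexec_command(onnx, engine, min_shapes, opt_shapes, max_shapes,
                    trtexec="trtexec"):
    """Build the argument list for a dynamic-shape trtexec conversion."""
    return [
        trtexec,
        f"--onnx={onnx}",
        f"--saveEngine={engine}",
        shape_flag("minShapes", min_shapes),
        shape_flag("optShapes", opt_shapes),
        shape_flag("maxShapes", max_shapes),
    ]


# PLACEHOLDER shapes: replace with the dimensions set in plnet.cpp.
cmd = trtexec_command(
    "plnet_s0.onnx",
    "plnet_s0.engine",
    min_shapes={"input": (1, 1, 100, 100)},
    opt_shapes={"input": (1, 1, 500, 500)},
    max_shapes={"input": (1, 1, 1500, 1500)},
)

print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where trtexec is on the PATH. When a model has several dynamic inputs, each dictionary takes one entry per input and the helper joins them with commas, which is the form trtexec expects.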

logic-zhang commented 3 months ago

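Quoting yuefanhao: "@silence-moon @xukuanHIT I read through the issue history. During model conversion I see this error: [08/19/2024-11:52:26] [E] Error[4]: [graphShapeAnalyzer.cpp::processCheck::862] Error Code 4: Internal Error (/TopK: length of reduction axis (0) is smaller than K (300)). Did you specify dynamic shapes when converting the model manually? If the models we provide also fail to run, I would still recommend trying our Docker image, since the failure may be caused by a CUDA version difference."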

Hi, does AirSLAM support CUDA 11.4? My machine reports CUDA Version: 11.4, and running in your Docker environment produced these errors:
/workspace/air_slam_ws/src/AirSLAM/configs/visual_odometry/vo_euroc.yaml
config done
Erron lightglue building
[08/25/2024-17:21:53] [W] [TRT] Unable to determine GPU memory usage
[08/25/2024-17:21:53] [W] [TRT] Unable to determine GPU memory usage
[08/25/2024-17:21:53] [I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 24, GPU 0 (MiB)
[08/25/2024-17:21:53] [W] [TRT] CUDA initialization failure with error: 804. Please check your CUDA installation: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
Error in SuperPoint building

xukuanHIT commented 3 months ago

@logic-zhang The Docker image we provide ships CUDA 12, which places requirements on the host's NVIDIA driver version.
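
The "CUDA Version" shown in the nvidia-smi header is the newest CUDA release the installed driver supports, so a host reporting 11.4 is generally too old for a CUDA 12 container. CUDA error 804 is, as far as I know, the code for a forward-compatibility attempt on unsupported hardware, which would fit that picture. The Python sketch below performs a conservative check on nvidia-smi text. The function names are illustrative, and the required version (12, 1) is taken from the CUDA 12.1 mentioned earlier in this thread.

```python
import re


def driver_cuda_version(nvidia_smi_text):
    """Extract (major, minor) from a 'CUDA Version: X.Y' field, or None."""
    m = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def driver_supports(nvidia_smi_text, required):
    """Conservative check: driver-reported CUDA must be >= required."""
    found = driver_cuda_version(nvidia_smi_text)
    return found is not None and found >= required


# Value reported in this thread; the container needs CUDA 12.1.
host_report = "CUDA Version: 11.4"
print(driver_cuda_version(host_report), driver_supports(host_report, (12, 1)))
```

On a real machine the text would come from running nvidia-smi on the host, for example with subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout. The comparison is deliberately strict: NVIDIA's minor-version compatibility can allow a slightly older driver within the same major release, but it does not bridge a gap from 11.x to 12.x.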