Failed when extracting point features !

cpymaple commented 4 months ago

Hi, thanks for your great work!

I have successfully compiled and run AirVO in the Docker environment, but I would also like to set up the dependencies natively on my Ubuntu machine. AirVO compiles there without any errors. However, when I run the system, the terminal shows "Failed when extracting point features !".

Here are my dependencies: OpenCV 4.2, Eigen 3.3.9, Ceres 2.0.0, g2o (20230223), TensorRT 8.4.1.5 (tar file), CUDA 11.6, cuDNN 8.5 (tar file), Python 3.8, ONNX, ROS Noetic, Boost, glog, nvidia-driver-535. I also added the following lines to my ~/.bashrc: source ~/catkin_airvo/devel/setup.bash; export LD_LIBRARY_PATH=/home/cpy/TensorRT-8.4.1.5/lib:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH.
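
A quick way to check exports like these is to confirm that each directory on LD_LIBRARY_PATH actually contains the shared libraries the binary needs. The sketch below is only an illustration: the helper name `find_library` is hypothetical, and `libnvinfer.so` (TensorRT) and `libcudart.so` (CUDA runtime) are assumed to be the libraries worth checking here.

```python
import os
from pathlib import Path


def find_library(name, search_path=None):
    """Return the directories on a colon-separated search path that contain `name`.

    `name` is matched as a prefix so that versioned files such as
    libnvinfer.so.8 are also found. Hypothetical helper, not part of AirVO.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in search_path.split(":"):
        if not entry:
            continue  # skip empty entries such as a trailing colon
        directory = Path(entry)
        if directory.is_dir() and any(directory.glob(name + "*")):
            hits.append(str(directory))
    return hits


if __name__ == "__main__":
    # Assumed library names: TensorRT core runtime and CUDA runtime.
    for lib in ("libnvinfer.so", "libcudart.so"):
        found = find_library(lib)
        print(lib, "->", found if found else "not found on LD_LIBRARY_PATH")
```

If a library shows up in more than one directory, the first directory on the path is normally the one the dynamic loader picks, so the order of the exports matters.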

By the way, to let the build find the TensorRT headers (under /home/cpy/TensorRT-8.4.1.5) and compile AirVO successfully, I added include_directories(/home/cpy/TensorRT-8.4.1.5/include) and link_directories(/home/cpy/TensorRT-8.4.1.5/lib) to CMakeLists.txt. In the Docker environment I did not need to modify CMakeLists.txt.

Could the problem be related to my TensorRT installation or NVIDIA driver? Since the system runs and only fails to extract features, maybe SuperPoint is not running correctly? I followed the official tutorial for all dependencies, so they should be correct. I also tried the suggestions from similar issues in this repo, but none of them worked for me.

Looking forward to your kind reply~

cpymaple commented 4 months ago

Update: 1. My computer uses a GTX 965M GPU. 2. After successfully compiling AirVO in the Docker environment, I had only deleted the build/ and devel/ folders. I then noticed that I also need to delete the .engine file. After doing that, the first time I ran AirVO, the system showed the following output:

process[rosout-1]: started with pid [22195]
started core service [/rosout]
process[air_vo-2]: started with pid [22202]
config_file = /home/cpy/catkin_airvo/src/AirVO/configs/configs_euroc.yaml
path = /home/cpy/Datasets/MH_01_easy/cam0/data
[07/03/2024-22:45:50] [I] [TRT] [MemUsageChange] Init CUDA: CPU +197, GPU +0, now: CPU 224, GPU 590 (MiB)
[07/03/2024-22:45:52] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +6, GPU +1, now: CPU 249, GPU 591 (MiB)
[07/03/2024-22:45:52] [I] [TRT] ----------------------------------------------------------------
[07/03/2024-22:45:52] [I] [TRT] Input filename:   /home/cpy/catkin_airvo/src/AirVO/output/superpoint_v1_sim_int32.onnx
[07/03/2024-22:45:52] [I] [TRT] ONNX IR version:  0.0.8
[07/03/2024-22:45:52] [I] [TRT] Opset version:    12
[07/03/2024-22:45:52] [I] [TRT] Producer name:    onnx-typecast
[07/03/2024-22:45:52] [I] [TRT] Producer version: 
[07/03/2024-22:45:52] [I] [TRT] Domain:           
[07/03/2024-22:45:52] [I] [TRT] Model version:    0
[07/03/2024-22:45:52] [I] [TRT] Doc string:       
[07/03/2024-22:45:52] [I] [TRT] ----------------------------------------------------------------
[07/03/2024-22:45:52] [W] [TRT] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[07/03/2024-22:45:52] [W] [TRT] FP16 support requested on hardware without native FP16 support, performance will be negatively affected.
[07/03/2024-22:46:01] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +259, GPU +100, now: CPU 516, GPU 691 (MiB)
[07/03/2024-22:46:02] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +81, GPU +10, now: CPU 597, GPU 701 (MiB)
[07/03/2024-22:46:02] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[07/03/2024-22:46:13] [W] [TRT] Weights [name=Conv_0 + Relu_1.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:13] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:13] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:13] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size will enable more tactics, please check verbose output for requested sizes.
[07/03/2024-22:46:13] [W] [TRT] Weights [name=Conv_0 + Relu_1.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:13] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:13] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:16] [W] [TRT] Weights [name=Conv_2 + Relu_3.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:16] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:16] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:16] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:16] [W] [TRT] Weights [name=Conv_2 + Relu_3.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:16] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:16] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:16] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:17] [W] [TRT] Weights [name=Conv_5 + Relu_6.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:17] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:17] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:17] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:17] [W] [TRT] Weights [name=Conv_5 + Relu_6.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:17] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:17] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:17] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:17] [W] [TRT] Weights [name=Conv_7 + Relu_8.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:17] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:17] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:17] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:18] [W] [TRT] Weights [name=Conv_10 + Relu_11.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:18] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:18] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:18] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:18] [W] [TRT] Weights [name=Conv_10 + Relu_11.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:18] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:18] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:18] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:18] [W] [TRT] Weights [name=Conv_12 + Relu_13.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:18] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:18] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:18] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:19] [W] [TRT] Weights [name=Conv_12 + Relu_13.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:19] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:19] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:19] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:19] [W] [TRT] Weights [name=Conv_15 + Relu_16.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:19] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:19] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:19] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:19] [W] [TRT] Weights [name=Conv_15 + Relu_16.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:19] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:19] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:19] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:19] [W] [TRT] Weights [name=Conv_17 + Relu_18.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:19] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:19] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:19] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:20] [W] [TRT] Weights [name=Conv_19 + Relu_20 || Conv_102 + Relu_103.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:20] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:20] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:20] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:20] [W] [TRT] Weights [name=Conv_19 + Relu_20 || Conv_102 + Relu_103.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:20] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:20] [W] [TRT]  - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value. 
[07/03/2024-22:46:20] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:20] [W] [TRT] Weights [name=Conv_21.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:20] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:20] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:20] [W] [TRT] Weights [name=Conv_21.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:20] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:20] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:20] [W] [TRT] Weights [name=Conv_21.weight] had the following issues when converted to FP16:
[07/03/2024-22:46:20] [W] [TRT]  - Subnormal FP16 values detected. 
[07/03/2024-22:46:20] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[07/03/2024-22:46:20] [W] [TRT] Myelin graph with multiple dynamic values may have poor performance if they differ. Dynamic values are:

Then I rebooted the computer and ran AirVO again. This time the system only showed the following output, without any pose results:

process[rosout-1]: started with pid [6172]
started core service [/rosout]
process[air_vo_ros-2]: started with pid [6179]
process[rviz-3]: started with pid [6180]
config_file = /home/cpy/catkin_airvo/src/AirVO/configs/configs_euroc.yaml
[ INFO] [1720019086.759979875]: rviz version 1.14.25
[ INFO] [1720019086.760046604]: compiled against Qt version 5.12.8
[ INFO] [1720019086.760069928]: compiled against OGRE version 1.9.0 (Ghadamon)
[ INFO] [1720019086.772428315]: Forcing OpenGl version 0.
[07/03/2024-23:04:47] [I] [TRT] [MemUsageChange] Init CUDA: CPU +197, GPU +0, now: CPU 230, GPU 584 (MiB)
[07/03/2024-23:04:47] [I] [TRT] Loaded engine size: 7 MiB
[ INFO] [1720019087.302081927]: Stereo is NOT SUPPORTED
[ INFO] [1720019087.302183216]: OpenGL device: NVIDIA GeForce GTX 965M/PCIe/SSE2
[ INFO] [1720019087.302214682]: OpenGl version: 4.6 (GLSL 4.6).
[07/03/2024-23:04:47] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +339, GPU +187, now: CPU 589, GPU 778 (MiB)
[07/03/2024-23:04:47] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +20, now: CPU 0, GPU 20 (MiB)

On the third run, the system seems to run in real time at 100-140 ms per frame and outputs pose results.

However, I still have some problems:

1. Could you please tell me why AirVO runs successfully on the first try in the Docker environment, but needs three runs before it works in my own system environment? Is it related to loading the ONNX model or generating the .engine file?

2. Meanwhile, the log messages appear to show memory errors and other issues. Is this related to my computer's low-end hardware?

3. By the way, the tutorial says Ceres 2.0.0 is required, but I cannot find where AirVO uses Ceres for optimization. Is Ceres actually unnecessary, with only g2o used for optimization?
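
For question 2, one way to see whether a log like the one above contains real errors is to tally the TensorRT severity tags. This is a minimal sketch that assumes the `[date] [X] [TRT]` prefix format seen in the log, and assumes that errors would be tagged `[E]` by analogy with the `[I]` and `[W]` tags that do appear. The function name `tally_severities` is hypothetical.

```python
import re
from collections import Counter

# Matches the prefix seen in the log, e.g. "[07/03/2024-22:45:52] [W] [TRT] ...".
# ROS lines such as "[ INFO] [1720019086.75]: ..." do not match this pattern.
TRT_LINE = re.compile(r"^\[[^\]]+\] \[([A-Z])\] \[TRT\]")


def tally_severities(lines):
    """Count TensorRT log lines per single-letter severity tag."""
    counts = Counter()
    for line in lines:
        match = TRT_LINE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Example usage with a saved log (the filename is hypothetical):
#     with open("airvo_run.log") as log_file:
#         print(dict(tally_severities(log_file)))
```

A result with only `I` and `W` entries would mean the lines are informational messages and warnings rather than errors.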

xukuanHIT commented 4 months ago

@cpymaple Hi,

1. The ".engine" file is generated for your specific environment, so each time you change environments you need to delete it and let it be regenerated. Generating the file takes some time, so please be patient. Once it has been generated, you can run the system.
2. Those messages are TensorRT output. If the system eventually runs successfully, you can ignore them.
3. We ran into build errors when the versions of the two libraries do not match, similar to [this issue](https://github.com/RainerKuemmerle/g2o/issues/477). I think it should also work if Ceres is not installed on your computer.
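
The cleanup step from point 1 can be sketched as follows. This is an illustration under assumptions, not part of AirVO: the helper name `remove_engine_files` is hypothetical, and the thread does not say where the engine files are written, so the directory in the usage note is only a guess based on where the ONNX model lives.

```python
from pathlib import Path


def remove_engine_files(root, dry_run=True):
    """List every *.engine file under `root` and delete them unless dry_run is set.

    Returns the list of matching paths so the caller can review what was
    (or would be) removed. Hypothetical helper for illustration only.
    """
    engines = sorted(Path(root).rglob("*.engine"))
    for path in engines:
        if not dry_run:
            path.unlink()  # the engine is expected to be rebuilt on the next run
    return engines
```

For example, `remove_engine_files("/home/cpy/catkin_airvo/src/AirVO/output")` would only list candidates (the path is an assumption), and passing `dry_run=False` would delete them. The next launch then has to rebuild the engine, which is the slow step seen in the first log.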

cpymaple commented 4 months ago

> @cpymaple Hi,
>
> 1. The ".engine" file is generated based on your development environment, so each time you change environments, you need to delete and regenerate it. Besides, generating this file takes some time, so please be patient. Once it is generated, you can proceed to run it.
>
> 2. They are outputs of TensorRT. If you can run successfully finally, you can ignore them.
>
> 3. We meet some build errors when the versions of the two libraries do not match, which is similar to [this issue](https://github.com/RainerKuemmerle/g2o/issues/477). I think it can also work if Ceres is not installed on your computer.

Hi, regarding question 3, I ran into the same problem. Thanks for your kind reply.

sair-lab / AirSLAM

Failed when extracting point features ! #112