Open skprot opened 1 year ago
Sorry, libspconv.so currently does not work on GPUs with a compute capability below sm_80.
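For readers who want to confirm whether their GPU meets the sm_80 requirement mentioned above, here is a minimal Python sketch. It assumes `nvidia-smi` is on the PATH and that the installed driver supports the `compute_cap` query field (older drivers may not). The function names are hypothetical and not part of this repository.

```python
import subprocess

# sm_80 is the minimum stated in this thread for the prebuilt libspconv.so.
MIN_CAPABILITY = (8, 0)


def parse_compute_cap(text):
    """Parse a string such as '7.5' into a (major, minor) tuple."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def is_supported(text, minimum=MIN_CAPABILITY):
    """Return True when the reported capability is at least `minimum`."""
    return parse_compute_cap(text) >= minimum


def query_compute_caps():
    """Ask nvidia-smi for the compute capability of every visible GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [line for line in out.splitlines() if line.strip()]


def report():
    """Print one line per GPU saying whether it meets the minimum."""
    for index, cap in enumerate(query_compute_caps()):
        status = "ok" if is_supported(cap) else "below sm_80"
        print(f"GPU {index}: compute capability {cap.strip()} ({status})")
```

Calling `report()` on the target machine prints one line per GPU; a card such as an sm_75 device would be flagged as below the minimum.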
@hopef When I run export-scn on a "non ptq model" and try to load the result using load_engine_from_onnx in https://github.com/NVIDIA-AI-IOT/Lidar_AI_Solution/blob/87fb0cc6fcf38d0cf998bf0cdcbd039e6732d928/CUDA-BEVFusion/src/bevfusion/lidar-scn.cpp#L38C1-L39C1, I get the error:
[libprotobuf FATAL /usr/include/google/protobuf/repeated_field.h:1506] CHECK failed: (index) < (current_size_):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: (index) < (current_size_):
Sharing my ONNX model. What could be the issue?
Versions
libprotoc 3.6.1
Hi sandeepnmenon,
I've committed libspconv-1.1.0, which open-sources the libprotobuf part of the parsing code. For your error, you can use it for debugging.
Hi sandeepnmenon,
I can't see the bias of the SparseConvolution layer in your onnx. This may be the root cause.
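One way to check an exported model for the missing bias described above is to list every SparseConvolution node that has no bias input. The sketch below is a guess at how that could look: it assumes the exporter stores the weight and bias as the second and third node inputs, which this thread does not confirm, and the function name is hypothetical. It works on any objects exposing `op_type`, `name`, and `input`, so stand-in nodes are used here instead of a real model.

```python
from types import SimpleNamespace


def convs_missing_bias(nodes, op_type="SparseConvolution", bias_index=2):
    """Return names of `op_type` nodes that have no input at `bias_index`.

    Assumption: inputs are ordered (features, weight, bias). Adjust
    `bias_index` if the exported graph uses a different layout.
    """
    missing = []
    for node in nodes:
        if node.op_type != op_type:
            continue
        inputs = list(node.input)
        # A bias is absent if the slot does not exist or is an empty name.
        if len(inputs) <= bias_index or not inputs[bias_index]:
            missing.append(node.name)
    return missing


# Stand-in nodes that mimic the fields of an ONNX NodeProto.
nodes = [
    SimpleNamespace(op_type="SparseConvolution", name="conv0",
                    input=["x", "conv0.weight", "conv0.bias"]),
    SimpleNamespace(op_type="SparseConvolution", name="conv1",
                    input=["y", "conv1.weight"]),
    SimpleNamespace(op_type="Relu", name="relu0", input=["z"]),
]
print(convs_missing_bias(nodes))  # ['conv1']
```

With the `onnx` Python package installed, the same function can be applied to `onnx.load(path).graph.node`, since those node objects expose the same three fields.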
Thank you. I was using the lidar SCN module from a checkpoint that had not gone through the quantization code. When I passed my lidar model through the quantization code, the bias term appeared and loading works. I think the quantization code, which replaces the spconv modules with the custom classes, is important for the libspconv module. Is that correct?
Also, regarding this GitHub issue: how do I build the libspconv.so library? This repo only has the headers.
Hi @hopef
When I load the model's state dict, export it, and run the exptool, this error occurs. But if I first run it through the quantisation module (ptq.py), which replaces the spconv modules with the custom modules, it works.
Is the libspconv library tied to the SparseConvolution classes in the quantisation code?
First, libspconv supports SparseConvolution without bias. Second, the bias error is introduced by the ONNX parser. You can handle it in code.
Is the libspconv library tied to the SparseConvolution classes in the quantisation code? -> No, there is no correlation between them.
@hopef Hi, how can I handle the bias error in code?
Thank you in advance
The latest version (v1.1.0) can handle the bias error.
@hopef
Hi, I have replaced libspconv with libspconv-1.1.0 in this line and this line.
After that, when I run bash tool/run.sh, the following errors are raised:
/home/sensor_fusion/Lidar_AI_Solution/CUDA-BEVFusion_mmdet3d/src/bevfusion/lidar-scn.cpp:67:15: error: ‘DTensor’ in namespace ‘spconv’ does not name a type; did you mean ‘ITensor’?
67 | spconv::DTensor *native_scn_output_ = nullptr; // TODO: DTensor
| ^~~~~~~
| ITensor
/home/sensor_fusion/Lidar_AI_Solution/CUDA-BEVFusion_mmdet3d/src/bevfusion/lidar-scn.cpp: In member function ‘bool bevfusion::lidar::SCNImplement::init(const bevfusion::lidar::SCNParameter&)’:
/home/sensor_fusion/Lidar_AI_Solution/CUDA-BEVFusion_mmdet3d/src/bevfusion/lidar-scn.cpp:43:31: error: ‘load_engine_from_onnx’ is not a member of ‘spconv’
43 | native_scn_ = spconv::load_engine_from_onnx(param_.model, static_cast<spconv::Precision>(param_.precision));
| ^~~~~~~~~~~~~~~~~~~~~
/home/sensor_fusion/Lidar_AI_Solution/CUDA-BEVFusion_mmdet3d/src/bevfusion/lidar-scn.cpp: In member function ‘virtual const nvtype::half* bevfusion::lidar::SCNImplement::forward(const nvtype::half*, unsigned int, void*)’:
/home/sensor_fusion/Lidar_AI_Solution/CUDA-BEVFusion_mmdet3d/src/bevfusion/lidar-scn.cpp:51:9: error: ‘native_scn_output_’ was not declared in this scope
51 | native_scn_output_ = native_scn_->forward(
| ^~~~~~~~~~~~~~~~~~
/home/sensor_fusion/Lidar_AI_Solution/CUDA-BEVFusion_mmdet3d/src/bevfusion/lidar-scn.cpp: In member function ‘virtual std::vector<long int> bevfusion::lidar::SCNImplement::shape()’:
/home/sensor_fusion/Lidar_AI_Solution/CUDA-BEVFusion_mmdet3d/src/bevfusion/lidar-scn.cpp:60:16: error: ‘native_scn_output_’ was not declared in this scope
60 | return native_scn_output_ == nullptr ? std::vector<int64_t>() : native_scn_output_->features_shape();
| ^~~~~~~~~~~~~~~~~~
make[2]: *** [CMakeFiles/bevfusion_core.dir/build.make:3418: CMakeFiles/bevfusion_core.dir/src/bevfusion/lidar-scn.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:87: CMakeFiles/bevfusion_core.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
How can I use the latest version of libspconv?
In addition, when I use the original libspconv.so in this repo, an assertion error related to weight_scales occurs (this issue).
Do you know what might cause it?
@sangjinpark97 Due to the interface update in version 1.1, a few code changes were required to adapt to the new version. You can take a look at the test code here.
Greetings,
I cannot find 1.1.1 on the branches page. Is it a special version?
I am trying to make SM_75 work, but the build stopped at /usr/bin/ld: libbevfusion_core.so: undefined reference to spconv::load_engine_from_onnx
[ 40%] Building NVCC (Device) object CMakeFiles/bevfusion_core.dir/src/bevfusion/bevfusion_core_generated_head-transbbox.cu.o
[ 45%] Building NVCC (Device) object CMakeFiles/bevfusion_core.dir/src/bevfusion/bevfusion_core_generated_transfusion.cu.o
[ 54%] Building CXX object CMakeFiles/bevfusion_core.dir/src/common/tensorrt.cpp.o
[ 59%] Building CXX object CMakeFiles/bevfusion_core.dir/src/bevfusion/lidar-scn.cpp.o
[ 59%] Building CXX object CMakeFiles/bevfusion_core.dir/src/bevfusion/bevfusion.cpp.o
[ 63%] Linking CXX shared library libbevfusion_core.so
[ 63%] Built target bevfusion_core
[ 68%] Building NVCC (Device) object CMakeFiles/bevfusion.dir/src/common/bevfusion_generated_visualize.cu.o
[ 72%] Building NVCC (Device) object CMakeFiles/bevfusion.dir/__/libraries/cuOSD/src/bevfusion_generated_cuosd_kernel.cu.o
[ 77%] Building CXX object CMakeFiles/bevfusion.dir/src/main.cpp.o
[ 86%] Building CXX object CMakeFiles/bevfusion.dir/workspace/libraries/cuOSD/src/textbackend/backend.cpp.o
[ 86%] Building CXX object CMakeFiles/bevfusion.dir/workspace/libraries/cuOSD/src/textbackend/pango-cairo.cpp.o
[ 90%] Building CXX object CMakeFiles/bevfusion.dir/workspace/libraries/cuOSD/src/textbackend/stb.cpp.o
[ 95%] Building CXX object CMakeFiles/bevfusion.dir/workspace/libraries/cuOSD/src/cuosd.cpp.o
[100%] Linking CXX executable bevfusion
/usr/bin/ld: libbevfusion_core.so: undefined reference to `spconv::load_engine_from_onnx(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, spconv::Precision)'
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/bevfusion.dir/build.make:186: bevfusion] Error 1
make[1]: *** [CMakeFiles/Makefile2:111: CMakeFiles/bevfusion.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
root@4790ca7df0b6:/workspace/CUDA-BEVFusion#
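An undefined reference like the one in the log above usually means the libspconv.so being linked does not export that symbol. A quick way to check is to inspect the library's dynamic symbol table. The sketch below assumes GNU `nm` is available; the function names are hypothetical. It matches on a substring of the mangled name, so the exact C++ signature does not need to be spelled out.

```python
import subprocess


def defined_symbols(nm_output, needle):
    """Return defined symbol names from `nm -D` output that contain `needle`.

    Defined entries have three columns (address, type, name). Undefined
    entries have only two (type 'U', name) and are skipped.
    """
    matches = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] != "U" and needle in parts[2]:
            matches.append(parts[2])
    return matches


def exports_symbol(library_path, needle):
    """Run nm on a shared library and report whether it defines `needle`."""
    out = subprocess.run(
        ["nm", "-D", "--defined-only", library_path],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return bool(defined_symbols(out, needle))
```

For example, `exports_symbol("libspconv.so", "load_engine_from_onnx")` returning False would suggest that the linked library version does not provide the function that lidar-scn.cpp calls (adjust the path to wherever the library actually lives).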
@hopef Hi, I want to compile libspconv on Windows 10, but I can't find the source code of libspconv. It doesn't support Windows 10, right?
I had the same problem. I cloned the latest repo, but I cannot find libspconv-1.1.1 in the 3DSparseConvolution folder.
Hello guys, I have created an open-source version of 3DSparseConvolution using SPCONV as a base. Theoretically, it supports SM < 80, but I have not tested it. https://github.com/riteshkhrn/Lidar_AI_Solution/tree/main/libraries/New3DSparseConvolution
Check it out! :)
Thanks for your amazing job!
I'm wondering how to obtain libspconv.so for different platforms, especially sm_75 and similar. It seems that the libspconv.so in your repo was pre-built with some differences, such as spconv::load_engine_from_onnx. So I cannot just build libspconv.so from the original spconv repo and replace it here. How can I build your modified libspconv.so?