Open rjwb1 opened 1 year ago
You reversed VOXEL_SIZE_X and VOXEL_SIZE_Y in the definition of simplify_preprocess
Defined in simplifier_onnx.py as:
def simplify_preprocess(onnx_model, VOXEL_SIZE_Y, VOXEL_SIZE_X, MAX_POINTS_PER_VOXEL):
Called in exporter.py with:
onnx_final = simplify_preprocess(onnx_simp, VOXEL_SIZE_X, VOXEL_SIZE_Y, MAX_POINTS_PER_VOXEL)
@GuillaumeAnoufa I will update this, I was rushing and didn't realise as my cloud is square
@GuillaumeAnoufa I have fixed this. I reversed it twice, so it should not actually have affected the final model, but it is better to have correctly named variables/args...
Hi, I have used this commit to successfully export my model to ONNX format. However, when I run predictions with TensorRT, I see different results compared to running eval on the trained .pth file. More about my issue can be found in https://github.com/NVIDIA-AI-IOT/CUDA-PointPillars/issues/82. Please let me know if I am missing something.
Hi @Allamrahul have you verified the pointcloud information is being loaded correctly and in the right order?
Could you elaborate further, if possible? What do you mean by the right order? I was able to use my custom data, train the model to detect a single object, and validate the results using the demo.py file: the boxes look right on the eval set and the results look really good. After that, I tried to export, but I realized that everything in the export script was hardcoded for 3 classes. I then referred to your PR and made those changes, and thankfully they unblocked me and allowed me to export the model. I later moved the generated params.h to the include folder and the .onnx file to the model folder, and followed the instructions in https://github.com/NVIDIA-AI-IOT/CUDA-PointPillars, under compile and run.
If the point cloud information was not being loaded correctly, I think my results on eval set would have been terrible. I compared the exporter.py, the file responsible for exporting to onnx and demo.py script, the one which performs the eval and helps me visualize predictions on my eval set: both process the data in the same manner.
I am using the following command for export:
I have also changed line 157 in main.py to let me predict on .npy files instead of .bin file
If you need further information to guide in the right direction, please let me know.
import glob
import onnx
import torch
import argparse
import numpy as np
from pathlib import Path
from onnxsim import simplify
from pcdet.utils import common_utils
from pcdet.models import build_network
from pcdet.datasets import DatasetTemplate
from pcdet.config import cfg, cfg_from_yaml_file
from exporter_paramters import export_paramters as export_paramters
from simplifier_onnx import simplify_preprocess, simplify_postprocess

class DemoDataset(DatasetTemplate):
    def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None, ext='.bin'):
        """
        Args:
            root_path:
            dataset_cfg:
            class_names:
            training:
            logger:
        """
        super().__init__(
            dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
        )
        self.root_path = root_path
        self.ext = ext
        data_file_list = glob.glob(str(root_path / f'*{self.ext}')) if self.root_path.is_dir() else [self.root_path]
        data_file_list.sort()
        self.sample_file_list = data_file_list

    def __len__(self):
        return len(self.sample_file_list)

    def __getitem__(self, index):
        if self.ext == '.bin':
            points = np.fromfile(self.sample_file_list[index], dtype=np.float32).reshape(-1, 4)
        elif self.ext == '.npy':
            points = np.load(self.sample_file_list[index])
        else:
            raise NotImplementedError
        input_dict = {
            'points': points,
            'frame_id': index,
        }
        data_dict = self.prepare_data(data_dict=input_dict)
        return data_dict

def parse_config():
    parser = argparse.ArgumentParser(description='arg parser')
    parser.add_argument('--cfg_file', type=str, default='cfgs/kitti_models/pointpillar.yaml',
                        help='specify the config for demo')
    parser.add_argument('--data_path', type=str, default='demo_data',
                        help='specify the point cloud data file or directory')
    parser.add_argument('--ckpt', type=str, default=None, help='specify the pretrained model')
    parser.add_argument('--ext', type=str, default='.bin', help='specify the extension of your point cloud data file')
    args = parser.parse_args()
    cfg_from_yaml_file(args.cfg_file, cfg)
    return args, cfg

def main():
    args, cfg = parse_config()
    export_paramters(cfg)
    logger = common_utils.create_logger()
    logger.info('------ Convert OpenPCDet model for TensorRT ------')
    demo_dataset = DemoDataset(
        dataset_cfg=cfg.DATA_CONFIG, class_names=cfg.CLASS_NAMES, training=False,
        root_path=Path(args.data_path), ext=args.ext, logger=logger
    )
    model = build_network(model_cfg=cfg.MODEL, num_class=len(cfg.CLASS_NAMES), dataset=demo_dataset)
    model.load_params_from_file(filename=args.ckpt, logger=logger, to_cpu=True)
    model.cuda()
    model.eval()
    np.set_printoptions(threshold=np.inf)
    with torch.no_grad():
        MAX_VOXELS = 10000
        NUMBER_OF_CLASSES = len(cfg.CLASS_NAMES)
        MAX_POINTS_PER_VOXEL = None
        DATA_PROCESSOR = cfg.DATA_CONFIG.DATA_PROCESSOR
        POINT_CLOUD_RANGE = cfg.DATA_CONFIG.POINT_CLOUD_RANGE
        for i in DATA_PROCESSOR:
            if i['NAME'] == "transform_points_to_voxels":
                MAX_POINTS_PER_VOXEL = i['MAX_POINTS_PER_VOXEL']
                VOXEL_SIZES = i['VOXEL_SIZE']
                break
        if MAX_POINTS_PER_VOXEL is None:
            logger.info('Could Not Parse Config... Exiting')
            import sys
            sys.exit()
        VOXEL_SIZE_X = abs(POINT_CLOUD_RANGE[0] - POINT_CLOUD_RANGE[3]) / VOXEL_SIZES[0]
        VOXEL_SIZE_Y = abs(POINT_CLOUD_RANGE[1] - POINT_CLOUD_RANGE[4]) / VOXEL_SIZES[1]
        FEATURE_SIZE_X = VOXEL_SIZE_X / 2  # Is this number of bins?
        FEATURE_SIZE_Y = VOXEL_SIZE_Y / 2
        dummy_voxels = torch.zeros(
            (MAX_VOXELS, MAX_POINTS_PER_VOXEL, 4),
            dtype=torch.float32,
            device='cuda:0')
        dummy_voxel_idxs = torch.zeros(
            (MAX_VOXELS, 4),
            dtype=torch.int32,
            device='cuda:0')
        dummy_voxel_num = torch.zeros(
            (1),
            dtype=torch.int32,
            device='cuda:0')
        dummy_input = dict()
        dummy_input['voxels'] = dummy_voxels
        dummy_input['voxel_num_points'] = dummy_voxel_num
        dummy_input['voxel_coords'] = dummy_voxel_idxs
        dummy_input['batch_size'] = torch.tensor(1)
        torch.onnx.export(model,  # model being run
                          dummy_input,  # model input (or a tuple for multiple inputs)
                          "./pointpillar_raw.onnx",  # where to save the model (can be a file or file-like object)
                          export_params=True,  # store the trained parameter weights inside the model file
                          opset_version=11,  # the ONNX version to export the model to
                          do_constant_folding=True,  # whether to execute constant folding for optimization
                          keep_initializers_as_inputs=True,
                          input_names=['voxels', 'voxel_num', 'voxel_idxs'],  # the model's input names
                          output_names=['cls_preds', 'box_preds', 'dir_cls_preds'],  # the model's output names
                          )
        onnx_raw = onnx.load("./pointpillar_raw.onnx")  # load onnx model
        onnx_trim_post = simplify_postprocess(onnx_raw, FEATURE_SIZE_X, FEATURE_SIZE_Y, NUMBER_OF_CLASSES)
        onnx_simp, check = simplify(onnx_trim_post)
        assert check, "Simplified ONNX model could not be validated"
        onnx_final = simplify_preprocess(onnx_simp, VOXEL_SIZE_X, VOXEL_SIZE_Y, MAX_POINTS_PER_VOXEL)
        onnx.save(onnx_final, "pointpillar.onnx")
        print('finished exporting onnx')
        logger.info('[PASS] ONNX EXPORTED.')

if __name__ == '__main__':
    main()
import onnx
import numpy as np
import onnx_graphsurgeon as gs

@gs.Graph.register()
def replace_with_clip(self, inputs, outputs, voxel_array):
    for inp in inputs:
        inp.outputs.clear()
    for out in outputs:
        out.inputs.clear()
    op_attrs = dict()
    op_attrs["dense_shape"] = voxel_array
    return self.layer(name="PPScatter_0", op="PPScatterPlugin", inputs=inputs, outputs=outputs, attrs=op_attrs)

def loop_node(graph, current_node, loop_time=0):
    for i in range(loop_time):
        next_node = [node for node in graph.nodes if len(node.inputs) != 0 and len(current_node.outputs) != 0 and node.inputs[0] == current_node.outputs[0]][0]
        current_node = next_node
    return next_node

def simplify_postprocess(onnx_model, FEATURE_SIZE_X, FEATURE_SIZE_Y, NUMBER_OF_CLASSES):
    print("Use onnx_graphsurgeon to adjust postprocessing part in the onnx...")
    graph = gs.import_onnx(onnx_model)
    cls_preds = gs.Variable(name="cls_preds", dtype=np.float32, shape=(1, int(FEATURE_SIZE_Y), int(FEATURE_SIZE_X), 2 * NUMBER_OF_CLASSES * NUMBER_OF_CLASSES))
    box_preds = gs.Variable(name="box_preds", dtype=np.float32, shape=(1, int(FEATURE_SIZE_Y), int(FEATURE_SIZE_X), 14 * NUMBER_OF_CLASSES))
    dir_cls_preds = gs.Variable(name="dir_cls_preds", dtype=np.float32, shape=(1, int(FEATURE_SIZE_Y), int(FEATURE_SIZE_X), 4 * NUMBER_OF_CLASSES))
    tmap = graph.tensors()
    new_inputs = [tmap["voxels"], tmap["voxel_idxs"], tmap["voxel_num"]]
    new_outputs = [cls_preds, box_preds, dir_cls_preds]
    for inp in graph.inputs:
        if inp not in new_inputs:
            inp.outputs.clear()
    for out in graph.outputs:
        out.inputs.clear()
    first_ConvTranspose_node = [node for node in graph.nodes if node.op == "ConvTranspose"][0]
    concat_node = loop_node(graph, first_ConvTranspose_node, 3)
    assert concat_node.op == "Concat"
    first_node_after_concat = [node for node in graph.nodes if len(node.inputs) != 0 and len(concat_node.outputs) != 0 and node.inputs[0] == concat_node.outputs[0]]
    for i in range(3):
        transpose_node = loop_node(graph, first_node_after_concat[i], 1)
        assert transpose_node.op == "Transpose"
        transpose_node.outputs = [new_outputs[i]]
    graph.inputs = new_inputs
    graph.outputs = new_outputs
    graph.cleanup().toposort()
    return gs.export_onnx(graph)

def simplify_preprocess(onnx_model, VOXEL_SIZE_X, VOXEL_SIZE_Y, MAX_POINTS_PER_VOXEL):
    print("Use onnx_graphsurgeon to modify onnx...")
    graph = gs.import_onnx(onnx_model)
    tmap = graph.tensors()
    MAX_VOXELS = tmap["voxels"].shape[0]
    VOXEL_ARRAY = np.array([int(VOXEL_SIZE_X), int(VOXEL_SIZE_Y)])
    input_new = gs.Variable(name="voxels", dtype=np.float32, shape=(MAX_VOXELS, MAX_POINTS_PER_VOXEL, 10))
    X = gs.Variable(name="voxel_idxs", dtype=np.int32, shape=(MAX_VOXELS, 4))
    Y = gs.Variable(name="voxel_num", dtype=np.int32, shape=(1,))
    first_node_after_pillarscatter = [node for node in graph.nodes if node.op == "Conv"][0]
    first_node_pillarvfe = [node for node in graph.nodes if node.op == "MatMul"][0]
    next_node = current_node = first_node_pillarvfe
    for i in range(6):
        next_node = [node for node in graph.nodes if node.inputs[0] == current_node.outputs[0]][0]
        if i == 5:  # ReduceMax
            current_node.attrs['keepdims'] = [0]
            break
        current_node = next_node
    last_node_pillarvfe = current_node
    graph.inputs.append(Y)
    inputs = [last_node_pillarvfe.outputs[0], X, Y]
    outputs = [first_node_after_pillarscatter.inputs[0]]
    graph.replace_with_clip(inputs, outputs, VOXEL_ARRAY)
    graph.cleanup().toposort()
    graph.inputs = [first_node_pillarvfe.inputs[0], X, Y]
    graph.outputs = [tmap["cls_preds"], tmap["box_preds"], tmap["dir_cls_preds"]]
    graph.cleanup()
    graph.inputs = [input_new, X, Y]
    first_add = [node for node in graph.nodes if node.op == "MatMul"][0]
    first_add.inputs[0] = input_new
    graph.cleanup().toposort()
    return gs.export_onnx(graph)

if __name__ == '__main__':
    mode_file = "pointpillar-native-sim.onnx"
    simplify_preprocess(onnx.load(mode_file))
Hi @Allamrahul have you verified the pointcloud information is being loaded correctly and in the right order?
By this, do you mean how main.py is loading the .npy file? The script is meant for .bin files but it should work for .npy files as well. Please let me know if I am missing something.
@GuillaumeAnoufa I have fixed this. I reversed it twice, so it should not actually have affected the final model, but it is better to have correctly named variables/args...
Hi, I used this commit, but when I compared the results from the .pth file against TensorRT inference, my predictions matched in box sizes, z dimension and confidence but not in X and Y coordinates. I tweaked the code in the following way. In exporter.py, I kept the following line unchanged: onnx_final = simplify_preprocess(onnx_simp, VOXEL_SIZE_X, VOXEL_SIZE_Y, MAX_POINTS_PER_VOXEL). In simplifier_onnx.py, I swapped the parameter order: def simplify_preprocess(onnx_model, VOXEL_SIZE_Y, VOXEL_SIZE_X, MAX_POINTS_PER_VOXEL) and set VOXEL_ARRAY = np.array([int(VOXEL_SIZE_X), int(VOXEL_SIZE_Y)]).
This at least gives me the same results for eval with the .pth file and for TensorRT inference with the ONNX file. I am not sure why it works. That said, I get slightly fewer predictions when I run inference with TensorRT, and I am not sure why. I would really like some help understanding whether what I am doing is right.
Hi, this is the same as my original commit before the suggestion was made by @GuillaumeAnoufa to change it. I guess I was right all along as I inspected the model in netron. I'll revert the commit suggested by @GuillaumeAnoufa
Hi, I tried your 1st commit but that's not working. The following is the analysis:
In your 1st iteration:
  Call: X, Y
  Fn def: Y, X
  VOXEL_ARRAY: Y, X
  Conclusion: call's X maps to VOXEL_ARRAY[0], call's Y maps to VOXEL_ARRAY[1]
2nd iteration (according to the commit suggested by @GuillaumeAnoufa):
  Call: X, Y
  Fn def: X, Y
  VOXEL_ARRAY: X, Y
  Conclusion: call's X maps to VOXEL_ARRAY[0], call's Y maps to VOXEL_ARRAY[1]
What works for me:
  Call: X, Y
  Fn def: Y, X
  VOXEL_ARRAY: X, Y
  Conclusion: call's X maps to VOXEL_ARRAY[1], call's Y maps to VOXEL_ARRAY[0]
I have just retried iterations 1 and 2 and they don't solve the issue because both inherently do the same thing. The mapping only gets reversed if I do it the way I suggested. Could you confirm this?
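The three cases above can be checked mechanically. The following is a minimal sketch, not code from the PR: `voxel_array_order` is a hypothetical helper that only models positional argument binding, with the letters "X" and "Y" standing in for the values passed at the call site.

```python
def voxel_array_order(def_params, array_expr, call_args=("X", "Y")):
    """Return which call-site value lands in VOXEL_ARRAY[0] and VOXEL_ARRAY[1].

    def_params: parameter names in the function definition, e.g. ("Y", "X")
    array_expr: parameter names used to build VOXEL_ARRAY, e.g. ("X", "Y")
    call_args:  values passed positionally at the call site
    """
    # Positional binding: the i-th call argument binds to the i-th parameter name.
    bound = dict(zip(def_params, call_args))
    return [bound[name] for name in array_expr]


# Iteration 1: def (Y, X), array [Y, X]
iteration_1 = voxel_array_order(def_params=("Y", "X"), array_expr=("Y", "X"))
# Iteration 2: def (X, Y), array [X, Y]
iteration_2 = voxel_array_order(def_params=("X", "Y"), array_expr=("X", "Y"))
# Reported working variant: def (Y, X), array [X, Y]
working = voxel_array_order(def_params=("Y", "X"), array_expr=("X", "Y"))

print(iteration_1, iteration_2, working)
```

Iterations 1 and 2 produce the same ordering, which is consistent with the observation that only the third variant changes the exported dense shape.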
That seems right. In my implementation my voxel shape is (600, 600), so I would not notice this issue. I will fix this as soon as I can.
One more question: the boxes I get during TensorRT inference are just a subset of the boxes I get during the evaluation phase using the .pth file. For example, for a given .npy file, if I get 4 bounding boxes during eval, I get 1, 2 or 3 during TensorRT inference, and the number changes every time I run it. Is there any way to get all the detections during TensorRT inference?
@Allamrahul are you using the same score and NMS threshold? I guess I would start by adjusting these in Params.h. I haven't directly compared my PyTorch results to the ones from tensorrt but they seem the same for me.
I just removed that entirely as I require very fast performance. I also implemented a better way of loading params from a yaml file exporter.py generates if you'd be interested.
For guidance in my Params.h I find that a score threshold of 0.3-0.4 and an NMS thresh of 0.01 works well.
Will check that. Additionally, when I enable FP16, I get hundreds of bounding boxes (anywhere from 5 to 350) during TensorRT inference. When I disable FP16, recompile and run, the number of detections is back to normal.
Let me know the right way of doing it and whether I am missing something here.
This worked for me. Obviously FP16 can incur an accuracy penalty.
Could you specify what worked for you? It's not clear from your comment. Thanks. Also, my score threshold is currently 0.25 and my NMS threshold 0.01 in params.h. I am just using the params.h that exporter.py generates during ONNX model generation.
I mean FP16 worked normally for me when commenting out the lines you suggested above.
Perhaps try a score_thresh of 0.4.
By normally, do you mean you are also getting hundreds of detections? Sorry, I don't have much experience in deployment and this is the first time I am dealing with FP16.
No worries. I meant that I did not observe hundreds of detections with FP16, but my confidence threshold is set to 0.3. Perhaps look at the detections you are getting, and if you are receiving lots of low scores, increase the threshold.
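One way to follow this advice is to count how many detections would survive at each candidate score threshold. This is a small sketch under stated assumptions: `count_above` is a hypothetical helper, and the scores are made-up illustration values, not output from any model in this thread.

```python
import numpy as np


def count_above(scores, thresholds):
    """Count detections whose score is at or above each threshold."""
    scores = np.asarray(scores, dtype=np.float64)
    return {float(t): int((scores >= t).sum()) for t in thresholds}


# Made-up scores for illustration only.
example_scores = [0.95, 0.91, 0.42, 0.31, 0.27, 0.26, 0.12]
counts = count_above(example_scores, thresholds=(0.25, 0.3, 0.4))
print(counts)
```

If most of the extra FP16 boxes sit just above the current threshold, raising it may remove them; if they carry very high scores, the threshold is unlikely to be the cause.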
Got it, let me check that.
Also, one more thing: I am using .npy files since I am using a custom dataset. I observed that there is a 32-byte offset when I load the same .npy file via Python and NumPy versus when I load it through C++. Could this be a factor?
I am using ROS, so I do not have to load any files and can't fully recommend a solution. However, I do write the binary files I use for training with OpenPCDet. The only thing I could recommend is trying to make the dtype of the NumPy array you are using np.float16, although I seem to get good results in my implementation without explicitly using float16 when I convert from the ROS msg.
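For context on the offset question: the `.npy` format stores a header (magic string, version, dtype and shape description) before the array data, so a loader written for raw `.bin` files would read those header bytes as point values. Whether this explains the offset reported above has not been verified in this thread. The sketch below uses synthetic points and temporary file names (both assumptions) to show the header size and a conversion to a raw float32 file.

```python
import os
import tempfile

import numpy as np

# Five synthetic points with 4 features each (x, y, z, intensity).
points = np.arange(20, dtype=np.float32).reshape(-1, 4)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "cloud.npy")
    bin_path = os.path.join(tmp, "cloud.bin")

    np.save(npy_path, points)
    # Everything in the file beyond the raw array bytes is the .npy header.
    header_bytes = os.path.getsize(npy_path) - points.nbytes

    # np.load parses the header; tofile writes only the raw buffer.
    np.load(npy_path).astype(np.float32).tofile(bin_path)
    bin_size = os.path.getsize(bin_path)
    roundtrip = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)

print("header bytes:", header_bytes, "raw file bytes:", bin_size)
```

Converting the dataset to raw float32 files this way keeps the C++ loader unchanged, at the cost of storing a second copy of each cloud.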
@rjwb1, could you point me to the exact TensorRT inference files you are using at the moment? As mentioned before, my FP16 numbers are out of whack: 300 detections in some cases and 5 in others. I am expecting between 3 and 5 for every point cloud. I would like to cross-reference the exact commit or group of commits you are using for inference, just to make sure I am not missing anything important. After analyzing the results, I found that the model is too confident on some examples, giving confidence values of 90 to 100% for a lot of the detections. On other examples, it gives the right output.
@rjwb1, in regard to https://github.com/NVIDIA-AI-IOT/CUDA-PointPillars/issues/85, I see that MAX_VOXELS is hardcoded to 10000 in the export script exporter.py. But when I examine pointpillar.yaml, I see this: MAX_NUMBER_OF_VOXELS: { 'train': 16000, 'test': 40000 }. So shouldn't MAX_VOXELS in the export script be 40000?
I tried this out: when I set MAX_VOXELS to 40000 and exported the ONNX file, the multiple false positives I got in FP16 TensorRT inference went away. Can anyone confirm that what I did makes sense?
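A possible way to avoid the hardcoded value is to read it from the same DATA_PROCESSOR entry the export script already scans. This is a sketch only: `voxelization_params` is a hypothetical helper, and the example list mirrors the keys quoted in this thread rather than a real loaded config. Whether the exported MAX_VOXELS must also match a value in params.h is not confirmed here.

```python
def voxelization_params(data_processor, mode="test"):
    """Collect voxelization settings from a DATA_PROCESSOR-style list of dicts."""
    for step in data_processor:
        if step["NAME"] == "transform_points_to_voxels":
            return {
                "max_points_per_voxel": step["MAX_POINTS_PER_VOXEL"],
                "voxel_size": step["VOXEL_SIZE"],
                "max_voxels": step["MAX_NUMBER_OF_VOXELS"][mode],
            }
    raise ValueError("transform_points_to_voxels not found in DATA_PROCESSOR")


# Example values for illustration; a real run would pass cfg.DATA_CONFIG.DATA_PROCESSOR.
example_data_processor = [
    {"NAME": "mask_points_and_boxes_outside_range", "REMOVE_OUTSIDE_BOXES": True},
    {
        "NAME": "transform_points_to_voxels",
        "VOXEL_SIZE": [0.16, 0.16, 4],
        "MAX_POINTS_PER_VOXEL": 32,
        "MAX_NUMBER_OF_VOXELS": {"train": 16000, "test": 40000},
    },
]

params = voxelization_params(example_data_processor)
print(params["max_voxels"])
```

Raising an error when the entry is missing replaces the silent `sys.exit()` path, which makes a misconfigured YAML easier to spot.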
Correctly applies params from the model cfg to the onnx exporter
Hi, I can export my custom ONNX model, but the results seem incorrect. Can you give me some help?
Hello, thank you for your work on the custom model conversion. I found that the dense shape of PPScatter_0 in the model converted with your code is reversed compared to the official ONNX model.
However, everything is currently working fine after swapping the positions of VOXEL_SIZE_Y and VOXEL_SIZE_X in simplifier_onnx.py, line 83:
Before modification:
VOXEL_ARRAY = np.array([int(VOXEL_SIZE_X),int(VOXEL_SIZE_Y)])
After modification:
VOXEL_ARRAY = np.array([int(VOXEL_SIZE_Y),int(VOXEL_SIZE_X)])
Hope this helps to solve the problem!
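To make the ordering concrete: the values the exporter calls VOXEL_SIZE_X and VOXEL_SIZE_Y are grid dimensions (number of voxels along each axis), and the modification above stores them as [Y, X]. The sketch below is an illustration, not the exporter's code: `grid_size_xy` is a hypothetical helper, the ranges are example values, and `round()` is my own choice to guard against floating-point truncation where the script uses `int()` directly.

```python
import numpy as np


def grid_size_xy(point_cloud_range, voxel_size):
    """Number of voxels along X and Y for a [x_min, y_min, z_min, x_max, y_max, z_max] range."""
    x_min, y_min, _, x_max, y_max, _ = point_cloud_range
    grid_x = int(round((x_max - x_min) / voxel_size[0]))
    grid_y = int(round((y_max - y_min) / voxel_size[1]))
    return grid_x, grid_y


# Non-square example range: X spans 69.12 m, Y spans 79.36 m, 0.16 m voxels.
grid_x, grid_y = grid_size_xy([0, -39.68, -3, 69.12, 39.68, 1], [0.16, 0.16, 4])
dense_shape = np.array([grid_y, grid_x])  # [Y, X] order, as in the modification above

# Square example range: both orders give the same array, which hides a swap.
square = grid_size_xy([-80, -80, -10, 80, 80, 10], [0.4, 0.4, 20])

print(dense_shape, square)
```

With a square range the two orderings are indistinguishable, which matches the earlier remarks that the swap went unnoticed on square point clouds.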
@Allamrahul hello, have you solved the TRT mismatch problem? I have run into it as well. Could you tell me how to solve it? Thanks for your guidance.
Hello everyone, and thanks @rjwb1 for the amazing updates.
I successfully trained on my custom data with 3 classes like KITTI (vehicle, pedestrian, cyclist) in OpenPCDet, and the results looked fine on the Python side. Then I converted the model with exporter.py and also ran it successfully in my C++ code.
Then I trained on the same custom data with 12 classes (I separated the vehicle class into bus, van, truck, and so on), and the results also looked fine on the Python side. But after exporting the model with exporter.py, the results on the C++ side were completely random, with lots of large false detections.
Has anyone encountered a problem like this before, or trained with a different number of classes? Could there be a class-dependent parameter in exporter.py?
I would be glad if anyone could help. Thank you.
I found out that the "MAX_POINTS_PER_VOXEL" parameter in the pointpillar.yaml file is the problem. When I change the parameter from the default 32 to something different, it causes the problem I described above. I am looking for solution.
Hi, great job. So you mean the reason is that MAX_POINTS_PER_VOXEL changed? If you set it to 32, will the C++ side results be the same as your Python side results? This has been bothering me a lot.
Hi @zzt007,
The point cloud in my dataset is very dense at close range, so I set the MAX_POINTS_PER_VOXEL parameter to 128. But after I trained on my data with that parameter and exported with these functions, the bounding box results were completely random on the C++ side. Then I started training with the default MAX_POINTS_PER_VOXEL: 32, and when I tested after a couple of epochs the model had started to detect the objects with correct bounding box sizes.
I'm still in the early stages of training and optimizing parameters, but as soon as I get proper results I will compare them.
Hi guys, I had to make some small changes as I work in a different private repository so I haven't fully tested everything. For my application I use a single class however I have tried with multiple. And I also use a custom voxel size and number (XYZ) and this works for me. I'm not at my computer right now but when I return I'd be happy to help 👍🏼
Just to confirm you're correctly copying the Params.h header over? In my version I generate a config file that does not need to be rebuilt but I haven't done this here
@Acuno41 I have discovered that the MAX_POINTS_PER_VOXEL is also hardcoded in the kernel.h. Did you change it here?
That sounds good. Sincerely looking forward to your results and reply. Thanks.
Hi @rjwb1, thanks for the response,
Just to confirm you're correctly copying the Params.h header over? In my version I generate a config file that does not need to be rebuilt but I haven't done this here
Yes, I correctly copied the params.h to the C++ side and checked that the values were loaded correctly in the C++ code.
@Acuno41 I have discovered that the MAX_POINTS_PER_VOXEL is also hard-coded in the kernel.h. Did you change it here?
I also updated kernel.h a little to remove the hardcoded, params.h-dependent parameters. kernel.h looks like this in my code:
const int THREADS_FOR_VOXEL = 256; // number of threads per block
const int POINTS_PER_VOXEL = Params::max_num_points_per_pillar; // depends on "params.h"
const int WARP_SIZE = 32; // one warp (32 threads) for one pillar
const int WARPS_PER_BLOCK = 4; // four warps per block
const int FEATURES_SIZE = 10; // number of feature maps, depends on "params.h"
const int PILLARS_PER_BLOCK = 64; // one thread deals with one pillar and a block has PILLARS_PER_BLOCK threads
const int PILLAR_FEATURE_SIZE = Params::num_feature_scatter; // feature count for one pillar, depends on "params.h"
And I changed max_num_points_per_pillar and num_feature_scatter to static const in params.h.
Considering that the MAX_POINTS_PER_VOXEL parameter is used in the preprocess part, I suspect something there might be causing the problem while preparing the data to feed to model.
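Since the value has to agree between the training config and the C++ headers, a quick consistency check before building may help narrow this down. The sketch below is an assumption-heavy illustration: `read_int_constant` is a hypothetical helper, and `sample_header` is a made-up fragment shaped like the constants named above, not the actual contents of params.h.

```python
import re


def read_int_constant(header_text, name):
    """Return the integer assigned to `name` in C++ header text, or None if absent."""
    match = re.search(rf"\b{re.escape(name)}\s*=\s*(\d+)\s*;", header_text)
    return int(match.group(1)) if match else None


# Made-up header fragment for illustration; a real check would read include/params.h.
sample_header = """
class Params {
  public:
    static const int max_num_points_per_pillar = 32;
    static const int num_feature_scatter = 64;
};
"""

trained_max_points = 128  # MAX_POINTS_PER_VOXEL used for training in this example
header_max_points = read_int_constant(sample_header, "max_num_points_per_pillar")
mismatch = header_max_points != trained_max_points
print(header_max_points, mismatch)
```

A mismatch here would only show that the header is stale; it would not rule out a separate hardcoded assumption elsewhere in the preprocessing kernels.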
Hi @rjwb1
I also followed your forked repository and this PR #77, but it does not show the same result as my PyTorch (*.pth) inference.
Here's my overall procedure:
1. Train my custom model with a custom dataset
   INPUT_RANGE: [-80, -80, -10, 80, 80, 10] (Square!)
   VOXEL_SIZE: [0.4, 0.4, 20]
2. Convert my custom model *.pth into *.onnx with exporter.py
3. Change include/param.h
4. Apply the param.h file newly created in step 2
5. Modify the hard-coded value in kernel.h
   POINTS_PER_VOXEL
6. Build and infer
   visualize with open3d
Here's the result of PyTorch + ROS inference
Also, here's the result of CUDA-PointPillars
Can you give me some advice?
Also, have you wrapped this package into ROS?
I also had the same problem with the CUDA version, and I need to work with ROS too. If you have any ways or ideas to solve it, please contact me. Thanks a lot.