NVlabs / FB-BEV

Official PyTorch implementation of FB-BEV & FB-OCC - Forward-backward view transformation for vision-centric autonomous driving perception

ONNX export Error: RuntimeError: _Map_base::at #37

Open johnyang-nv opened 6 months ago

johnyang-nv commented 6 months ago

I tried exporting FB-OCC to ONNX, but tracing fails at the custom op QuickCumsumCuda when calling torch.onnx.export, even though feed-forward inference of the model runs without any issue:

File "/FB-BEV/mmdet3d/ops/bev_pool_v2/bev_pool.py", line 102, in forward_dummy
    x = QuickCumsumCuda.apply(depth, feat, ranks_depth, ranks_feat, ranks_bev, bev_feat_shape, interval_starts, interval_lengths)
RuntimeError: _Map_base::at

This is how the error can be reproduced:

  1. I isolated the custom op QuickCumsumCuda in a separate module class, as shown below, for ease of reproducibility:
    import torch
    # Import path assumed from the traceback above
    from mmdet3d.ops.bev_pool_v2.bev_pool import QuickCumsumCuda

    class Bev_Pool_v2(torch.nn.Module):
        def __init__(self):
            super(Bev_Pool_v2, self).__init__()

        def forward(self, depth, feat, ranks_depth, ranks_feat, ranks_bev, bev_feat_shape, interval_starts, interval_lengths):
            x = QuickCumsumCuda.apply(depth, feat, ranks_depth, ranks_feat, ranks_bev, bev_feat_shape, interval_starts, interval_lengths)
            x = x.permute(0, 4, 1, 2, 3).contiguous()
            return x

        def forward_dummy(self, data):
            # Same as forward(), but takes all inputs packed in a single list
            depth, feat, ranks_depth, ranks_feat, ranks_bev, bev_feat_shape, interval_starts, interval_lengths = data
            x = QuickCumsumCuda.apply(depth, feat, ranks_depth, ranks_feat, ranks_bev, bev_feat_shape, interval_starts, interval_lengths)
            x = x.permute(0, 4, 1, 2, 3).contiguous()
            return x
  2. I generate random inputs and run a forward pass on them, which completes without any issue.

    # Randomly generated inputs
    depth = torch.rand(1, 6, 80, 16, 44).cuda()
    feat = torch.rand(1, 6, 80, 16, 44).cuda()
    ranks_depth = torch.randint(0, 337522, (206988, )).to(dtype=torch.int32).cuda()
    ranks_feat = torch.randint(0, 4223, (206988, )).to(dtype=torch.int32).cuda()
    ranks_bev = torch.randint(0, 79972, (206988, )).to(dtype=torch.int32).cuda()
    bev_feat_shape = (1, 8, 100, 100, 80)
    interval_starts = torch.randint(0, 79972, (52815, )).to(dtype=torch.int32).cuda()
    interval_lengths = torch.randint(0, 213, (52815, )).to(dtype=torch.int32).cuda()

    # Define the model and the input
    model = Bev_Pool_v2().eval().cuda()
    model.forward = model.forward_dummy
    input_ = [depth, feat, ranks_depth, ranks_feat, ranks_bev, bev_feat_shape, interval_starts, interval_lengths]

    # Feed the input to the model
    model(input_)
    print('feed-forward inference is done without errors.')

  3. Yet the error mentioned above appears when exporting the model.
    with torch.no_grad():
        torch.onnx.export(
            model,
            input_,
            'bev_pool_v2_USE.onnx',
            export_params=True,
            keep_initializers_as_inputs=True,
            do_constant_folding=False,
            verbose=True,
            opset_version=13
        )


Despite exploring various solutions, I have yet to resolve this error.

chl916185 commented 5 months ago

How to solve it? @johnyang-nv

royalneverwin commented 2 months ago

You can find the definition of bev_pool_v2's ONNX op in the BEVDet repo. I previously exported the ONNX file of BEVDepth using it, and it worked.
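
For readers unfamiliar with what "defining an ONNX op" for a custom autograd function means, here is a minimal sketch. It is not the BEVDet definition: `ToyPoolFunction`, `ToyPool`, `export_toy_pool`, and the node name `custom_ops::ToyPool` are hypothetical, and the element-wise product only stands in for the real CUDA kernel. It assumes the TorchScript-based `torch.onnx.export` path, which calls a `symbolic` static method on a `torch.autograd.Function` to emit a custom node instead of tracing into `forward`.

```python
import torch


class ToyPoolFunction(torch.autograd.Function):
    """Hypothetical stand-in for a custom op such as QuickCumsumCuda."""

    @staticmethod
    def symbolic(g, depth, feat):
        # Used by the TorchScript-based ONNX exporter to emit one custom node.
        # "custom_ops::ToyPool" is a made-up name for illustration only.
        return g.op("custom_ops::ToyPool", depth, feat)

    @staticmethod
    def forward(ctx, depth, feat):
        # Placeholder computation standing in for the real kernel.
        return depth * feat


class ToyPool(torch.nn.Module):
    def forward(self, depth, feat):
        return ToyPoolFunction.apply(depth, feat)


def export_toy_pool(path="toy_pool.onnx"):
    """Export the toy module; only tensors are passed, as a tuple of args."""
    model = ToyPool().eval()
    depth = torch.rand(1, 6, 8, 4, 4)
    feat = torch.rand(1, 6, 8, 4, 4)
    with torch.no_grad():
        torch.onnx.export(model, (depth, feat), path, opset_version=13)
```

A file exported this way contains a node in a custom domain, so whatever runtime loads it would need its own implementation (for example a plugin) of that node. Whether adding such a `symbolic` method resolves the `_Map_base::at` error reported above is not confirmed in this thread.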