PJLab-ADG / neuralsim

neuralsim: 3D surface reconstruction and simulation based on 3D neural rendering.
MIT License

code_multi train waymo dataset reproduce problem #54

Open blackmrb opened 3 months ago

blackmrb commented 3 months ago

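First, thank you to the authors for generously open-sourcing this repo. I have tried several methods, and neuralsim gives the best results so far.

I ran 18 experiments on Waymo (the scenes were picked from the 81 dynamic scenes provided in the repo and cover both low-speed and high-speed driving). In 7 of them the loss did not converge. Config used: all_occ.with_normals.240201.yaml. Segmentation model used: https://github.com/open-mmlab/mmsegmentation/tree/main/configs/mask2former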

image
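Questions:

  1. Loss not converging
    1. Why does the loss fail to converge when the ego vehicle is fast? Which hyperparameters should be tuned for scenes with a high-speed ego vehicle? In the default config's scene, segment-9653249092, the ego vehicle drives slowly, and in my experiments this scene gives the best result, but our use case cares more about high-speed scenes.
  2. What are some ideas for improving poor reconstruction quality? (videos below)
    1. Lane markings are blurry
    2. An extra cloud-like or droplet-like blob appears in the sky. What might cause it, and how could it be mitigated?
    3. What causes a completely blurred result? See the seg-364414 video
    4. Distant regions flicker a few times in the first few frames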

Detailed reproduction results:

Loss did not converge and training crashed partway through (7 scenes); ego speed is around 70km/h

    # - segment-1758724094753801109_1251_037_1271_037_with_camera_labels # 64km/h
    # - segment-3490810581309970603_11125_000_11145_000_with_camera_labels # 71km/h
    # - segment-3591015878717398163_1381_280_1401_280_with_camera_labels # 68km/h
    # - segment-4468278022208380281_455_820_475_820_with_camera_labels # 70km/h, good case
    # - segment-4537254579383578009_3820_000_3840_000_with_camera_labels # 68km/h, good case
    # - segment-10072231702153043603_5725_000_5745_000_with_camera_labels # 40 -> 70km/h, wide open road, only one vehicle ahead
    # - segment-11454085070345530663_1905_000_1925_000_with_camera_labels # 70km/h, good case

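Training completed (11 scenes)

Good results (4 scenes)

segment-9653249092275997647_980_000_1000_000_with_camera_labels, 0, 190 # Intersection with many pedestrians. This scene gives the best result. It is the default scene used by code_multi/configs/exps/fg_neus=permuto/all_occ.with_normals.240201.yaml, and the ego vehicle drives at low speed.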

https://github.com/PJLab-ADG/neuralsim/assets/165770555/bbb374ed-8616-4135-be02-2fe7046373e1

119178, low-speed driving, 17km/h

https://github.com/PJLab-ADG/neuralsim/assets/165770555/eb5e982e-79af-48d0-a7d3-818711d8360b

365758, low-speed driving, 20km/h

https://github.com/PJLab-ADG/neuralsim/assets/165770555/db972f75-7284-40bd-aba7-4548ac32f769

189139, low-speed driving, 25km/h

https://github.com/PJLab-ADG/neuralsim/assets/165770555/9cf6572c-465a-4225-84c4-46eeff467286

Blurry lane markings (2 scenes)

segment-15053781258223091665_3192_117_3212_117_with_camera_labels # 20 -> 75km/h. Problem: lane markings are blurry. (image)

https://github.com/PJLab-ADG/neuralsim/assets/165770555/cfbf155e-fbaf-4220-8e2a-5248b649ff35
seg188749, fine overall, but lane markings are blurry, 36km/h

https://github.com/PJLab-ADG/neuralsim/assets/165770555/7c1e9089-5d2f-40ce-afd0-1f7916edefde

Extra blob in the sky (3 scenes)

segment-14369250836076988112_7249_040_7269_040_with_camera_labels # 56km/h. Problem: an extra blob appears in the sky. (image) https://github.com/PJLab-ADG/neuralsim/assets/165770555/f4fb8402-2f6f-44e9-b3c5-a96bbb0ed5c0

177369, an extra patch of something appears in the sky, 17km/h

https://github.com/PJLab-ADG/neuralsim/assets/165770555/c91dd1e7-ae29-49ec-9ec9-5fb140a99c14

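391943, low-speed driving at night, decent result, but a patch of raindrop-like artifacts appears in the sky, 15km/h (image)

Completely blurred (1 scene)

364414, low-speed driving, the street is completely blurred, very poor result, 0 -> 50km/h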

https://github.com/PJLab-ADG/neuralsim/assets/165770555/167e9e72-9352-4ae9-b21c-19eb4c6ce460

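Distant regions flicker in the first few frames (1 scene)

416406, 0 -> 25km/h https://github.com/PJLab-ADG/neuralsim/assets/165770555/ed86c126-da63-4b0c-b70b-a6cc0c9222d4

Below is a loss comparison between the 7 failed scenes and one successful scene (seg965324, green)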

pixel loss

image

image

lidar loss

image

Below are the detailed logs for each experiment

Loss NaN

segment-1758724094753801109_1251_037_1271_037_with_camera_labels

image

segment-10072231702153043603_5725_000_5745_000_with_camera_labels

train-20240505234622385.log

segment-11454085070345530663_1905_000_1925_000_with_camera_labels

train-20240505234659815.log

segment-4537254579383578009_3820_000_3840_000_with_camera_labels

train-20240505234419432.log

2024-05-05T17:52:59.168990620Z 
 61%|██████    | 9174/15000 [1:51:42<49:52,  1.95it/s, loss_total=1.9]
2024-05-05T17:52:59.672168504Z 
 61%|██████    | 9174/15000 [1:51:42<49:52,  1.95it/s, loss_total=1.95]
2024-05-05T17:52:59.672575273Z 
 61%|██████    | 9175/15000 [1:51:43<49:34,  1.96it/s, loss_total=1.95]
2024-05-05T17:53:00.187376121Z 
 61%|██████    | 9175/15000 [1:51:43<49:34,  1.96it/s, loss_total=1.55]
2024-05-05T17:53:00.187871303Z 
 61%|██████    | 9176/15000 [1:51:43<49:41,  1.95it/s, loss_total=1.55]
2024-05-05T17:53:00.854840512Z 
 61%|██████    | 9176/15000 [1:51:43<49:41,  1.95it/s, loss_total=1.77]
2024-05-05T17:53:00.855162754Z 
 61%|██████    | 9177/15000 [1:51:44<54:12,  1.79it/s, loss_total=1.77]
2024-05-05T17:53:01.381251068Z 
 61%|██████    | 9177/15000 [1:51:44<54:12,  1.79it/s, loss_total=1.84]
2024-05-05T17:53:01.381580806Z 
 61%|██████    | 9178/15000 [1:51:45<53:16,  1.82it/s, loss_total=1.84]
2024-05-05T17:53:01.893998791Z 
 61%|██████    | 9178/15000 [1:51:45<53:16,  1.82it/s, loss_total=1.67]
2024-05-05T17:53:01.894265751Z 
 61%|██████    | 9179/15000 [1:51:45<52:12,  1.86it/s, loss_total=1.67]
2024-05-05T17:53:02.518480330Z 
 61%|██████    | 9179/15000 [1:51:45<52:12,  1.86it/s, loss_total=1.52]
2024-05-05T17:53:02.518743279Z 
 61%|██████    | 9180/15000 [1:51:46<54:42,  1.77it/s, loss_total=1.52]
2024-05-05T17:53:03.074412687Z 
 61%|██████    | 9180/15000 [1:51:46<54:42,  1.77it/s, loss_total=1.29]
2024-05-05T17:53:03.074830105Z 
 61%|██████    | 9181/15000 [1:51:46<54:27,  1.78it/s, loss_total=1.29]
2024-05-05T17:53:03.582773677Z 
 61%|██████    | 9181/15000 [1:51:46<54:27,  1.78it/s, loss_total=1.35]
2024-05-05T17:53:03.583124855Z 
 61%|██████    | 9182/15000 [1:51:47<52:54,  1.83it/s, loss_total=1.35]
2024-05-05T17:53:04.045558308Z 
 61%|██████    | 9182/15000 [1:51:47<52:54,  1.83it/s, loss_total=1.27]
2024-05-05T17:53:04.045775159Z 
 61%|██████    | 9183/15000 [1:51:47<50:29,  1.92it/s, loss_total=1.27]
2024-05-05T17:53:04.609443972Z 
 61%|██████    | 9183/15000 [1:51:47<50:29,  1.92it/s, loss_total=1.95]
2024-05-05T17:53:04.609845332Z 
 61%|██████    | 9184/15000 [1:51:48<51:43,  1.87it/s, loss_total=1.95]
2024-05-05T17:53:04.788085968Z 
 61%|██████    | 9184/15000 [1:51:48<51:43,  1.87it/s, loss_total=1.76]
 61%|██████    | 9184/15000 [1:51:48<1:10:48,  1.37it/s, loss_total=1.76]
2024-05-05T17:53:04.788116308Z 
  0%|          | 0/1 [2:07:17<?, ?it/s]
2024-05-05T17:53:04.788133860Z Error occurred in exp: logs/waymo/code_multi/fg_neus=permuto/all_occ.with_normals.24020/seg453725
2024-05-05T17:53:04.802094313Z Traceback (most recent call last):
2024-05-05T17:53:04.802125179Z   File "dataio/autonomous_driving/waymo/train_multi_and_eval_multiple.py", line 30, in <module>
2024-05-05T17:53:04.802129404Z     train_main(sce_args)
2024-05-05T17:53:04.802132444Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1537, in main_function
2024-05-05T17:53:04.802135738Z     raise e
2024-05-05T17:53:04.802138554Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1529, in main_function
2024-05-05T17:53:04.802141435Z     train_step()
2024-05-05T17:53:04.802144030Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
2024-05-05T17:53:04.802147167Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
2024-05-05T17:53:04.802150041Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
2024-05-05T17:53:04.802153010Z     ret = self.fn(*args, **kwargs)
2024-05-05T17:53:04.802155911Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1345, in train_step
2024-05-05T17:53:04.802158779Z     ret, losses = trainer('pixel', sample, ground_truth, local_it, logger=logger)
2024-05-05T17:53:04.802161624Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
2024-05-05T17:53:04.802182312Z     return forward_call(*input, **kwargs)
2024-05-05T17:53:04.802185404Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
2024-05-05T17:53:04.802188365Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
2024-05-05T17:53:04.802191166Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
2024-05-05T17:53:04.802194122Z     ret = self.fn(*args, **kwargs)
2024-05-05T17:53:04.802196831Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 344, in forward
2024-05-05T17:53:04.802199701Z     ret, losses = self.train_step_pixel(sample, ground_truth, it, logger=logger)
2024-05-05T17:53:04.802202496Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 500, in train_step_pixel
2024-05-05T17:53:04.802205371Z     ret = self.renderer.render(
2024-05-05T17:53:04.802208018Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 937, in render
2024-05-05T17:53:04.802211058Z     ret = self(*rays, scene=scene, observer=observer, **kwargs)
2024-05-05T17:53:04.802213869Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
2024-05-05T17:53:04.802216850Z     return forward_call(*input, **kwargs)
2024-05-05T17:53:04.802220840Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 96, in forward
2024-05-05T17:53:04.802223668Z     return self.ray_query(*args, **kwargs)
2024-05-05T17:53:04.802226364Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
2024-05-05T17:53:04.802229416Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
2024-05-05T17:53:04.802232338Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
2024-05-05T17:53:04.802235185Z     ret = self.fn(*args, **kwargs)
2024-05-05T17:53:04.802237827Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 627, in ray_query
2024-05-05T17:53:04.802240628Z     batched_query_shared(model, group)
2024-05-05T17:53:04.802243280Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 263, in batched_query_shared
2024-05-05T17:53:04.802246252Z     raw_ret: dict = model.batched_ray_query(
2024-05-05T17:53:04.802248936Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/fields_conditional_dynamic/neus/renderer_mixin.py", line 288, in batched_ray_query
2024-05-05T17:53:04.802252145Z     details['accel'] = self.accel.debug_stats()
2024-05-05T17:53:04.802254880Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
2024-05-05T17:53:04.802260619Z     return func(*args, **kwargs)
2024-05-05T17:53:04.802263383Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/accelerations/occgrid_accel/batched_dynamic.py", line 253, in debug_stats
2024-05-05T17:53:04.802266720Z     **tensor_statistics(num_occupied_per_nonempty_ins, 'per_ins.nonempty.num_occupied', metrics=['mean', 'min', 'max', 'std']), 
2024-05-05T17:53:04.802269593Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/utils.py", line 797, in tensor_statistics
2024-05-05T17:53:04.802272508Z     return {f"{prefix}{'.' if prefix and not prefix.endswith('.') else ''}{key}": metric_fn[key](data).item() for key in metrics if key in metric_fn}
2024-05-05T17:53:04.802275665Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/utils.py", line 797, in <dictcomp>
2024-05-05T17:53:04.802278848Z     return {f"{prefix}{'.' if prefix and not prefix.endswith('.') else ''}{key}": metric_fn[key](data).item() for key in metrics if key in metric_fn}
2024-05-05T17:53:04.802282326Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/utils.py", line 785, in <lambda>
2024-05-05T17:53:04.802285378Z     "min": lambda x: x.min(),
2024-05-05T17:53:04.802288186Z RuntimeError: min(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.
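
The traceback above ends in `tensor_statistics` calling `x.min()` on an input with zero elements, which PyTorch rejects when no `dim` is given. One possible way to keep a debug-only statistics path from crashing is to skip the reduction when there is nothing to reduce. The sketch below illustrates that guard on plain Python sequences. `safe_statistics` is a hypothetical helper, not part of nr3d_lib; with tensors the analogous check would be `data.numel() == 0`. It only removes the symptom: an empty input here suggests that no instance had occupied cells at that step, which probably still needs investigating.

```python
import statistics
from typing import Dict, Sequence


def safe_statistics(
    data: Sequence[float],
    prefix: str = "",
    metrics: Sequence[str] = ("mean", "min", "max", "std"),
) -> Dict[str, float]:
    """Return '<prefix>.<metric>' -> value, or {} when `data` is empty.

    Hypothetical stand-in for a debug statistics helper; it is NOT the
    nr3d_lib implementation and works on plain Python sequences.
    """
    metric_fn = {
        "mean": statistics.fmean,
        "min": min,
        "max": max,
        # Population std, so a single-element input does not raise.
        "std": statistics.pstdev,
    }
    # Guard: reductions such as min()/max() are undefined on empty input.
    if len(data) == 0:
        return {}
    sep = "." if prefix and not prefix.endswith(".") else ""
    return {
        f"{prefix}{sep}{key}": float(metric_fn[key](data))
        for key in metrics
        if key in metric_fn
    }


if __name__ == "__main__":
    print(safe_statistics([], "per_ins.nonempty.num_occupied"))  # {}
    print(safe_statistics([3, 5, 4], "per_ins.nonempty.num_occupied", ["min", "max"]))
```

Returning an empty dict means the caller simply logs nothing for that step instead of aborting a run that is almost two hours in.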

train-20240505233823947.log

AssertionError: Occupancy grid becomes empty during training.

scenario_id=segment-3490810581309970603_11125_000_11145_000_with_camera_labels

34: 2024-05-05T17:40:15.446258349Z 
 95%|█████████▌| 14267/15000 [1:52:54<05:26,  2.24it/s, loss_total=0.326]
35: 2024-05-05T17:40:15.446497280Z 
 95%|█████████▌| 14268/15000 [1:52:54<05:16,  2.32it/s, loss_total=0.326]
36: 2024-05-05T17:40:15.809401574Z 
 95%|█████████▌| 14268/15000 [1:52:54<05:16,  2.32it/s, loss_total=0.26] 
37: 2024-05-05T17:40:15.809610367Z 
 95%|█████████▌| 14269/15000 [1:52:55<05:00,  2.43it/s, loss_total=0.26]
38: 2024-05-05T17:40:16.275019572Z 
 95%|█████████▌| 14269/15000 [1:52:55<05:00,  2.43it/s, loss_total=0.24]
39: 2024-05-05T17:40:16.275154072Z 
 95%|█████████▌| 14270/15000 [1:52:55<05:12,  2.34it/s, loss_total=0.24]
40: 2024-05-05T17:40:16.699921008Z 
 95%|█████████▌| 14270/15000 [1:52:55<05:12,  2.34it/s, loss_total=0.372]
41: 2024-05-05T17:40:16.700209054Z 
 95%|█████████▌| 14271/15000 [1:52:56<05:11,  2.34it/s, loss_total=0.372]
42: 2024-05-05T17:40:17.101061717Z 
 95%|█████████▌| 14271/15000 [1:52:56<05:11,  2.34it/s, loss_total=0.239]
43: 2024-05-05T17:40:17.101251008Z 
 95%|█████████▌| 14272/15000 [1:52:56<05:05,  2.39it/s, loss_total=0.239]
44: 2024-05-05T17:40:17.465482350Z 
 95%|█████████▌| 14272/15000 [1:52:56<05:05,  2.39it/s, loss_total=0.436]
 95%|█████████▌| 14272/15000 [1:52:56<05:45,  2.11it/s, loss_total=0.436]
45: 2024-05-05T17:40:17.465515750Z 
  0%|          | 0/1 [1:57:24<?, ?it/s]
46: 2024-05-05T17:40:17.465537233Z Error occurred in exp: logs/waymo/code_multi/fg_neus=permuto/all_occ.with_normals.24020/seg349081
47: 2024-05-05T17:40:17.468884262Z Traceback (most recent call last):
48: 2024-05-05T17:40:17.468894599Z   File "dataio/autonomous_driving/waymo/train_multi_and_eval_multiple.py", line 30, in <module>
49: 2024-05-05T17:40:17.468897375Z     train_main(sce_args)
50: 2024-05-05T17:40:17.468899457Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1537, in main_function
51: 2024-05-05T17:40:17.468901977Z     raise e
52: 2024-05-05T17:40:17.468903926Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1529, in main_function
53: 2024-05-05T17:40:17.468906147Z     train_step()
54: 2024-05-05T17:40:17.468908124Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
55: 2024-05-05T17:40:17.468910394Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
56: 2024-05-05T17:40:17.468912465Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
57: 2024-05-05T17:40:17.468914765Z     ret = self.fn(*args, **kwargs)
58: 2024-05-05T17:40:17.468916721Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1400, in train_step
59: 2024-05-05T17:40:17.468918774Z     ret, losses = trainer('lidar', sample, ground_truth, local_it, logger=logger)
60: 2024-05-05T17:40:17.468920915Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
61: 2024-05-05T17:40:17.468923227Z     return forward_call(*input, **kwargs)
62: 2024-05-05T17:40:17.468925196Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
63: 2024-05-05T17:40:17.468927319Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
64: 2024-05-05T17:40:17.468929319Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
65: 2024-05-05T17:40:17.468931591Z     ret = self.fn(*args, **kwargs)
66: 2024-05-05T17:40:17.468933531Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 348, in forward
67: 2024-05-05T17:40:17.468935587Z     ret, losses = self.train_step_lidar(sample, ground_truth, it, logger=logger)
68: 2024-05-05T17:40:17.468937575Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 765, in train_step_lidar
69: 2024-05-05T17:40:17.468939672Z     ret = self.renderer.render(
70: 2024-05-05T17:40:17.468949806Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 937, in render
71: 2024-05-05T17:40:17.468952085Z     ret = self(*rays, scene=scene, observer=observer, **kwargs)
72: 2024-05-05T17:40:17.468954154Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
73: 2024-05-05T17:40:17.468956293Z     return forward_call(*input, **kwargs)
74: 2024-05-05T17:40:17.468958702Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 96, in forward
75: 2024-05-05T17:40:17.468960834Z     return self.ray_query(*args, **kwargs)
76: 2024-05-05T17:40:17.468962792Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
77: 2024-05-05T17:40:17.468964964Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
78: 2024-05-05T17:40:17.468966995Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
79: 2024-05-05T17:40:17.468969282Z     ret = self.fn(*args, **kwargs)
80: 2024-05-05T17:40:17.468971206Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 627, in ray_query
81: 2024-05-05T17:40:17.468973299Z     batched_query_shared(model, group)
82: 2024-05-05T17:40:17.468975245Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 259, in batched_query_shared
83: 2024-05-05T17:40:17.468977324Z     model.set_condition(batched_infos)
84: 2024-05-05T17:40:17.468979240Z   File "/home/rongbo.ma/neuralsim/app/models/shared/batched_neus.py", line 404, in set_condition
85: 2024-05-05T17:40:17.468981351Z     super().set_condition(z=z_ins_per_batch, ins_inds_per_batch=ins_inds_per_batch)
86: 2024-05-05T17:40:17.468983453Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/fields_conditional/neus/renderer_mixin.py", line 105, in set_condition
87: 2024-05-05T17:40:17.468985675Z     self.accel.cur_batch__step(self.it, self.query_sdf)
88: 2024-05-05T17:40:17.468987914Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
89: 2024-05-05T17:40:17.468990095Z     return func(*args, **kwargs)
90: 2024-05-05T17:40:17.468992076Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/accelerations/occgrid_accel/batched.py", line 149, in cur_batch__step
91: 2024-05-05T17:40:17.468994266Z     updated = self.occ.step(cur_it, val_query_fn_normalized_x_bi, within_bi=self.ins_inds_per_batch, logger=logger)
92: 2024-05-05T17:40:17.468996555Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
93: 2024-05-05T17:40:17.468998669Z     return func(*args, **kwargs)
94: 2024-05-05T17:40:17.469000564Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/accelerations/occgrid/ema_batched.py", line 145, in step
95: 2024-05-05T17:40:17.469002721Z     self._step(cur_it, val_query_fn_normalized_x_bi, within_bi=within_bi, 
96: 2024-05-05T17:40:17.469007031Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
97: 2024-05-05T17:40:17.469009352Z     return func(*args, **kwargs)
98: 2024-05-05T17:40:17.469011551Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/accelerations/occgrid/ema_batched.py", line 193, in _step
99: 2024-05-05T17:40:17.469013680Z     assert idx_nonempty.numel() > 0, "Occupancy grid becomes empty during training. Your model/algorithm/training setting might be incorrect. Please check."
100: 2024-05-05T17:40:17.469016117Z AssertionError: Occupancy grid becomes empty during training. Your model/algorithm/training setting might be incorrect. Please check
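
With 18 runs, it can help to pull the last reported iteration, the last `loss_total`, and the final exception line out of each log automatically. The following is a minimal sketch, assuming logs shaped like the tqdm output and tracebacks pasted above. `summarize_log` and its regular expressions are illustrative and are not part of neuralsim.

```python
import re
from typing import Dict, Optional

# tqdm-style progress line, e.g. "| 9184/15000 [1:51:48<..., loss_total=1.76]"
PROGRESS_RE = re.compile(
    r"\|\s*(\d+)/(\d+) \[.*?loss_total=([0-9.]+|nan)", re.IGNORECASE
)
# Final exception line, e.g. "RuntimeError: min(): Expected ..."
ERROR_RE = re.compile(r"\b(\w+(?:Error|Exception)): (.*)")


def summarize_log(text: str) -> Dict[str, Optional[object]]:
    """Extract the last iteration, last loss, and last exception from a log."""
    summary: Dict[str, Optional[object]] = {
        "last_iter": None,
        "total_iters": None,
        "last_loss": None,
        "error": None,
    }
    for line in text.splitlines():
        progress = PROGRESS_RE.search(line)
        if progress:
            summary["last_iter"] = int(progress.group(1))
            summary["total_iters"] = int(progress.group(2))
            summary["last_loss"] = float(progress.group(3))
        error = ERROR_RE.search(line)
        if error:
            summary["error"] = f"{error.group(1)}: {error.group(2)}"
    return summary


if __name__ == "__main__":
    sample = (
        " 95%|#####| 14272/15000 [1:52:56<05:45,  2.11it/s, loss_total=0.436]\n"
        "2024-05-05T17:40:17.469016117Z AssertionError: Occupancy grid becomes empty during training.\n"
    )
    print(summarize_log(sample))
```

Running it over every `train-*.log` file gives a quick table of how far each scene got before it failed and with which error.
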
blackmrb commented 3 months ago

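A follow-up on the 7 failed cases: I split each 20 s scene into two 10 s halves and retrained. I suspect the speed is too high, so the ego vehicle travels too far and the scene becomes too large.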

    # - segment-1758724094753801109_1251_037_1271_037_with_camera_labels # 64km/h
    # - segment-3490810581309970603_11125_000_11145_000_with_camera_labels # 71km/h
    # - segment-3591015878717398163_1381_280_1401_280_with_camera_labels # 68km/h
    # - segment-4468278022208380281_455_820_475_820_with_camera_labels # 70km/h, good case
    # - segment-4537254579383578009_3820_000_3840_000_with_camera_labels # 68km/h, good case
    # - segment-10072231702153043603_5725_000_5745_000_with_camera_labels # 40 -> 70km/h, wide open road, only one vehicle ahead
    # - segment-11454085070345530663_1905_000_1925_000_with_camera_labels # 70km/h, good case
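
The suspicion that high speed makes the scene too large can be checked with quick arithmetic: at a constant speed, the distance covered is speed times duration. The sketch below compares a fast and a slow scene and shows how a frame range could be cut into halves. Both helper names are hypothetical, and the 200-frame range assumes roughly 10 Hz sampling over 20 s, which is an assumption about the data rather than something stated in this thread.

```python
from typing import List, Tuple


def travelled_distance_m(speed_kmh: float, duration_s: float) -> float:
    """Distance in metres covered at a constant speed (km/h) over duration_s."""
    return speed_kmh / 3.6 * duration_s


def split_frame_range(start: int, stop: int, parts: int = 2) -> List[Tuple[int, int]]:
    """Split the half-open range [start, stop) into contiguous sub-ranges."""
    total = stop - start
    bounds = [start + (total * i) // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    print(round(travelled_distance_m(70, 20), 1))  # 388.9
    print(round(travelled_distance_m(17, 20), 1))  # 94.4
    print(round(travelled_distance_m(70, 10), 1))  # 194.4
    # Assumed ~200 frames for a 20 s scene at ~10 Hz.
    print(split_frame_range(0, 200))  # [(0, 100), (100, 200)]
```

Under these assumptions, a 70km/h scene spans roughly four times the distance of a 17km/h scene, and halving the duration still leaves it about twice as long as the full low-speed scene.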

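Conclusions:

  1. Rendering is good overall, but almost every scene has an extra blob in the sky that looks like a cloud
  2. Tunnels are very blurry
  3. Two scenes have no ground, for reasons I do not understand

0-10s

The distant mountain suddenly disappears, and right afterwards a cloud-like blob pops up and moves toward the ego vehicle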

https://github.com/PJLab-ADG/neuralsim/assets/165770555/2731ea0e-88fd-404f-bd47-61a9224bb6c5

The ground is missing; training failed

https://github.com/PJLab-ADG/neuralsim/assets/165770555/7a77adca-7690-403f-acec-8f8d1cfd15c8

https://github.com/PJLab-ADG/neuralsim/assets/165770555/cfc43d25-65be-4760-840a-8a2f5d1b3a2e

The overpass suddenly disappears

https://github.com/PJLab-ADG/neuralsim/assets/165770555/ec0f6022-b32f-4dbe-b70f-f05d6ae992c8

10-20s

Every scene has an extra blob in the sky that rushes toward the ego vehicle. In this one the blob is especially severe.

https://github.com/PJLab-ADG/neuralsim/assets/165770555/b7975d1c-6288-439b-a47d-9b77c478601e

Under the overpass, very blurry.

https://github.com/PJLab-ADG/neuralsim/assets/165770555/93ab9728-c7fc-4492-809f-76e8ce38f057