Open LMD0311 opened 11 months ago
Thank you for your inspiring work. I tried to reproduce the results: the `avg val IoU` is 26.15 and the `avg val mIoU` is 16.72, which are similar to the results in the paper.

However, my visualization results confuse me. During training, the Transformer takes the 1st-15th frames as input and predicts the 2nd-16th frames. Here are some visualization results.
- 2nd GT:
- 3rd GT:
- 16th GT:
- 2nd Predict:
- 3rd Predict:
- 16th Predict:
The results are confusing. Even the reconstructions of the 2nd and 3rd frames are unsatisfying, and I cannot find any connection between them. @wzzheng Could the authors provide any help?
Visualization code comes from https://github.com/wzzheng/TPVFormer/blob/main/visualization/vis_frame.py.
Since the visualized GT is quite reasonable, I guess my visualization code works fine.
BTW, the predicted Occ I chose comes from `pred` at
https://github.com/wzzheng/OccWorld/blob/65658b16669493cc3f428bc615112bb22aede8f9/model/TransVQVAE.py#L168
and the GT Occ I chose comes from `output_dict['target_occs']` at
https://github.com/wzzheng/OccWorld/blob/65658b16669493cc3f428bc615112bb22aede8f9/model/TransVQVAE.py#L137
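
For reference, the one-frame offset described above and a per-frame numerical check could be sketched as follows. This is only a minimal illustration, not the OccWorld code: the `(T, H, W, D)` label layout, the `EMPTY_LABEL` value, and the helper names `shift_pairs` and `per_frame_iou` are all assumptions made for the example.

```python
import numpy as np

EMPTY_LABEL = 17  # assumption: index of the "empty/free" class; adjust to the dataset config


def shift_pairs(occ_seq):
    """Split a (T, H, W, D) sequence into inputs (frames 1..T-1) and targets (frames 2..T)."""
    return occ_seq[:-1], occ_seq[1:]


def per_frame_iou(pred, target, empty_label=EMPTY_LABEL):
    """Occupied-vs-empty IoU for each frame of two (T, H, W, D) label arrays."""
    pred_occ = pred != empty_label
    tgt_occ = target != empty_label
    axes = tuple(range(1, pred.ndim))  # reduce over every axis except the frame axis
    inter = np.logical_and(pred_occ, tgt_occ).sum(axis=axes)
    union = np.logical_or(pred_occ, tgt_occ).sum(axis=axes)
    return inter / np.maximum(union, 1)  # frames with an empty union get IoU 0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq = rng.integers(0, 18, size=(16, 8, 8, 4))  # toy stand-in for 16 occupancy frames
    inputs, targets = shift_pairs(seq)
    print(inputs.shape, targets.shape)
    print(per_frame_iou(targets, targets))
```

Printing the per-frame IoU for exactly the arrays passed to the visualizer would show whether the frames being drawn agree with the reported validation numbers, or whether a different tensor (or frame index) is being visualized.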