I have a question about the ground-truth ego state used in the TCP and VAD baselines. TCP appears to use anno['x'], anno['y'], and anno['theta'], which are obtained from GPS (do these annotations contain noise?), while VAD uses world2lidar and applies some transformations to get the current and future states of the ego car. The values I get from the two approaches differ, which confuses me. Could you explain the difference?
Also, is it possible to construct sensor_infos['LIDAR_TOP']['world2lidar'] = world2lidar from x, y, and theta?
@robosuite All code follows one simple rule: model inputs should be the noisy values, to avoid a train-val gap; labels should be the accurate values, to avoid misleading gradient signals.
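On the second question, here is a minimal sketch of how a world-to-ego matrix could be built from a planar pose. It is an illustration under stated assumptions, not the code used by either baseline: the helper names `world2ego_from_pose` and `transform_point` are hypothetical, theta is assumed to be a yaw in radians measured in the same world frame as x and y, and z, roll, and pitch are ignored. Getting from there to world2lidar would also require the lidar mount extrinsic, which is not given in this thread.

```python
import math


def world2ego_from_pose(x, y, theta):
    """Return a 4x4 world-to-ego matrix for a planar pose.

    Assumption: theta is the ego yaw in radians, in the same world
    frame as (x, y). Height, roll, and pitch are ignored.
    """
    c, s = math.cos(theta), math.sin(theta)
    # ego2world is [R | t], with R a rotation about z and t = (x, y, 0).
    # Its inverse is [R^T | -R^T t], written out below.
    return [
        [c, s, 0.0, -(c * x + s * y)],
        [-s, c, 0.0, s * x - c * y],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(T, p):
    """Apply a 4x4 homogeneous transform T to a 3D point p."""
    px, py, pz = p
    return tuple(
        T[i][0] * px + T[i][1] * py + T[i][2] * pz + T[i][3]
        for i in range(3)
    )


# Ego at (10, 5), facing world +y. A world point two metres further
# along +y should land two metres ahead of the ego on its x axis.
T = world2ego_from_pose(10.0, 5.0, math.pi / 2)
ahead = transform_point(T, (10.0, 7.0, 0.0))
```

If x, y, and theta are the noisy input values, a matrix built this way inherits that noise, so by the rule above it would suit model inputs rather than labels. The yaw convention (zero direction and sign) also needs checking against the dataset before comparing values with the stored world2lidar.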