Hi @PaulSudarshan,
You can visualize your predictions using the aiMotive Dataset Loader repository.
When you obtain the predicted bounding boxes after inference, you can visualize them using the example renderer:

PYTHONPATH=$PYTHONPATH: python examples/example_render.py --root-dir PATH_TO_AIMOTIVE_DATA --split val
You need to update the __getitem__() method of /src/data_loader.py:
def __getitem__(self, path: str) -> DataItem:
    """
    Returns sensor data for a given keyframe.

    Args:
        path: path of the keyframe's annotation file

    Returns:
        a DataItem with annotations and sensor data
    """
    data_folder = self.get_directory(path)
    frame_id = self.get_frame_id(path)

    ### THIS LINE IS ADDED TO THE ORIGINAL CODE FOR VISUALIZING PREDICTIONS.
    path = "YOUR_BASE_PATH/EXPERIMENT_NAME/outputs/val" + path.split('val')[1]
    ####

    annotations = Annotation(path)
    lidar_data = load_lidar_data(data_folder, frame_id)
    radar_data = load_radar_data(data_folder, frame_id)
    camera_data = load_camera_data(data_folder, frame_id)

    return DataItem(annotations, lidar_data, radar_data, camera_data)
If you want to filter predictions by confidence, add these lines to the beginning of the is_in_fov(...) method of /src/renderer.py:

if obj['Score'] < 0.2:
    return False
Hi @TamasMatuszka, I want to know how to visualize the 3D bounding boxes from the inference JSON files. Specifically, I want a snippet that would parse the inference JSON and overlay the 3D bounding boxes on the image.
Hi @PaulSudarshan,
The aiMotive Dataset Loader repository can be used, with the above-mentioned modification, for visualizing the 3D bounding boxes from the inference .json files. The .json files are saved using the same directory structure as the ground truth, so only the JSON path needs to be updated, as I showed in my previous comment.
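If you prefer to read the prediction files directly instead of going through the loader, a minimal sketch is below. Note that the key names (CapturedObjects, BoundingBox3D Origin X/Y/Z, Extent X/Y/Z, ObjectType, Score) are assumptions for illustration; open one of your prediction .json files and adjust them to the actual schema.

import json

def load_predicted_boxes(json_path: str, score_threshold: float = 0.2):
    """Reads one prediction .json and returns a list of box dicts.

    NOTE: the key names below are assumptions; adjust them to the actual
    schema of your prediction files if they differ.
    """
    with open(json_path) as f:
        frame = json.load(f)

    boxes = []
    for obj in frame.get('CapturedObjects', []):
        if obj.get('Score', 1.0) < score_threshold:
            continue  # same confidence filter as in is_in_fov(...)
        boxes.append({
            'center': (obj['BoundingBox3D Origin X'],
                       obj['BoundingBox3D Origin Y'],
                       obj['BoundingBox3D Origin Z']),
            'extent': (obj['BoundingBox3D Extent X'],
                       obj['BoundingBox3D Extent Y'],
                       obj['BoundingBox3D Extent Z']),
            'type':   obj.get('ObjectType'),
            'score':  obj.get('Score', 1.0),
        })
    return boxes

From there, overlaying the boxes on a camera image is exactly what the renderer already does, so reusing renderer.py with the path modification above remains the easiest route.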
@TamasMatuszka The 3D annotations present inside the directory /aimotive_dataset/val/highway/20210401-074452-00.01.00-00.01.15@Jarvis/dynamic/box/3d_body belong to which of the following cameras:
B_MIDRANGECAM_C, F_MIDLONGRANGECAM_CL, F_MIDLONGRANGECAM_CR, M_FISHEYE_L, M_FISHEYE_R
@PaulSudarshan the annotations are defined in the body coordinate system. For more details, please refer to Section 3.2 of the paper.
For checking whether a certain annotation is visible on a certain camera, please refer to renderer.py of the aiMotive Dataset Loader repository.
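To make the mapping concrete: the check boils down to transforming a body-frame point into a camera frame with the extrinsics and projecting it with the intrinsics. A minimal pinhole sketch follows (it ignores lens distortion and the fisheye model, so it is purely illustrative; renderer.py is the reference implementation):

import numpy as np

def is_visible(p_body, T_cam_from_body, K, img_w, img_h):
    """Rough pinhole visibility check for a point given in the body frame.

    p_body:          (3,) point in body coordinates (e.g. a box corner)
    T_cam_from_body: (4, 4) extrinsic matrix mapping body -> camera
    K:               (3, 3) camera intrinsic matrix
    """
    p_hom = np.append(np.asarray(p_body, dtype=float), 1.0)  # homogeneous point
    p_cam = T_cam_from_body @ p_hom                          # into the camera frame
    if p_cam[2] <= 0:                                        # behind the camera
        return False
    uvw = K @ p_cam[:3]
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]                  # pixel coordinates
    return 0 <= u < img_w and 0 <= v < img_h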
@TamasMatuszka it is not clear to me how the annotations of a single frame (.json) map to multiple cameras. As far as I can see, a single frame has a single annotated JSON file; how is that single annotated JSON file mapped to 4 different cameras?
@PaulSudarshan The annotations are in BEV space, meaning all objects around the car corresponding to a given frame are contained in a single JSON file. The neural network operates and detects in BEV space; therefore, this is a natural design choice. The visibility of a given object on a camera can be calculated with the code I sent previously.
Thanks for your explanation. I have another query regarding visualizing the predictions in BEV space: does the aiMotive Dataset Loader repository provide support for visualization in BEV?
@PaulSudarshan Sure, renderer.py has a render_lidar() method which is used for generating plots similar to Figure 14 in the paper.
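If you want to prototype your own BEV figure outside the repository, a rough matplotlib sketch is below. The box tuple format (cx, cy, length, width, yaw in the body frame) is an assumption for illustration; render_lidar() is the supported way to get plots like Figure 14.

import numpy as np
import matplotlib.pyplot as plt

def plot_bev(lidar_xyz, boxes, lim=80):
    """Tiny BEV plot: lidar points viewed from above plus box footprints.

    lidar_xyz: (N, 3) array of points in the body frame
    boxes:     list of (cx, cy, length, width, yaw) tuples in the body frame
    """
    fig, ax = plt.subplots(figsize=(8, 8))
    ax.scatter(lidar_xyz[:, 0], lidar_xyz[:, 1], s=0.2, c='gray')

    for cx, cy, length, width, yaw in boxes:
        # Footprint corners in the box frame, rotated by yaw and translated.
        corners = np.array([[ length / 2,  width / 2],
                            [ length / 2, -width / 2],
                            [-length / 2, -width / 2],
                            [-length / 2,  width / 2]])
        rot = np.array([[np.cos(yaw), -np.sin(yaw)],
                        [np.sin(yaw),  np.cos(yaw)]])
        corners = corners @ rot.T + np.array([cx, cy])
        ax.add_patch(plt.Polygon(corners, closed=True, fill=False, edgecolor='red'))

    ax.set_xlim(-lim, lim)
    ax.set_ylim(-lim, lim)
    ax.set_aspect('equal')
    ax.set_xlabel('x [m]')
    ax.set_ylabel('y [m]')
    plt.show()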
Thanks @TamasMatuszka. If I want to visualize on top of a top-down camera image instead of lidar/radar, how am I supposed to get the top-down camera image that needs to be passed to the following function?
@PaulSudarshan If you want to visualize the images in BEV, you can use IPM (inverse perspective mapping). Since you have 4 cameras, the projections need to be stitched, which might not be trivial. Some references regarding IPM:
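Independently of those references, a single-camera IPM can be done with a plain homography in OpenCV; stitching the four cameras then means warping each into a shared metric ground grid and blending the overlaps. A rough sketch follows, where the pixel correspondences and the file name are placeholders, not values taken from the dataset:

import cv2
import numpy as np

# Four ground-plane correspondences: pixels in the camera image (src) and the
# matching pixels in the desired top-down grid (dst). The numbers below are
# placeholders; derive them from the camera calibration or pick them manually
# on a flat road section.
src = np.float32([[420, 720], [860, 720], [700, 480], [580, 480]])  # image px
dst = np.float32([[300, 600], [340, 600], [340, 200], [300, 200]])  # BEV px

H = cv2.getPerspectiveTransform(src, dst)        # homography: image -> BEV
img = cv2.imread('front_camera_frame.jpg')       # placeholder file name
bev = cv2.warpPerspective(img, H, (640, 640))    # warped top-down view
cv2.imwrite('bev.jpg', bev)

In practice you would derive the ground-plane correspondences from the camera calibration rather than hand-picking them, and the fisheye cameras need undistortion first.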
Please provide a visualization script to visualize the predicted 3D bounding box on the image. I have attached a sample image and its corresponding bounding box. Thanks.
frame_0007242.json