The DRAMA dataset was captured from a moving vehicle in highly interactive urban traffic scenes in Tokyo.
It comprises 17,785 scenario clips recorded with a SEKONIX SF332X-10X video camera (30 Hz frame rate, 1928 × 1280 resolution, 60° H-FOV) and a GoPro Hero 7 camera (60 Hz frame rate, 2704 × 1520 resolution, 118.2° H-FOV). Each clip is 2 seconds long.
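As a quick check on what a 2-second clip means for each camera, the sketch below derives the nominal frame count per clip from the frame rates quoted above. The class and field names are illustrative only and are not part of any DRAMA tooling; the 60 H-FOV value is read as degrees.

```python
from dataclasses import dataclass

CLIP_SECONDS = 2  # clip duration stated in the text


@dataclass(frozen=True)
class CameraSpec:
    """Camera parameters as quoted in the dataset description (names are illustrative)."""
    name: str
    frame_rate_hz: int
    width: int
    height: int
    hfov_deg: float  # horizontal field of view, assumed to be in degrees

    def frames_per_clip(self, seconds: float = CLIP_SECONDS) -> int:
        # Nominal count only: real recordings may drop or duplicate frames.
        return round(self.frame_rate_hz * seconds)


CAMERAS = [
    CameraSpec("SEKONIX SF332X-10X", 30, 1928, 1280, 60.0),
    CameraSpec("GoPro Hero 7", 60, 2704, 1520, 118.2),
]

for cam in CAMERAS:
    print(f"{cam.name}: {cam.frames_per_clip()} frames per {CLIP_SECONDS}-second clip")
```

Under these figures a clip holds a nominal 60 frames from the SEKONIX camera and 120 frames from the GoPro.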
The videos are synchronized with the Controller Area Network (CAN) signals and Inertial Measurement Unit (IMU) information.
The videos were filtered based on the ego-driver’s behavioral response to external situations or events that trigger braking of the vehicle.
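The braking-based filtering can be pictured with a minimal sketch. It assumes the CAN stream has already been reduced to (timestamp, brake_on) samples and that each clip ends at the moment braking starts. Both are assumptions made for illustration; the text does not say how the brake signal is encoded or where the 2-second window sits relative to the braking event.

```python
from typing import List, Tuple

CLIP_SECONDS = 2.0  # clip length stated in the text


def braking_onsets(can_samples: List[Tuple[float, bool]]) -> List[float]:
    """Return the timestamps at which the brake flag switches from off to on."""
    onsets = []
    previous = False
    for timestamp, brake_on in can_samples:
        if brake_on and not previous:
            onsets.append(timestamp)
        previous = brake_on
    return onsets


def clip_windows(onsets: List[float],
                 clip_seconds: float = CLIP_SECONDS) -> List[Tuple[float, float]]:
    """Build one (start, end) window per onset.

    ASSUMPTION: the window ends at the braking onset. The real dataset may
    place the window differently.
    """
    return [(t - clip_seconds, t) for t in onsets]


# Hypothetical CAN samples: (time in seconds, brake pressed?)
samples = [(0.0, False), (1.0, False), (2.5, True),
           (3.0, True), (4.0, False), (6.0, True)]
onsets = braking_onsets(samples)
windows = clip_windows(onsets)
print(onsets)
print(windows)
```

Detecting only the off-to-on transition keeps one candidate clip per braking event rather than one per sample during which the brake is held.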
The dataset contains several annotation types: video-level Q/A, object-level Q/A, risk object bounding boxes, free-form captions, and separate labels for ego-car intention, scene classifier, and suggestions to the driver.
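One way to picture the annotation types listed above is as a single record per clip. The following is a hypothetical container, not the dataset's actual file format: the field names, the bounding-box convention, and all example values are placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class QAPair:
    question: str
    answer: str


@dataclass
class DramaAnnotation:
    """Hypothetical per-clip record mirroring the annotation types in the text."""
    clip_id: str
    video_qa: List[QAPair] = field(default_factory=list)    # video-level Q/A
    object_qa: List[QAPair] = field(default_factory=list)   # object-level Q/A
    # ASSUMPTION: box stored as (x_min, y_min, x_max, y_max) in pixels.
    risk_object_box: Optional[Tuple[float, float, float, float]] = None
    caption: str = ""            # free-form caption
    ego_intention: str = ""      # ego-car intention label
    scene_classifier: str = ""   # scene classifier label
    suggestion: str = ""         # suggestion to the driver


# Placeholder values only; they do not come from the dataset.
example = DramaAnnotation(
    clip_id="clip_placeholder",
    video_qa=[QAPair("placeholder video-level question", "placeholder answer")],
    object_qa=[QAPair("placeholder object-level question", "placeholder answer")],
    risk_object_box=(100.0, 200.0, 300.0, 400.0),
    caption="placeholder caption",
    ego_intention="placeholder intention",
    scene_classifier="placeholder scene",
    suggestion="placeholder suggestion",
)
print(example.clip_id, len(example.video_qa), example.risk_object_box)
```

Keeping the bounding box optional allows for clips without a single localized risk object, which is itself an assumption about the data.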
MAPLM offers a variety of traffic scenarios, including highways, expressways, city roads, and rural roads, along with detailed intersection scenes. Each frame of data includes two components:
Point Cloud BEV: A projection image of 3D point cloud viewed from the BEV perspective with clear visuals and high resolution.
Panoramic Images: High-resolution photographs captured from front, left-rear, and right-rear angles by a wide-angle camera.
However, this dataset uses the same five questions for every frame.
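The per-frame layout described above (one point cloud BEV image, three camera views, and a fixed set of five questions) can be sketched as a small record with a consistency check. The names, paths, and the choice to store only answers per frame are assumptions for illustration and do not reflect the dataset's real file layout.

```python
from dataclasses import dataclass
from typing import Dict, List

CAMERA_VIEWS = ("front", "left_rear", "right_rear")  # views named in the text
QUESTIONS_PER_FRAME = 5                               # same five questions per frame


@dataclass
class MaplmFrame:
    """Hypothetical record for one frame; not the official schema."""
    frame_id: str
    bev_image_path: str                    # point cloud BEV projection image
    panoramic_image_paths: Dict[str, str]  # camera view name -> image path
    answers: List[str]                     # one answer per shared question

    def validate(self) -> None:
        missing = [v for v in CAMERA_VIEWS if v not in self.panoramic_image_paths]
        if missing:
            raise ValueError(f"missing camera views: {missing}")
        if len(self.answers) != QUESTIONS_PER_FRAME:
            raise ValueError(
                f"expected {QUESTIONS_PER_FRAME} answers, got {len(self.answers)}"
            )


# Placeholder paths and answers, for illustration only.
frame = MaplmFrame(
    frame_id="frame_placeholder",
    bev_image_path="bev.png",
    panoramic_image_paths={v: f"{v}.jpg" for v in CAMERA_VIEWS},
    answers=["placeholder"] * QUESTIONS_PER_FRAME,
)
frame.validate()  # raises ValueError if a view or an answer is missing
```

Because the questions are identical across frames, they can be stored once and only the answers kept per frame, which is the design choice this sketch makes.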
A dataset that adds question-and-answer annotations to frames from nuScenes.
The answers are in a short-answer format.
Example QA pairs:
Q: How many traffic cones are there?, A: 6
Q: Are there any parked things?, A: yes
Q: The thing that is to the front of the standing pedestrian and the back right of me is what?, A: traffic cone
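To make the short-answer format concrete, here is a minimal parsing sketch. It assumes each pair sits on its own line in the form "Q: ..., A: ..." as in the examples above. This textual form is how the pairs are displayed here and is not necessarily how the dataset stores them; the function name is hypothetical.

```python
import re
from typing import Tuple

# Lazy match on the question so the split happens at the first ", A:" separator.
QA_PATTERN = re.compile(r"Q:\s*(?P<question>.+?),\s*A:\s*(?P<answer>.+)$")


def parse_qa(line: str) -> Tuple[str, str]:
    """Split a 'Q: ..., A: ...' line into (question, answer)."""
    match = QA_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a QA line: {line!r}")
    return match.group("question"), match.group("answer").strip()


lines = [
    "Q: How many traffic cones are there?, A: 6",
    "Q: Are there any parked things?, A: yes",
    "Q: The thing that is to the front of the standing pedestrian "
    "and the back right of me is what?, A: traffic cone",
]
pairs = [parse_qa(line) for line in lines]
for question, answer in pairs:
    print(f"{answer!r} <- {question}")
```

The three answers illustrate the range of short-answer types in the examples: a count, a yes/no, and an object category.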
DRAMA (Driving Risk Assessment Mechanism with A captioning module)
In WACV 2023 | paper | link
Annotation Schema
MAPLM
link
nuScenes-QA
In AAAI 2024 | paper | link
DriveLM-nuScenes
link
nuScenes-MQA
In WACV 2024 | paper | link