How to convert data format

qjl1244281167 commented 6 months ago

I want to use this dataset for object detection, so I need the bounding box of each instance (such as vehicles and pedestrians). How do I convert the annotations? I would like VOC or COCO format. If a YOLO format is available, that would be even better.

XuShenLZ commented 6 months ago

Hi, I think you might want to have the raw video & annotation format of the data. Have you requested it? https://sites.google.com/berkeley.edu/dlp-dataset

qjl1244281167 commented 6 months ago

> Hi, I think you might want to have the raw video & annotation format of the data. Have you requested it? https://sites.google.com/berkeley.edu/dlp-dataset

Yes, I have requested the dataset and received a download link, but I don't quite understand the contents of the annotation files and I don't know how to use them.

XuShenLZ commented 6 months ago

Sorry, we don't provide any code to process the raw video & annotation format of the data. However, I believe that the XML file should be easy to interpret. For example, the following rows describe two vehicles at frame 0.

<tracking_log>
    <frame id="0" timestamp="0.000000">
        <trajectory id="1" type="Car" width="2.1613" length="4.5655" utm_x="746952.76" utm_y="3856823.94" utm_angle="4.7437" speed="0.00" lateral_acceleration="0.0000" tangential_acceleration="0.0000" total_acceleration="0.0000" front_left_x="3321.38" front_left_y="1725.74" front_right_x="3397.30" front_right_y="1721.60" rear_left_x="3326.60" rear_left_y="1886.09" rear_right_x="3402.52" rear_right_y="1881.85" front_left_x_undistorted="3317.51" front_left_y_undistorted="1728.82" front_right_x_undistorted="3393.80" front_right_y_undistorted="1725.24" rear_left_x_undistorted="3324.19" rear_left_y_undistorted="1890.92" rear_right_x_undistorted="3400.53" rear_right_y_undistorted="1887.28"/>
        <trajectory id="2" type="Car" width="2.1341" length="4.6388" utm_x="746958.03" utm_y="3856823.98" utm_angle="4.7354" speed="0.00" lateral_acceleration="0.0000" tangential_acceleration="0.0000" total_acceleration="0.0000" front_left_x="3136.52" front_left_y="1729.51" front_right_x="3211.80" front_right_y="1726.09" rear_left_x="3140.47" rear_left_y="1892.73" rear_right_x="3215.74" rear_right_y="1889.21" front_left_x_undistorted="3131.97" front_left_y_undistorted="1731.24" front_right_x_undistorted="3207.45" front_right_y_undistorted="1728.33" rear_left_x_undistorted="3137.27" rear_left_y_undistorted="1896.15" rear_right_x_undistorted="3212.80" rear_right_y_undistorted="1893.18"/>
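
For detection labels, one possible approach is to take the four corner attributes of each `trajectory` and enclose them in an axis-aligned box. The snippet below is only a rough sketch, not code shipped with the dataset. It assumes the corner attributes are pixel coordinates in the video frame. `CLASS_IDS`, `IMG_W` and `IMG_H` are hypothetical placeholders: the real class list and frame size have to come from your copy of the data. `SAMPLE` is a shortened copy of the first row above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the first trajectory row from the thread (corner attributes only).
SAMPLE = """<tracking_log>
  <frame id="0" timestamp="0.000000">
    <trajectory id="1" type="Car" front_left_x="3321.38" front_left_y="1725.74" front_right_x="3397.30" front_right_y="1721.60" rear_left_x="3326.60" rear_left_y="1886.09" rear_right_x="3402.52" rear_right_y="1881.85"/>
  </frame>
</tracking_log>"""

CORNERS = ("front_left", "front_right", "rear_left", "rear_right")

# Hypothetical values for illustration only; replace with your own.
CLASS_IDS = {"Car": 0}
IMG_W, IMG_H = 4096, 2160


def trajectory_to_bbox(traj, suffix=""):
    """Return (xmin, ymin, xmax, ymax) enclosing the four corners.

    Pass suffix="_undistorted" to use the undistorted corner attributes.
    """
    xs = [float(traj.get(f"{c}_x{suffix}")) for c in CORNERS]
    ys = [float(traj.get(f"{c}_y{suffix}")) for c in CORNERS]
    return min(xs), min(ys), max(xs), max(ys)


def bbox_to_yolo(bbox, img_w, img_h):
    """Convert a pixel box to YOLO (cx, cy, w, h), normalized to [0, 1]."""
    xmin, ymin, xmax, ymax = bbox
    return (
        (xmin + xmax) / 2 / img_w,
        (ymin + ymax) / 2 / img_h,
        (xmax - xmin) / img_w,
        (ymax - ymin) / img_h,
    )


def frame_to_yolo_lines(frame, class_ids, img_w, img_h):
    """Build one YOLO label line per trajectory whose type is in class_ids."""
    lines = []
    for traj in frame.iter("trajectory"):
        cls = class_ids.get(traj.get("type"))
        if cls is None:
            continue  # skip types that are not mapped to a class id
        cx, cy, w, h = bbox_to_yolo(trajectory_to_bbox(traj), img_w, img_h)
        lines.append(f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines


root = ET.fromstring(SAMPLE)
for frame in root.iter("frame"):
    for line in frame_to_yolo_lines(frame, CLASS_IDS, IMG_W, IMG_H):
        print(frame.get("id"), line)
```

The same `(xmin, ymin, xmax, ymax)` tuple maps directly onto a VOC box, and COCO uses `[xmin, ymin, width, height]`, so those formats would only need a different writer. For a full annotation file, `ET.parse(path).getroot()` can replace `ET.fromstring(SAMPLE)`.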

There is also a transformation matrix at the end of each frame to transform between image and UTM coordinates.

        <raw_to_map_transform image_id="0" cols="3" rows="3">
            <row_0 val_0="-3.4521392297664617" val_1="3.379297215674177" val_2="747677.70416361105"/>
            <row_1 val_0="-17.67920643202228" val_1="17.475174131973677" val_2="3860030.1877205535"/>
            <row_2 val_0="-4.5839916216011457e-06" val_1="4.5237260954759028e-06" val_2="1.000844809535552"/>
        </raw_to_map_transform>
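
A 3x3 matrix like this is usually applied as a homography. The sketch below assumes that convention: the pixel is written as the homogeneous vector (x, y, 1), multiplied by the matrix, and the result is divided by its third component. The name `raw_to_map` suggests the direction is image to UTM, but the thread does not say whether the raw or the undistorted pixel coordinates are the intended input, so treat that as something to verify against the `utm_x` / `utm_y` attributes. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Copy of the transform element quoted in the thread.
TRANSFORM_XML = """<raw_to_map_transform image_id="0" cols="3" rows="3">
    <row_0 val_0="-3.4521392297664617" val_1="3.379297215674177" val_2="747677.70416361105"/>
    <row_1 val_0="-17.67920643202228" val_1="17.475174131973677" val_2="3860030.1877205535"/>
    <row_2 val_0="-4.5839916216011457e-06" val_1="4.5237260954759028e-06" val_2="1.000844809535552"/>
</raw_to_map_transform>"""


def read_transform(elem):
    """Read a <raw_to_map_transform> element into a nested list of floats."""
    rows = int(elem.get("rows"))
    cols = int(elem.get("cols"))
    return [
        [float(elem.find(f"row_{r}").get(f"val_{c}")) for c in range(cols)]
        for r in range(rows)
    ]


def apply_homography(H, x, y):
    """Map (x, y) through the 3x3 matrix H using homogeneous coordinates."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


H = read_transform(ET.fromstring(TRANSFORM_XML))
# Front-left corner of trajectory 1, assumed here to be a valid input pixel.
print(apply_homography(H, 3321.38, 1725.74))
```

Going the other way (UTM to image) would need the inverse matrix, for example via `numpy.linalg.inv`. For plain object detection labels the pixel corners are already enough, so this step is only needed when working in map coordinates.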

MPC-Berkeley / dlp-dataset
