What should I do if I want to train this model on my own dataset?

KyleYueye commented 2 years ago

I notice that info files (.pkl) are generated for each dataset. Are there any rules for generating these info files? I want to train this model on my own dataset. Thanks a lot.

jskhu commented 2 years ago

It depends on the dataset, but typically the pickle files contain labels, calibration, and basic metadata. For example, KITTI's train pickle file contains the following information for each training sample:

>>> import pickle
>>> import pprint
>>> pp = pprint.PrettyPrinter()
>>> with open('kitti_infos_train.pkl', 'rb') as f:
...     data = pickle.load(f)
...
>>> pp.pprint(data[0])
{'annos': {'alpha': array([-0.2]),
           'bbox': array([[712.4 , 143.  , 810.73, 307.92]], dtype=float32),
           'difficulty': array([0], dtype=int32),
           'dimensions': array([[1.2 , 1.89, 0.48]]),
           'gt_boxes_lidar': array([[ 8.73138046, -1.85591757, -0.65469939,  1.2       ,  0.48      ,
         1.89      , -1.58079633]]),
           'index': array([0], dtype=int32),
           'location': array([[1.84, 1.47, 8.41]], dtype=float32),
           'name': array(['Pedestrian'], dtype='<U10'),
           'num_points_in_gt': array([377], dtype=int32),
           'occluded': array([0.]),
           'rotation_y': array([0.01]),
           'score': array([-1.]),
           'truncated': array([0.])},
 'calib': {'P2': array([[ 7.07049316e+02,  0.00000000e+00,  6.04081421e+02,
         4.57583084e+01],
       [ 0.00000000e+00,  7.07049316e+02,  1.80506607e+02,
        -3.45415711e-01],
       [ 0.00000000e+00,  0.00000000e+00,  1.00000000e+00,
         4.98101581e-03],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         1.00000000e+00]]),
           'R0_rect': array([[ 0.9999128 ,  0.01009263, -0.00851193,  0.        ],
       [-0.01012729,  0.9999406 , -0.00403767,  0.        ],
       [ 0.00847067,  0.00412352,  0.9999556 ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  1.        ]],
      dtype=float32),
           'Tr_velo_to_cam': array([[ 0.00692796, -0.99997222, -0.00275783, -0.02457729],
       [-0.00116298,  0.00274984, -0.99999553, -0.06127237],
       [ 0.99997532,  0.00693114, -0.0011439 , -0.33210289],
       [ 0.        ,  0.        ,  0.        ,  1.        ]])},
 'image': {'image_idx': '000000',
           'image_shape': array([ 370, 1224], dtype=int32)},
 'point_cloud': {'lidar_idx': '000000', 'num_features': 4}}
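Most of the `annos` fields in the dump above mirror the columns of a KITTI object label file (type, truncated, occluded, alpha, 2D bbox, dimensions, location, rotation_y, optional score). The following is a minimal sketch of that mapping, not the code OpenPCDet actually uses. The function name `parse_kitti_label_line` is hypothetical, the label line is an assumed input chosen to be consistent with the dump, and the reordering of dimensions to (length, height, width) is inferred from the values shown above.

```python
def parse_kitti_label_line(line):
    """Parse one line of a KITTI object label file into a plain dict.

    KITTI column order: type, truncated, occluded, alpha,
    bbox (left, top, right, bottom), dimensions (height, width, length),
    location (x, y, z in camera coordinates), rotation_y, [score].
    """
    fields = line.split()
    if len(fields) < 15:
        raise ValueError(f"expected at least 15 columns, got {len(fields)}")
    height, width, length = (float(v) for v in fields[8:11])
    return {
        "name": fields[0],
        "truncated": float(fields[1]),
        "occluded": float(fields[2]),
        "alpha": float(fields[3]),
        "bbox": [float(v) for v in fields[4:8]],
        # Assumption: stored as (length, height, width), as the dump suggests.
        "dimensions": [length, height, width],
        "location": [float(v) for v in fields[11:14]],
        "rotation_y": float(fields[14]),
        # Ground-truth labels carry no score; the dump shows -1 in that case.
        "score": float(fields[15]) if len(fields) > 15 else -1.0,
    }


# Assumed label line, consistent with the values printed in the dump.
line = ("Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 "
        "1.89 0.48 1.20 1.84 1.47 8.41 0.01")
anno = parse_kitti_label_line(line)
print(anno["name"], anno["dimensions"], anno["location"])
```

The real info files store these values as NumPy arrays, one row per object, and add derived fields such as `gt_boxes_lidar` and `num_points_in_gt` that need the calibration and the point cloud to compute.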

Unless your data is very different from KITTI, I suggest converting it to a KITTI-like format so that you can reuse OpenPCDet's dataset framework. An example Waymo-to-KITTI converter is here: https://github.com/caizhongang/waymo_kitti_converter. You can probably do something similar with your dataset.
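As a rough illustration of what such a conversion involves on the label side, here is a minimal sketch that writes one annotation as a KITTI-style label line. The helper `to_kitti_label_line` and its parameters are hypothetical and not part of OpenPCDet or the linked converter. It assumes your boxes are already expressed in the KITTI camera frame, with dimensions given as (height, width, length).

```python
def to_kitti_label_line(name, bbox, dims_hwl, location, rotation_y,
                        truncated=0.0, occluded=0, alpha=0.0):
    """Format one object annotation as a KITTI object label line.

    bbox:      (left, top, right, bottom) in image pixels
    dims_hwl:  (height, width, length) in metres
    location:  (x, y, z) in camera coordinates, metres
    """
    if len(bbox) != 4 or len(dims_hwl) != 3 or len(location) != 3:
        raise ValueError("bbox needs 4 values; dims_hwl and location need 3")
    numbers = [alpha, *bbox, *dims_hwl, *location, rotation_y]
    columns = [name, f"{truncated:.2f}", str(int(occluded))]
    columns += [f"{value:.2f}" for value in numbers]
    return " ".join(columns)


# Example values taken from the first KITTI training sample shown earlier.
label_line = to_kitti_label_line(
    name="Pedestrian",
    bbox=(712.40, 143.00, 810.73, 307.92),
    dims_hwl=(1.89, 0.48, 1.20),
    location=(1.84, 1.47, 8.41),
    rotation_y=0.01,
    alpha=-0.20,
)
print(label_line)
```

A full conversion would also need to export the point clouds and a calibration file per frame in the layout the KITTI loader expects. Those steps depend on your sensor setup and are not covered by this sketch.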

KyleYueye commented 2 years ago

Thank you
