nutonomy / nuscenes-devkit

The devkit of the nuScenes dataset.
https://www.nuScenes.org

How to use evaluate.py for object detection evaluation? #1014

Closed - SM1991CODES closed this 6 months ago

SM1991CODES commented 7 months ago

Hi, I am trying to use the dataset for BEV object detection and have the following questions.

1) I see there is a DetectionEval class in the evaluate.py file - is there some example of how to format predictions into the required JSON for evaluation?

2) How does orientation affect the evaluation score, i.e., if the true orientation is 0 degrees and my CNN predicts 180 degrees, will the IoU and the final evaluation be affected?

3) How can I evaluate just a single class or a couple of specific classes? For example, how do I evaluate car detection only?

Best Regards

whyekit-motional commented 7 months ago

@SM1991CODES regarding your various queries:

I see there is a DetectionEval class in the evaluate.py file - is there some example of how to format predictions into the required JSON for evaluation?

We let users decide how they want to convert their predictions into the required results format, since different models are likely to have different formats for their predictions.
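For illustration, here is a minimal sketch of what such a conversion could look like, assuming the predictions are already devkit `Box` objects expressed in the global frame with `name` and `score` set (`my_predictions` and the output path are placeholders):

```python
import json

# Hypothetical container: sample_token -> list of predicted Box objects (global frame).
submission = {
    "meta": {
        "use_camera": False,
        "use_lidar": True,
        "use_radar": False,
        "use_map": False,
        "use_external": False,
    },
    "results": {},
}

for sample_token, boxes in my_predictions.items():  # my_predictions is a placeholder
    submission["results"][sample_token] = [
        {
            "sample_token": sample_token,
            "translation": box.center.tolist(),             # x, y, z in the global frame
            "size": box.wlh.tolist(),                       # width, length, height
            "rotation": box.orientation.elements.tolist(),  # quaternion w, x, y, z
            "velocity": [0.0, 0.0],                         # vx, vy
            "detection_name": box.name,                     # e.g. "car"
            "detection_score": float(box.score),
            "attribute_name": "vehicle.parked",             # a valid attribute (or "")
        }
        for box in boxes
    ]

with open("results_nusc.json", "w") as f:
    json.dump(submission, f)
```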

How does orientation affect the evaluation score, i.e., if the true orientation is 0 degrees and my CNN predicts 180 degrees, will the IoU and the final evaluation be affected?

Orientation error is evaluated over 360 degrees for all classes except barriers, where it is only evaluated over 180 degrees (pls refer to here)
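To make the 360-vs-180-degree distinction concrete, here is a small illustrative snippet (not the devkit's own code) showing how a yaw error wrapped to a given period behaves:

```python
import numpy as np

def orientation_error(gt_yaw: float, pred_yaw: float, period: float = 2 * np.pi) -> float:
    """Smallest absolute yaw difference, wrapped to the given period."""
    diff = (gt_yaw - pred_yaw) % period
    return min(diff, period - diff)

# Most classes are evaluated over a full 360-degree period, so a 180-degree flip is penalized:
print(np.degrees(orientation_error(0.0, np.pi)))          # -> 180.0
# Barriers use a 180-degree period, so the same flip incurs no orientation error:
print(np.degrees(orientation_error(0.0, np.pi, np.pi)))   # -> 0.0
```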

How can I evaluate just a single class or a couple of specific classes? For example, how do I evaluate car detection only?

The evaluation results include the per-class performance, so you could check the performance on cars from there.
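For example, after running the evaluation, the per-class numbers can be read from the returned summary. A sketch along these lines (paths and version are assumptions, and the exact keys of the summary dict may differ slightly between devkit versions):

```python
from nuscenes import NuScenes
from nuscenes.eval.common.config import config_factory
from nuscenes.eval.detection.evaluate import DetectionEval

nusc = NuScenes(version="v1.0-trainval", dataroot="/data/sets/nuscenes", verbose=True)
nusc_eval = DetectionEval(nusc,
                          config=config_factory("detection_cvpr_2019"),
                          result_path="results_nusc.json",  # your submission file
                          eval_set="val",
                          output_dir="./detection_eval",
                          verbose=True)
metrics_summary = nusc_eval.main(plot_examples=0, render_curves=False)

# Per-class AP averaged over the distance thresholds, e.g. for cars:
print("car AP:", metrics_summary["mean_dist_aps"]["car"])
```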

SM1991CODES commented 7 months ago

Thank you for the quick response. Everything is clear except the orientation point.

I am not sure I understood the 360-degree point. What I meant was: every orientation between 0 and 360 degrees can be mapped to 0-180 (or -90 to 90) degrees while maintaining the same box appearance and the same coverage of object points. This is not a problem in KITTI, since its BEV evaluation just measures box overlap and not the actual orientation angle. Is it similar in nuScenes? Or will the wrong angle caused by this mapping operation lead to a bad evaluation score?

The mapping step helps reduce the decision space for orientation prediction.

Best Regards Sambit

whyekit-motional commented 7 months ago

@SM1991CODES the detection evaluation in nuScenes also considers the actual orientation by measuring AOE (Average Orientation Error), as defined in Section 3.1 of https://arxiv.org/pdf/1903.11027.pdf

SM1991CODES commented 7 months ago

Okay, that makes it clear. Thank you. Some more questions:

1) I transform point clouds and boxes to ego frame for training and would also like to evaluate in ego frame. Is this possible? I think by default boxes are defined in LiDAR frame (for 3D object detection). Do I have to transform the detections back to lidar frame before creating the JSON?

2) Also, is it possible to train and evaluate on a specific FOV? Like in KITTI, can we evaluate only the front of the ego vehicle?

whyekit-motional commented 7 months ago

@SM1991CODES here are my replies to your queries:

I transform point clouds and boxes to ego frame for training and would also like to evaluate in ego frame. Is this possible? I think by default boxes are defined in LiDAR frame (for 3D object detection). Do I have to transform the detections back to lidar frame before creating the JSON?

Pls take a look at the results format - it requires the predictions to be in the global frame
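In other words, since you train in the ego frame, you would transform each predicted box from the ego frame to the global frame before writing the JSON. A minimal sketch, assuming the predictions are devkit `Box` objects in the full (non-flat) ego frame:

```python
import numpy as np
from pyquaternion import Quaternion
from nuscenes.utils.data_classes import Box

def ego_box_to_global(nusc, box: Box, sample_data_token: str) -> Box:
    """Return a copy of a box, moved from the ego frame of the given sample_data
    to the global frame."""
    sd_rec = nusc.get("sample_data", sample_data_token)
    ego_pose = nusc.get("ego_pose", sd_rec["ego_pose_token"])

    box_global = box.copy()
    box_global.rotate(Quaternion(ego_pose["rotation"]))
    box_global.translate(np.array(ego_pose["translation"]))
    return box_global
```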

Also, is it possible to train and evaluate on a specific FOV? Like in KITTI, can we evaluate only the front of the ego vehicle?

Pls see https://github.com/nutonomy/nuscenes-devkit/issues/724

SM1991CODES commented 7 months ago

Thanks a lot for all the info. I think I can work my way through from here.

Best Regards Sambit

SM1991CODES commented 7 months ago

One final question - I need to transform my detections from the ego frame to the global frame. While I know I can do this using the ego poses, is there a function in the devkit that already does this? It's best not to rewrite already-present stuff.

SM1991CODES commented 7 months ago

Here is a small function I made to do this flat-vehicle-to-global transform:

```python
import numpy as np
import pyquaternion as pyquat


def test_transforms_ego_to_global(nusc_obj, sample_ann_tokens, sample_data_token, boxes_ego):
    """Transform boxes from the flat vehicle frame to the global frame and compare
    them against the original annotations.

    Args:
        nusc_obj: NuScenes instance.
        sample_ann_tokens: Annotation tokens of the sample.
        sample_data_token: Sample data token used to look up the ego pose.
        boxes_ego: Boxes in the flat vehicle frame, in the same order as sample_ann_tokens.
    """
    ego_pose = nusc_obj.get("ego_pose", sample_data_token)
    for index, ann_token in enumerate(sample_ann_tokens):
        box_global = nusc_obj.get("sample_annotation", token=ann_token)
        box_flat = boxes_ego[index]
        box_flat_cpy = box_flat.copy()

        if "vehicle.car" not in box_global["category_name"]:
            continue

        if box_flat.token == box_global["token"]:
            rot_quat = pyquat.Quaternion(ego_pose["rotation"])
            t = np.array(ego_pose["translation"])

            # Convert flat vehicle coordinates to the global frame.
            box_flat_cpy.rotate(rot_quat)
            box_flat_cpy.translate(t)

            print("===== Translation =====")
            print("box_flat_to_global -> {0}, box_global -> {1}".format(
                box_flat_cpy.center, box_global["translation"]))
            print("===== Orientation (yaw) =====")
            print("box_flat_to_global -> {0}, box_global -> {1}".format(
                box_flat_cpy.orientation.yaw_pitch_roll[0],
                pyquat.Quaternion(box_global["rotation"]).yaw_pitch_roll[0]))
```

However, the z-translation values do not match when comparing:

[screenshot of the comparison output]

What am I missing here?

whyekit-motional commented 7 months ago

@SM1991CODES you can see here for an example of performing transforms among various coordinate frames: https://github.com/nutonomy/nuscenes-devkit/blob/9b165b1018a64623b65c17b64f3c9dd746040f36/python-sdk/nuscenes/nuscenes.py#L881-L900
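The linked code moves annotation boxes from the global frame into the sensor frame; for detections in the LiDAR frame the same steps can be applied in reverse order. A sketch of that inversion (sensor -> ego -> global):

```python
import numpy as np
from pyquaternion import Quaternion
from nuscenes.utils.data_classes import Box

def lidar_box_to_global(nusc, box: Box, lidar_sample_data_token: str) -> Box:
    """Move a box from the LIDAR_TOP sensor frame to the global frame."""
    sd_rec = nusc.get("sample_data", lidar_sample_data_token)
    cs_rec = nusc.get("calibrated_sensor", sd_rec["calibrated_sensor_token"])
    pose_rec = nusc.get("ego_pose", sd_rec["ego_pose_token"])

    box = box.copy()
    # Sensor frame -> ego frame.
    box.rotate(Quaternion(cs_rec["rotation"]))
    box.translate(np.array(cs_rec["translation"]))
    # Ego frame -> global frame.
    box.rotate(Quaternion(pose_rec["rotation"]))
    box.translate(np.array(pose_rec["translation"]))
    return box
```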

SM1991CODES commented 7 months ago

Okay, I will try to follow along. Is there a difference between the "ego" frame and "flat_vehicle_coordinates"? As seen in the attached output log, only the z-value of the box transformed from flat_vehicle to global seems to be lower than that of the original annotation in global coordinates.

whyekit-motional commented 7 months ago

@SM1991CODES flat_vehicle_coordinates refers to the ego frame with the z-coordinate projected onto the ground plane (i.e. z=0)
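For reference, this is roughly how the flat frame's rotation is built in the devkit's rendering code: only the yaw of the ego pose is kept, so roll/pitch and the height offset are dropped (`nusc` and `sample_data_token` are assumed to be available):

```python
import numpy as np
from pyquaternion import Quaternion

sd_rec = nusc.get("sample_data", sample_data_token)
ego_pose = nusc.get("ego_pose", sd_rec["ego_pose_token"])

# Full ego frame: complete rotation (roll, pitch, yaw) and full 3D translation.
full_rotation = Quaternion(ego_pose["rotation"])

# Flat vehicle frame: only the yaw is kept, so the frame stays parallel to the ground plane.
yaw = full_rotation.yaw_pitch_roll[0]
flat_rotation = Quaternion(scalar=np.cos(yaw / 2), vector=[0, 0, np.sin(yaw / 2)])
```

Because the flat frame uses the yaw-only rotation, applying the full ego pose to boxes built in that frame will likely not reproduce the original global coordinates exactly, which would explain the z mismatch.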

SM1991CODES commented 7 months ago

This would explain why my transformation from the flat_vehicle frame to the global frame does not match the original annotation in the z-coordinate - what do you think?

My reprojected z is less than the original annotation's z because it is no longer relative to the ego rear axle, but to the ground. I will still try out the code snippet you pointed out, though I suspect it will face the same problem.

SM1991CODES commented 7 months ago

I ran into another issue while generating training data - I want to use only 5 out of the 10 trainval data blobs.

But I run into a file-not-found issue, since I did not download and extract the whole dataset. Is there a way around this?

[screenshot of the file-not-found error]

SM1991CODES commented 7 months ago

Basically, I just want to use trainval_01-04 for training and validation, using their respective train and val split scenes. How can I do this?

SM1991CODES commented 7 months ago

Okay, I think I figured this one out. I can modify the train, train_detect and val lists to include only scenes from the parts I am interested in.
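A sketch of that idea, restricting the official splits to scenes whose lidar files are actually present on disk (paths and version are assumptions):

```python
import os

from nuscenes import NuScenes
from nuscenes.utils import splits

# The metadata tables load even when only some blobs were extracted.
nusc = NuScenes(version="v1.0-trainval", dataroot="/data/sets/nuscenes", verbose=False)

available_scenes = set()
for scene in nusc.scene:
    sample = nusc.get("sample", scene["first_sample_token"])
    sd_rec = nusc.get("sample_data", sample["data"]["LIDAR_TOP"])
    # Keep the scene only if its lidar file was actually extracted to disk.
    if os.path.exists(os.path.join(nusc.dataroot, sd_rec["filename"])):
        available_scenes.add(scene["name"])

my_train = [s for s in splits.train if s in available_scenes]
my_val = [s for s in splits.val if s in available_scenes]
```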

whyekit-motional commented 7 months ago

This would explain why my transformation from the flat_vehicle frame to the global frame does not match the original annotation in the z-coordinate - what do you think?

Yes, you cannot simply go from the flat_vehicle ego frame to the global frame - I would suggest using the actual ego frame

SM1991CODES commented 7 months ago

I have one more problem - how do I download the data to a remote SSH server using wget or similar? I found this: https://github.com/nutonomy/nuscenes-devkit/issues/110

But it does not work for me. I tried: `wget -o part1.tgz ""`

It just saves a small file with that name, but the actual download does not start.

whyekit-motional commented 7 months ago

@SM1991CODES yeah, unfortunately that method no longer works - we are in the process of finding another way to allow users to download the data via the command line.