JMoonr / LATR

[ICCV2023 Oral] LATR: 3D Lane Detection from Monocular Images with Transformer
https://arxiv.org/abs/2308.04583
MIT License

NuScenes LATR implementation #20

Open samueleruffino99 opened 6 months ago

samueleruffino99 commented 6 months ago

I would like to run your model on nuScenes. Have you already implemented something for this? Do you think it would be possible to run it on nuScenes, especially given the missing annotations there? For example, I have seen that you use some extra info at inference time:

output = self.head(
            dict(
                x=neck_out,  # feature maps from the neck
                lane_idx=extra_dict['seg_idx_label'],  # per-lane index segmentation label
                seg=extra_dict['seg_label'],  # auxiliary segmentation label
                lidar2img=extra_dict['lidar2img'],  # 4x4 3D-to-image projection matrix
                pad_shape=extra_dict['pad_shape'],  # padded input image size
                ground_lanes=extra_dict['ground_lanes'] if is_training else None,  # GT lanes, training only
                ground_lanes_dense=extra_dict['ground_lanes_dense'] if is_training else None,  # densely sampled GT lanes, training only
                image=image,
            ),
            is_training=is_training,
        )

I am not quite sure where to get this extra information from nuScenes; I thought the camera parameters and the image would be enough. Thank you!! :)
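My current attempt is to assemble the lidar2img matrix from the nuScenes calibration records with the nuscenes-devkit. This is only a minimal sketch: lidar2img_from_sample is my own helper name, and it ignores the small timestamp offset between the lidar and camera captures.

import numpy as np
from nuscenes.nuscenes import NuScenes
from nuscenes.utils.geometry_utils import transform_matrix
from pyquaternion import Quaternion

def lidar2img_from_sample(nusc, sample, cam='CAM_FRONT'):
    """Compose a 4x4 lidar-to-image projection matrix for one sample."""
    lidar_sd = nusc.get('sample_data', sample['data']['LIDAR_TOP'])
    cam_sd = nusc.get('sample_data', sample['data'][cam])

    # Sensor-to-ego transforms come from the calibrated_sensor records.
    lidar_cs = nusc.get('calibrated_sensor', lidar_sd['calibrated_sensor_token'])
    cam_cs = nusc.get('calibrated_sensor', cam_sd['calibrated_sensor_token'])
    ego_from_lidar = transform_matrix(lidar_cs['translation'], Quaternion(lidar_cs['rotation']))
    cam_from_ego = transform_matrix(cam_cs['translation'], Quaternion(cam_cs['rotation']), inverse=True)

    # Camera intrinsics, padded to 4x4 so the whole chain stays homogeneous.
    K = np.eye(4)
    K[:3, :3] = np.array(cam_cs['camera_intrinsic'])

    # Approximation: treats lidar and camera as captured simultaneously
    # (ignores ego motion between their timestamps).
    return K @ cam_from_ego @ ego_from_lidar

Usage would be something like nusc = NuScenes(version='v1.0-mini', dataroot='/data/nuscenes') followed by lidar2img_from_sample(nusc, nusc.sample[0]).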

samueleruffino99 commented 6 months ago

I managed to make LATR run on nuScenes, but unfortunately the lane predictions I get (20 values for x, 20 for z and 20 for vis) are wrong: all the vis values are equal to about -1.28999. It is very strange that they are all identical and all < 0. I have checked, and the problem might come from the lidar2img matrix, but I am not quite sure which data the model needs at inference besides the image. Apparently everything with a visibility value < 0 gets masked out. This is a visualization of all the lanes that are generated without masking off the lanes with vis < 0.

[Screenshot 2024-03-16: visualization of predicted lanes without vis < 0 masking]
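For reference, this is roughly the visibility masking I am bypassing here, assuming each predicted lane consists of 20 x-offsets, 20 z values and 20 visibility scores over fixed y anchors (a sketch, not LATR's exact code; names and threshold are assumptions):

import numpy as np

def keep_visible_points(xs, zs, vis, y_anchors, thresh=0.0):
    """Drop lane points whose predicted visibility is below the threshold."""
    mask = vis > thresh  # with vis fixed at about -1.29, this kills every point
    return np.stack([xs[mask], y_anchors[mask], zs[mask]], axis=-1)

With every vis value below zero, this mask is empty, which matches the "everything filtered out" symptom above.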

JMoonr commented 5 months ago

Hi, @samueleruffino99. As far as I know, nuScenes has no 3D lane annotations like the OpenLane dataset. If you want to train the model on the nuScenes dataset, labels are required. As for your visualization: first, the invisible parts of the lanes should be filtered out, in line with the training setting. Second, it sounds like you are using the OpenLane data config (including the camera extrinsics and intrinsics) and model to evaluate on nuScenes. This can cause trouble, since the two datasets are based on different sensor settings.

samueleruffino99 commented 5 months ago

Hi, actually I would just like to run inference on nuScenes. But apparently, when I generate the 2D canvas from the 3D plane, I am not getting any points inside the resized 2D image, and the vis outputs are all fixed to a value that is < 0 (and so masked off). If I change the model so that it does not mask vis values < 0, the screenshot above is the output I get from the predicted lanes (I think there are always 40 predictions). I think I am converting the projection matrices correctly, because at least I can project the 3D lanes onto the image, as I showed you.
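One way I am trying to debug this is a standalone projection check: take a few 3D points that should clearly be visible (e.g. on the ground in front of the ego vehicle), push them through the lidar2img matrix, and verify they land inside the padded input. A sketch, where project_points is a hypothetical helper of mine:

import numpy as np

def project_points(points_3d, lidar2img, pad_shape):
    """points_3d: (N, 3); lidar2img: (4, 4); pad_shape: (H, W)."""
    pts = np.concatenate([points_3d, np.ones((len(points_3d), 1))], axis=1)
    uvw = pts @ lidar2img.T  # homogeneous image coordinates
    uv = uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-6, None)  # perspective divide
    h, w = pad_shape
    inside = (uvw[:, 2] > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < w) \
             & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv, inside

One thing I have to double-check: if the pipeline resizes or crops the image before feeding it to the network, the intrinsics part of lidar2img must be scaled and shifted by the same transform; forgetting this is a common reason why every projected point falls outside the canvas.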

samueleruffino99 commented 5 months ago

Do you think it might work to simply take your model (trained on OpenLane) and test it on nuScenes? I would really appreciate it if you have time to answer :)

Hexisteven commented 3 months ago

Hi, @samueleruffino99, excuse me, could you please share some visualization methods related to this work? I can't get the input parameters from the output of LATR to pass to the 'save_result_new' function in the 'Visualizer' class of PersFormer. Thanks a lot!!!
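In the meantime I am falling back on a quick bird's-eye-view plot with matplotlib instead of PersFormer's Visualizer. A sketch, assuming the decoded predictions are a list of (N, 3) arrays of (x, y, z) lane points in ego coordinates (that format is my assumption):

import matplotlib.pyplot as plt
import numpy as np

def plot_lanes_bev(lanes, save_path='lanes_bev.png'):
    """lanes: list of (N, 3) arrays of (x, y, z) lane points."""
    fig, ax = plt.subplots(figsize=(6, 6))
    for lane in lanes:
        ax.plot(lane[:, 0], lane[:, 1], marker='.')  # BEV: plot x against y
    ax.set_xlabel('x (m)')
    ax.set_ylabel('y (m)')
    ax.set_title("Predicted 3D lanes (bird's-eye view)")
    fig.savefig(save_path)
    plt.close(fig)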