OpenDriveLab / OpenLane

[ECCV 2022 Oral] OpenLane: Large-scale Realistic 3D Lane Dataset
Apache License 2.0
501 stars · 47 forks

A question about labeling #19

Closed asadnorouzi closed 2 years ago

asadnorouzi commented 2 years ago

Thank you for releasing this excellent work by your team!

Could you provide more information about your labeling pipeline for 3D lanes? Are they labeled by human labelers or by PersFormer? If they are labeled by PersFormer, how did you incorporate the LiDAR data?

ilnehc commented 2 years ago

@asadnorouzi Thank you for your interest. The overall procedure is that human labelers annotate 2D lane lines on images and we extract LiDAR points based on them. Additional interpolation, filtering and smoothing steps are executed to generate the final 3D annotation.

PersFormer is a method for detecting 3D lane lines, trained on the annotated OpenLane dataset. However, we are actually exploring using it as an automatic labeling tool to reconstruct the environment.
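If it helps to picture the first step of that pipeline, here is a rough sketch of projecting LiDAR into the image and keeping points near the 2D lane annotation. All function and parameter names here are hypothetical; this is not the dataset's actual tooling:

```python
import numpy as np

def lidar_points_for_2d_lane(points_xyz, lane_uv, K, T_cam_from_lidar, pix_tol=4.0):
    """Pick LiDAR points whose image projection falls near a 2D lane polyline.

    points_xyz: (N, 3) LiDAR points; lane_uv: (M, 2) annotated 2D lane pixels;
    K: (3, 3) camera intrinsics; T_cam_from_lidar: (4, 4) extrinsics.
    Illustrative only -- not the OpenLane labeling code.
    """
    # Transform LiDAR points into the camera frame.
    pts_h = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
    cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]
    in_front = cam[:, 2] > 0.1          # keep only points in front of the camera
    cam = cam[in_front]
    # Project to pixel coordinates.
    uvw = (K @ cam.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    # Keep points projecting within pix_tol pixels of any annotated lane vertex.
    d = np.linalg.norm(uv[:, None, :] - lane_uv[None, :, :], axis=-1)
    keep = d.min(axis=1) < pix_tol
    return points_xyz[in_front][keep]
```

The interpolation, filtering, and smoothing steps mentioned above would then run on the 3D points this returns.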

asadnorouzi commented 2 years ago

@ilnehc Thank you for clarifying the labeling process. One more question: how do you think your framework would perform if it were provided with LiDAR points (camera + LiDAR fusion)?

ilnehc commented 2 years ago

@asadnorouzi In my view, fusion is a promising way to improve object detection, but it probably cannot help much in the laneline detection task. There are a few reasons for this conclusion: 1) laneline detection often requires long range, beyond the typical LiDAR range; 2) lanelines are slim, so the LiDAR points on a line in a single frame are quite sparse. That said, LiDAR points can provide more 3D ground information if you want to do 3D laneline detection. Overall, it is an open question that has not been fully explored yet.

asadnorouzi commented 2 years ago

@ilnehc Yes, I agree with you. Maybe I should have mentioned this earlier: the reason I asked about LiDAR is that I am interested in 3D lane detection, and I am in the process of selecting 2-3 frameworks to build my baselines on. Of the frameworks I have studied so far, your work's results are the most promising ;) For the initial stage I would like to use LiDAR data too, but I am a bit skeptical about whether your framework is suitable for incorporating (fusing) LiDAR data, so I thought it best to ask the experts :) What is your opinion? Do you see your framework being capable of accepting and benefiting from LiDAR data?

ilnehc commented 2 years ago

@asadnorouzi I think our BEV feature design and transformer-based module are well suited to fusion with LiDAR, based on recent popular BEV fusion works such as BEVFusion (MIT), BEVFusion (PKU/Alibaba), and DeepFusion. I hope this helps.
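As a very rough illustration of the kind of fusion those works do, one could concatenate camera and LiDAR features on a shared BEV grid and fuse them with a small conv block. This is a sketch only, not part of PersFormer, and all names are made up:

```python
import torch
import torch.nn as nn

class SimpleBEVFusion(nn.Module):
    """Minimal channel-wise BEV fusion sketch (in the spirit of BEVFusion).

    Assumes both modalities have already been lifted to the same BEV grid
    of shape (B, C, H, W). Illustrative only, not the PersFormer codebase.
    """
    def __init__(self, cam_ch=64, lidar_ch=64, out_ch=128):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(cam_ch + lidar_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, cam_bev, lidar_bev):
        # Concatenate along channels and fuse with a small conv block.
        return self.fuse(torch.cat([cam_bev, lidar_bev], dim=1))
```

The laneline head would then consume the fused BEV feature map in place of the camera-only one.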

Mollylulu commented 2 years ago

> @asadnorouzi Thank you for your interest. The overall procedure is that human labelers annotate 2D lane lines on images and we extract LiDAR points based on them. Additional interpolation, filtering and smoothing steps are executed to generate the final 3D annotation.
>
> PersFormer is the method to detect 3D lane lines, trained on annotated OpenLane dataset. However, we are actually exploring using it as an automatic labeling tool to reconstruct the environment.

Hi~ For the labeling process, I have one more question: how do you perform the point interpolation? Is it done directly on the 3D points, or in some other way? If the latter, how are the labels generated? I would be grateful for your help.

ilnehc commented 2 years ago

@Mollylulu Yes, it is done on the 3D points. The main idea is that we interpolate 3D positions for those 2D lane locations that do not have corresponding 3D points.
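A minimal sketch of what per-coordinate interpolation along the lane might look like (illustrative names and parameterization, not the dataset's actual code):

```python
import numpy as np

def interpolate_lane_3d(t_known, xyz_known, t_query):
    """Fill 3D positions for 2D lane samples that lack LiDAR hits.

    t_known: (N,) monotonically increasing lane parameter (e.g. arc length
    or image row) where 3D points exist; xyz_known: (N, 3) the corresponding
    3D points; t_query: (M,) parameters of the samples to fill in.
    Illustrative only -- not the OpenLane labeling code.
    """
    # Interpolate x, y, z independently along the lane parameter.
    return np.stack(
        [np.interp(t_query, t_known, xyz_known[:, k]) for k in range(3)],
        axis=1,
    )
```

In practice the result would still go through the filtering and smoothing steps mentioned earlier in the thread.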

Mollylulu commented 2 years ago

Furthermore, some of the 3D points obtained from the mentioned interpolation may lie on objects (e.g., cars). How do you handle these points? @ilnehc

ilnehc commented 2 years ago

@Mollylulu Note that we filter out 3D points inside object bounding boxes in the first step.
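For intuition, a simplified version of that filtering step might look like the following (axis-aligned boxes for brevity; real object boxes are rotated, and this is not the dataset's actual code):

```python
import numpy as np

def drop_points_in_boxes(points_xyz, boxes):
    """Remove LiDAR points that fall inside any axis-aligned 3D box.

    points_xyz: (N, 3) points; boxes: (B, 6) rows of
    [xmin, ymin, zmin, xmax, ymax, zmax]. Illustrative sketch only.
    """
    keep = np.ones(len(points_xyz), dtype=bool)
    for xmin, ymin, zmin, xmax, ymax, zmax in boxes:
        # A point is inside if all three coordinates fall within the box.
        inside = np.all(
            (points_xyz >= [xmin, ymin, zmin]) & (points_xyz <= [xmax, ymax, zmax]),
            axis=1,
        )
        keep &= ~inside
    return points_xyz[keep]
```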

Mollylulu commented 2 years ago
[two attached images showing lane annotation examples]

Based on this statement, I am not sure:

  1. whether the worn-away (abraded) parts would be labeled and given the "visible" property?
  2. whether some occluded lanes (e.g., the red boxes in the figure above) would be labeled and also given the "visible" property, since they are likely in line with "completed with context"?

ilnehc commented 2 years ago

@Mollylulu The answer to both 1 & 2 is YES. Also, please note that the "invisibility" property is only used for points that are too far away (typically those higher in the image than the 2D annotation).

Mollylulu commented 2 years ago

Noted, thanks! By the way, there is a huge gap between the evaluation results of your model under the previous code version and the current v1.1 (the F1 score increased by almost 30 points on the same checkpoint). Could you check the relevant calculation for anything weird? Thanks again!

ilnehc commented 2 years ago

@Mollylulu The gap comes from the way we deal with invisible gt/pred lanes. The original eval method was inherited from GenLaneNet and does NOT yield 100% F1 when testing GT against GT; more detailed discussion is in https://github.com/OpenPerceptionX/OpenLane/issues/15. We believe that is not reasonable and updated the code accordingly. There still exist some small problems that we are working on, so please follow https://github.com/OpenPerceptionX/OpenLane/issues/20.
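To see why the handling of invisible points can move the score so much, here is a toy per-point F1 (not the official OpenLane metric, and all names are made up) where testing GT against GT yields 1.0 once invisible points are handled consistently:

```python
import numpy as np

def lane_f1(pred, gt, vis_pred, vis_gt, dist_tol=0.5, drop_invisible=True):
    """Toy per-point F1 for two lanes, to illustrate the invisibility issue.

    pred, gt: (P, D) and (G, D) sampled lane points; vis_pred, vis_gt:
    boolean visibility masks. Not the official OpenLane evaluation.
    """
    if drop_invisible:
        # Consistently exclude invisible points from both sides.
        pred, gt = pred[vis_pred], gt[vis_gt]
    if len(pred) == 0 or len(gt) == 0:
        return 0.0
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    tp_p = (d.min(axis=1) < dist_tol).sum()   # predictions matched to some gt point
    tp_g = (d.min(axis=0) < dist_tol).sum()   # gt points matched to some prediction
    precision = tp_p / len(pred)
    recall = tp_g / len(gt)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With consistent handling, `lane_f1(gt, gt, vis, vis)` returns 1.0; a metric that treats invisible points asymmetrically on the gt and pred sides fails this sanity check, which is the behavior the v1.1 update fixed.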

Mollylulu commented 2 years ago

noted :)