Are you asking for lane detection in image space, or just the lanes (top-down view) displayed in the same window?
Yes, I am asking for two things: 1) convert the lane prediction results from the top-down view to video coordinates, and 2) display/overlay the detected lanes in the same video window.
Item 2 is easy; a rough sketch of what I mean by the overlay is below. Not sure how to do item 1. Kaishi
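To show what I mean by "easy": once the lane points are already in image coordinates, the overlay is just drawing polylines on each frame. Here is a minimal OpenCV sketch; `get_lane_points_img` and the video filename are made-up placeholders, not anything from this repo.

```python
import cv2
import numpy as np

def get_lane_points_img(frame):
    # Placeholder: in practice these (x, y) pixel points would come from the
    # model output already converted to image space.
    h, w = frame.shape[:2]
    return [[(w // 3, h), (w // 2, h // 2)], [(2 * w // 3, h), (w // 2, h // 2)]]

cap = cv2.VideoCapture("drive.mp4")  # example input video (placeholder name)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    for lane in get_lane_points_img(frame):
        pts = np.asarray(lane, dtype=np.int32).reshape(-1, 1, 2)
        cv2.polylines(frame, [pts], isClosed=False, color=(0, 255, 0), thickness=2)
    cv2.imshow("lanes", frame)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```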
I guess it isn't that easy to convert the top-down view to image coordinates (or is it possible at all?). The best thing would be to ask this on comma ai's official Discord group.
Is the transform between video space and the top-down view reversible? If not, then it is impossible. If that is the case, it may be better to convert the video into the top-down view and overlay it with the lane prediction drawing in top-down coordinates.
They directly predict the top-down coordinates and separately predict the lane lines in image coordinates for "UI purposes". There is no relation between the two.
I am still not sure, but this is what a comma staff member told me.
EDIT: I was wrong, the output is indeed converted to image space!
I think comma must use a deterministic transform (calibrated for its camera) to convert video space into the bird's-eye-view domain. Otherwise, how could it get the training dataset? The original images in the training dataset must be in video coordinates. I noticed the input images go through a perspective transform before being fed to the CNN. Can we use the inverse of this perspective transform to convert the detected lanes back to video coordinates?
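If the top-down view really were produced by a single homography of the image (which may not be the case here), inverting it would be straightforward. A minimal OpenCV sketch, where the src/dst corner points are invented for illustration and are not comma's actual calibration:

```python
import cv2
import numpy as np

# Assumed corner correspondences between image space and the top-down view;
# the real values would come from the actual calibration.
src = np.float32([[550, 460], [730, 460], [1280, 720], [0, 720]])  # image corners
dst = np.float32([[0, 0], [400, 0], [400, 600], [0, 600]])         # top-down corners

M = cv2.getPerspectiveTransform(src, dst)  # image -> top-down
M_inv = np.linalg.inv(M)                   # top-down -> image

# Example lane points in top-down coordinates, mapped back to video coordinates.
lane_topdown = np.float32([[[200, 0]], [[200, 300]], [[200, 600]]])
lane_image = cv2.perspectiveTransform(lane_topdown, M_inv)
print(lane_image.reshape(-1, 2))
```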
I noticed the input images go through a perspective transform
AFAIK, it's not the bird's-eye-view perspective transform that you are thinking of. It's just some lens correction.
This is what the input image looks like after the transform.
P.S. You should ask on comma's Discord server for an expert answer.
So yes, it is possible, and I can also do it, but I am not really sure how I should switch between the two versions without always checking a bool. The transform is only there when the camera that recorded the footage is not a Sony IMX298 sensor, so the footage can be handed over to the model; the model really relies on that one sensor. You can find an easier explanation in the webcam tools. If you want to convert the bird's-eye view back into lane lines in the image, you would have to do some localization; you can find the code in locationd.
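For reference, that localization-style projection amounts to mapping road-plane points (in metres) into the camera image with a pinhole model. The sketch below is only an illustration: the intrinsics, camera height, and flat-road/no-pitch assumptions are placeholders, not openpilot's actual calibration from locationd.

```python
import numpy as np

# Assumed camera intrinsics and mounting height; real values come from calibration.
K = np.array([[910.0,   0.0, 582.0],
              [  0.0, 910.0, 437.0],
              [  0.0,   0.0,   1.0]])
CAM_HEIGHT = 1.2  # metres above the road (assumed)

def road_to_image(points_xy):
    """Project road-plane points (x metres forward, y metres left) to pixel coords."""
    pixels = []
    for x, y in points_xy:
        # Camera frame (OpenCV convention): z forward, x right, y down;
        # assumes a level road and no camera pitch/roll.
        cam = np.array([-y, CAM_HEIGHT, x])
        u, v, w = K @ cam
        pixels.append((u / w, v / w))
    return np.array(pixels)

# Example: a lane line 1.8 m to the left, sampled every 10 m ahead of the car.
lane = [(d, 1.8) for d in range(10, 60, 10)]
print(road_to_image(lane))
```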
So I asked in comma ai's Discord group and they said yes, it is possible! I will need to look into it, though.
@MankaranSingh
Great. If you find out how to do it, please let me know.
Thanks,
@kaishijeng you can check the latest pull request; it isn't really perfect, but it should solve your needs.
@MankaranSingh
I tried out your PR and it looks good, even though it is not perfect yet. From my observation, the converted lane lines in video space seem too short, so the upper lane points fall toward the center of the video in the wrong place.
Thanks for the effort.
Kaishi
I will consider this closed. Thanks for the discussion.
@littlemountainman
Will it be possible to overlay the lane/driving prediction onto the original video instead of displaying it in a separate window?
There is a repository below, SuperDrive, which is similar to your code, and I believe it has better memory management. You may want to check it out: https://github.com/NamoDev/SuperDrive
Thanks,