littlemountainman / modeld

Self driving car lane and path detection
https://www.youtube.com/watch?v=UFQQbTYH9hI

lane detection display on the original image space #8

Closed kaishijeng closed 3 years ago

kaishijeng commented 4 years ago

@littlemountainman

Would it be possible to overlay the lane/driving path prediction on the original video instead of displaying it in a separate window?

There is a repository, SuperDrive, which is similar to your code and which I believe has better memory management. You may want to check it out: https://github.com/NamoDev/SuperDrive

Thanks,

MankaranSingh commented 4 years ago

Are you asking for lane detection in image space, or just the lanes (top-down view) displayed in the same window?

kaishijeng commented 4 years ago

Yes, I am asking for two things: 1) convert the lane prediction results from the top-down view to video coordinates, and 2) display/overlay the detected lanes in the same video window.

Item 2 is easy. I am not sure how to do item 1. Kaishi
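
For item 2, a minimal sketch of drawing lane points onto the video frame with plain OpenCV might look like this; the names `overlay_lanes` and `lane_pts` are placeholders for illustration, not functions from this repo:

```python
import cv2
import numpy as np

def overlay_lanes(frame, lane_pts, color=(0, 255, 0)):
    """Draw predicted lane points onto the video frame and blend them in."""
    overlay = frame.copy()
    pts = np.int32(lane_pts).reshape(-1, 1, 2)
    cv2.polylines(overlay, [pts], isClosed=False, color=color, thickness=4)
    # blend so the original frame stays visible underneath the drawn lanes
    return cv2.addWeighted(overlay, 0.6, frame, 0.4, 0)

# usage inside the display loop (illustrative):
# frame = overlay_lanes(frame, predicted_lane_pts)
# cv2.imshow("modeld", frame)
```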

MankaranSingh commented 4 years ago

I guess it isn't that easy to convert the top-down view to image coordinates (or is it possible at all?). The best thing would be to ask this on comma ai's official Discord group.

kaishijeng commented 4 years ago

Is the transform between video space and the top-down view reversible? If not, then it is impossible. If that is the case, it may be better to convert the video into the top-down view and overlay it with the lane prediction drawn in top-down coordinates.
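
If the transform turned out not to be reversible, the fallback described here (warp the video itself into the top-down view and draw the predictions there) could look roughly like the sketch below. The matrix `M_topdown` is a made-up placeholder standing in for a real camera calibration:

```python
import cv2
import numpy as np

# Assumed 3x3 homography from video space to the top-down (bird's-eye) view;
# in practice this would come from camera calibration, not these made-up numbers.
M_topdown = np.array([[0.5,  -1.2,   300.0],
                      [0.0,  -2.0,   600.0],
                      [0.0,  -0.003,   1.0]])

# dummy frame standing in for a decoded video frame
frame = np.zeros((874, 1164, 3), dtype=np.uint8)

# warp the frame into the top-down view; lane predictions are already in
# top-down coordinates, so they can be drawn directly onto `topdown`
topdown = cv2.warpPerspective(frame, M_topdown, (400, 800))
```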

MankaranSingh commented 4 years ago

They directly predict the top-down coordinates and separately predict the lane lines in image coordinates for "UI purposes". There is no relation between the two.

I am still not sure, but this is what a comma staff member told me.

EDIT: I was wrong, the output is indeed converted to image space!

kaishijeng commented 4 years ago

I think comma must use a deterministic transform (calibrated with its camera) to map video space to the bird's-eye view domain. Otherwise, how could it get the training dataset? The original images in the training dataset must be in video coordinates. I notice the input images go through a perspective transform before being fed to the CNN. Can we use the inverse of this perspective transform to convert the detected lanes back to video coordinates?
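
Assuming that transform really is a plain 3x3 homography, mapping predicted lane points back into video coordinates would just be the inverse matrix applied with cv2.perspectiveTransform. The matrix values below are illustrative only, not the actual transform used in this repo:

```python
import cv2
import numpy as np

# Hypothetical 3x3 warp matrix that maps the camera frame into the model's
# input space (the real matrix would come from the repo's transform code).
M = np.array([[1.2, 0.0, -120.0],
              [0.0, 1.2,  -80.0],
              [0.0, 0.0,    1.0]])
M_inv = np.linalg.inv(M)

# lane_points: Nx2 array of (x, y) lane coordinates in the transformed space
lane_points = np.array([[100.0, 400.0], [110.0, 300.0], [125.0, 200.0]])

# cv2.perspectiveTransform expects shape (N, 1, 2)
pts = lane_points.reshape(-1, 1, 2).astype(np.float64)
pts_video = cv2.perspectiveTransform(pts, M_inv).reshape(-1, 2)

print(pts_video)  # lane points mapped back into original video coordinates
```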

MankaranSingh commented 4 years ago

I notice the input images go through a perspective transform

AFAIK, it's not the bird's-eye-view perspective transform you are thinking of. It's just some lens correction.

This is what the input image looks like after transforming:

[screenshot: the input frame after the transform]

P.S. You should ask on comma's Discord server for an expert answer.

littlemountainman commented 4 years ago

So yes, it is possible, and I can also do it, but I am not really sure how I should switch between the two versions without always checking a bool. The transform is only there so that footage recorded with a camera other than the Sony IMX298 sensor can be handed over to the model; the model really relies on that one sensor. You can find an easier explanation in the webcam tools. If you are thinking of somehow converting the bird's-eye view back into lane lines, you would have to do some localization; you can find the code in locationd.
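
One way to avoid checking a bool on every frame would be to resolve the choice once at startup and keep a single draw function; `draw_overlay` and `draw_separate_window` below are hypothetical names, not functions from this repo:

```python
import argparse

def draw_overlay(frame, lanes):
    ...  # hypothetical: draw lanes onto the original video frame

def draw_separate_window(frame, lanes):
    ...  # hypothetical: show the top-down view in its own window

parser = argparse.ArgumentParser()
parser.add_argument("--overlay", action="store_true",
                    help="draw predictions on the original video")
args = parser.parse_args()

# resolve the choice once; the per-frame loop just calls draw(frame, lanes)
draw = draw_overlay if args.overlay else draw_separate_window
```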

MankaranSingh commented 4 years ago

So I asked in comma ai's Discord group and they said yes, it is possible! I will need to look into it, though.

kaishijeng commented 4 years ago

@MankaranSingh

Great. If you find out how to do it, please let me know

Thanks,

MankaranSingh commented 4 years ago

@kaishijeng you can check the latest pull request; it isn't perfect, but it should meet your needs.

kaishijeng commented 4 years ago

@MankaranSingh

I tried out your PR and it looks good, even though it is not perfect yet. From my observation, the converted lane line in video space seems too short, so the upper lane points fall toward the center of the video in the wrong place.

Thanks for the effort.

Kaishi

littlemountainman commented 3 years ago

I will consider this closed. Thanks for the discussion.