PatD123 / Crop-Lane-Detect

Lane Detection using DBSCAN + IPM + A Bit of Temporal Smoothing

Recommendation for using Segmentation Models #2

Open YasinBedirhanSimsek opened 5 months ago

YasinBedirhanSimsek commented 5 months ago

Hey, I have been working on a similar project

Instead of using Hough lines etc., you might want to look into PIDNet

It is a new segmentation network that works well enough and it is quite fast

However, you need to label your left and right lanes to train it but you could use your current code to speed that up
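The idea of bootstrapping labels from the existing classical pipeline could look something like this. This is a minimal sketch under the assumption that the current code produces one fitted line per lane in the form x = m*y + b in image coordinates; the function name and label scheme (0 = background, 1 = left, 2 = right) are illustrative, not from either repo:

```python
import numpy as np

def rasterize_lane_labels(h, w, left_line, right_line, thickness=6):
    """Turn two fitted lane lines (x = m*y + b, image coords) into a
    3-class label mask: 0 = background, 1 = left lane, 2 = right lane."""
    mask = np.zeros((h, w), dtype=np.uint8)
    ys = np.arange(h)
    for cls, (m, b) in ((1, left_line), (2, right_line)):
        xs = (m * ys + b).astype(int)
        for y, x in zip(ys, xs):
            # Paint a band of `thickness` pixels around the line, clipped to the image
            lo, hi = max(0, x - thickness // 2), min(w, x + thickness // 2 + 1)
            if lo < hi:
                mask[y, lo:hi] = cls
    return mask

# Example: a near-vertical left lane around x = 40 and right lane around x = 200
label = rasterize_lane_labels(240, 320, left_line=(0.1, 40), right_line=(-0.1, 200))
```

Auto-generated masks like this would still need a quick manual pass to discard frames where the classical detector was wrong, but that is much faster than labeling from scratch.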

Another negative is the PIDNET repo itself. You need to modify some stuff to start training but there are solutions to most of the problems you might see. Other people asked about them in the issues section

PatD123 commented 5 months ago

Hey @YasinBedirhanSimsek , Thanks for the suggestion! I'm currently looking through the paper and the codebase for PIDNet. Does this model do lane detection itself? It seems to just do semantic segmentation. If that's the case, wouldn't we still need a lane detector model (or algorithm) to extract the lane lines after segmentation?

YasinBedirhanSimsek commented 5 months ago

Yes, it is not designed for crop lanes specifically, but you can still use it to detect them. Tracking them is another story. I put up a basic Gazebo simulator with a PX4-controlled rover, recorded some videos driving around, and labeled frames from those videos; this is one of the results. image

But since it already gives you a pretty good approximation for the line that fits best on the crop lane pixels, it might be useful. It is also able to distinguish between left and right lanes. However, sometimes if the rover rotates enough it can see more than one lane (it sees the lanes for neighbor crop rows). With better training data, this could be mitigated. One other issue is sometimes it doesn't give a single continuous line for a lane but gives it in divided blobs.

The PIDNet codebase was written by a Chinese team, so some of the issues are filed in Chinese, and those threads contain good information on some of the problems. For example, if you give a ground-truth label with only the background class and no other class, training fails (because of the poor quality of the training code; the model architecture itself can handle it), etc... Also, some parts of the code are hard-coded for the authors' use cases. If you use the CamVid dataset, you'll see in the codebase that the max epoch is limited to 120, so you have to modify that if you want more than 120 epochs of training.

PatD123 commented 5 months ago

Sorry, I'm still kinda confused about how PIDNet would be used to detect lanes. Are you saying that the two lanes of crops (the crops themselves) would be segmented and masked to a certain color, and then wherever that color appears would be part of the left or right lane?

YasinBedirhanSimsek commented 5 months ago

You can use it to segment the crop lanes in any way you want. The important thing is, PIDNet works really fast and well enough.

You can painstakingly label each crop plant (pixel-perfect, not including any dirt pixels) on the right and left sides of the rover as red and blue. This would give you only the plants. After inference, you could fit a line to them. However, I believe this would be unreliable because of the crop shapes, crop sizes, occlusion, etc.

You might also label the actual dirt path for it to detect, with or without the crops, kind of like a real car lane detector.

In my case:

The image I uploaded previously is the inference result overlayed on the input image. After some post-processing, the model outputs an RGB image where red pixels belong to the "RIGHT-SIDED" crop lane and blue pixels belong to the "LEFT-SIDED" crop lane.
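Splitting that RGB output back into per-lane masks is a simple color threshold. A hedged sketch, assuming the post-processed output uses pure-ish red for the right lane and pure-ish blue for the left (the exact colors and tolerance are assumptions, not taken from the PIDNet code):

```python
import numpy as np

def masks_from_rgb(pred_rgb, tol=40):
    """Split an RGB prediction image into boolean per-lane masks.
    Assumes red-ish pixels mark the right lane, blue-ish the left."""
    r, g, b = pred_rgb[..., 0], pred_rgb[..., 1], pred_rgb[..., 2]
    right = (r > 255 - tol) & (g < tol) & (b < tol)
    left = (b > 255 - tol) & (g < tol) & (r < tol)
    return left, right

img = np.zeros((4, 4, 3), dtype=np.uint8)
img[0, 0] = (250, 10, 5)    # red-ish pixel -> right lane
img[3, 3] = (5, 10, 250)    # blue-ish pixel -> left lane
left, right = masks_from_rgb(img)
```

In practice it is usually cleaner to take the class-index map before it is colorized, if the inference code exposes it; the color threshold is just the fallback when only the RGB overlay is available.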

When labeling, I just drew one red and one blue line on top of the right and left crop lanes the rover should follow. The rover is in between them.

In this image, the ground-truth segmentation mask and the input image for it are overlayed on top of each other. This is how I labeled it.

The model learns what a crop lane on the left or right side of the rover looks like.

I just used it to detect two lanes, but you could make every crop lane white in the label mask and detect every crop lane the same way.

image
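For the every-lane-white variant, one simple way to separate individual lanes again is peak-picking on the column histogram of the binary mask, similar to the base-finding step in classic sliding-window lane detection. A minimal sketch, assuming lanes are roughly vertical in the (possibly IPM-warped) image; the function name and thresholds are illustrative:

```python
import numpy as np

def lane_base_columns(binary_mask, min_pixels=5, min_gap=10):
    """Find candidate lane x-positions in a binary every-lane-white mask
    by peak-picking the column histogram. Assumes roughly vertical lanes."""
    hist = binary_mask.sum(axis=0)
    bases = []
    for x in np.argsort(hist)[::-1]:          # strongest columns first
        if hist[x] < min_pixels:
            break                              # remaining columns too weak
        if all(abs(x - b) >= min_gap for b in bases):
            bases.append(int(x))               # new peak, far from existing ones
    return sorted(bases)

mask = np.zeros((50, 120), dtype=np.uint8)
mask[:, 20] = 1    # lane 1
mask[:, 60] = 1    # lane 2
mask[:, 100] = 1   # lane 3
print(lane_base_columns(mask))  # [20, 60, 100]
```

Each detected base column can then seed a per-lane line fit (or a sliding-window search) over the nearby white pixels.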

YasinBedirhanSimsek commented 5 months ago

For this input image: image

The inference output, after post-processing, looks like this: image

PatD123 commented 5 months ago

@YasinBedirhanSimsek Ohhhh I see, thanks so much for clarifying, I get it now. How fast can this be in real time? And how does it compare to Deep Hough Transform or something like that? Because for some of the research I'm doing, we're looking into a CV line detector, so PIDNet could really help.
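For the real-time question, throughput is easy to measure directly on the target hardware with a small timing harness. A generic sketch using only the standard library; `fake_infer` is a stand-in for the real forward pass (e.g. a PIDNet call), and the numbers it produces here are obviously not PIDNet's:

```python
import time

def benchmark_fps(infer_fn, frame, warmup=3, iters=20):
    """Rough frames-per-second estimate for any inference callable."""
    for _ in range(warmup):              # warm caches / trigger lazy init
        infer_fn(frame)
    t0 = time.perf_counter()
    for _ in range(iters):
        infer_fn(frame)
    dt = time.perf_counter() - t0
    return iters / dt

# Stand-in for a real model call; sleeps ~2 ms to simulate inference time
def fake_infer(frame):
    time.sleep(0.002)
    return frame

fps = benchmark_fps(fake_infer, frame=None)
```

For GPU models, remember to synchronize the device (e.g. `torch.cuda.synchronize()` in PyTorch) before reading the clock, otherwise the timer only measures kernel launches.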