SeokjuLee / VPGNet

VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition (ICCV 2017)
MIT License

Hi, I have some problems with your post-processing for the lane detection. #4

Open weitaoatvison opened 6 years ago

weitaoatvison commented 6 years ago

Hi, I found that your paper reports the algorithm running at 20 fps, so I am confused: is that speed just the network inference time, or does it include the post-processing (clustering and curve fitting)? If it includes the post-processing, could you share your post-processing code? Thanks for your good work!

SeokjuLee commented 6 years ago

@weitaoatvison Hi, the algorithm speed, 20 fps, includes both the forward pass and the post-processing time. Specifically, a single forward pass takes about 30 ms (on a single Titan X) and the post-processing takes about 20 ms or less. For the post-processing code, we would need additional permission from Samsung Research.. (T_T) I recommend re-implementing it by following our post-processing section (except the VP part), since it is only a few lines.
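For reference, a rough sketch of what such a post-processing stage could look like (this is not the released code; the threshold, grid size, and quadratic curve model are assumptions):

```python
import numpy as np

def postprocess_lane_channel(prob_map, thresh=0.5, poly_deg=2):
    """Sketch: sample seed points from a per-class probability map (H x W),
    then fit a polynomial x = f(y) through them. The threshold and the
    quadratic model are illustrative choices, not VPGNet's exact ones."""
    ys, xs = np.nonzero(prob_map > thresh)   # candidate cells above threshold
    if len(ys) < poly_deg + 1:
        return None                          # not enough evidence for a fit
    coeffs = np.polyfit(ys, xs, poly_deg)    # curve fitting: x as a function of y
    return np.poly1d(coeffs)

# Example on a dummy 60x80 probability map with a fake vertical lane response
prob_map = np.zeros((60, 80), dtype=np.float32)
prob_map[10:50, 40] = 0.9
curve = postprocess_lane_channel(prob_map)
if curve is not None:
    print([float(curve(y)) for y in (10, 30, 49)])
```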

weitaoatvison commented 6 years ago

OK, thanks for your reply. I have done similar work, but the post-processing time is too long, so I am interested in your post-processing code. Haha~

SeokjuLee commented 6 years ago

@weitaoatvison Here are some helpful libraries and functions, which are fast. Please refer to them.
Lane seed sampling: https://github.com/MonsieurV/py-findpeaks
IPM: https://docs.opencv.org/2.4/modules/core/doc/operations_on_arrays.html#perspectivetransform
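In case it helps, a small usage sketch of OpenCV's `perspectiveTransform` for IPM; the four point correspondences below are made-up placeholders, not calibration values from the paper:

```python
import cv2
import numpy as np

# Four image points (e.g. corners of a road region) and where they should
# land in the bird's-eye (IPM) view. These coordinates are placeholders.
src_pts = np.float32([[200, 300], [440, 300], [620, 470], [20, 470]])
dst_pts = np.float32([[0, 0], [400, 0], [400, 600], [0, 600]])

H = cv2.getPerspectiveTransform(src_pts, dst_pts)   # 3x3 homography

# Map sampled lane seed points into the IPM view.
seeds = np.float32([[[320, 400]], [[330, 350]]])    # shape (N, 1, 2)
ipm_seeds = cv2.perspectiveTransform(seeds, H)
print(ipm_seeds.reshape(-1, 2))
```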

weitaoatvison commented 6 years ago

@SeokjuLee Thanks very much! In our implementation, we use an IPM that we implemented ourselves. I am wondering how to use https://github.com/MonsieurV/py-findpeaks for lane seed sampling; I see that this code is used for peak detection.

SeokjuLee commented 6 years ago

@weitaoatvison It's for the lane clustering (visualization). It lets you subsample peak points from the probability map. Please refer to our paper (Section 4.4).
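A rough illustration of that row-wise peak subsampling, here using `scipy.signal.find_peaks` instead of the linked py-findpeaks implementations (the height and distance parameters are assumptions):

```python
import numpy as np
from scipy.signal import find_peaks

def sample_lane_seeds(prob_map, min_height=0.3, min_distance=3):
    """Scan each row of an H x W lane probability map and keep local maxima
    as seed points for clustering. Parameter values are illustrative only."""
    seeds = []
    for y in range(prob_map.shape[0]):
        peaks, _ = find_peaks(prob_map[y], height=min_height, distance=min_distance)
        seeds.extend((x, y) for x in peaks)
    return seeds

prob_map = np.random.rand(60, 80).astype(np.float32) * 0.2
prob_map[:, 30] = 0.9                       # a fake lane column
print(sample_lane_seeds(prob_map)[:5])
```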

weitaoatvison commented 6 years ago

@SeokjuLee Thanks! I will check the paper again!

ArtyomKa commented 6 years ago

Hi. Where can I find an explanation of the structure of the multi-label classification task output? I understand that the 80x60 part is the input image size divided by 8, but what is the 64 part?

SeokjuLee commented 6 years ago

@ArtyomKa That dimension is the number of class types. The 64 channels of the multi-label task include auxiliary classes.

ArtyomKa commented 6 years ago

@SeokjuLee Thanks.. In the paper (Table 2) only about 20 classes are listed. Are there an additional 44 auxiliary classes? Are the first 20 the ones in the table, or am I missing something?

SeokjuLee commented 6 years ago

@ArtyomKa Yes, there is no problem as long as it is set larger than the number of classes to be detected. In my work there are 17 classes, and the remaining 47 (= 64 - 17) channels receive an empty gradient. Those remaining 47 channels are treated as meaningless classes.
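In other words, only the first 17 of the 64 output channels carry trained classes. A small sketch of reading the prediction while ignoring the unused padding channels (the channel count and ordering here are assumptions based on this thread):

```python
import numpy as np

NUM_CLASSES = 17   # trained classes per this thread; channels 17..63 are unused padding

def decode_multilabel(output):
    """output: (64, 60, 80) score/probability tensor from the multi-label branch.
    Keep only the meaningful channels and take the per-cell argmax."""
    meaningful = output[:NUM_CLASSES]      # drop the padded channels
    return meaningful.argmax(axis=0)       # (60, 80) class index per grid cell

output = np.random.rand(64, 60, 80).astype(np.float32)
labels = decode_multilabel(output)
print(labels.shape, labels.max())          # (60, 80), max index < 17
```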

ArtyomKa commented 6 years ago

@SeokjuLee OK, makes sense! Thank You!

chengm15 commented 6 years ago

@SeokjuLee I have a question about lane detection. Unlike road markings, a lane is thin and long. According to Figure 3 in your paper, the size of the grid box output is 120x160x4, so there are 120x160 bounding boxes. I want to know whether the labeled bounding box covers the entire lane or just a part of it.

SeokjuLee commented 6 years ago

@chengm15 The labeled bounding box covers just a part of a lane. By generating grid-level boxes to train on, we did not have to consider the various scales.
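For anyone preparing data, one way such grid-level labels could be generated from a pixel-level lane mask is sketched below; the 8x8 cell size is an assumption and should be adjusted to the network's actual grid stride:

```python
import numpy as np

CELL = 8   # grid cell size in pixels (an assumption, not taken from the paper's code)

def lane_mask_to_grid_labels(mask):
    """mask: (H, W) binary lane mask. Mark every CELL x CELL cell that overlaps
    the lane, so each positive cell is a local box covering only a piece of the lane."""
    H, W = mask.shape
    grid = np.zeros((H // CELL, W // CELL), dtype=np.uint8)
    for gy in range(grid.shape[0]):
        for gx in range(grid.shape[1]):
            patch = mask[gy * CELL:(gy + 1) * CELL, gx * CELL:(gx + 1) * CELL]
            grid[gy, gx] = 1 if patch.any() else 0
    return grid

mask = np.zeros((480, 640), dtype=np.uint8)
mask[:, 318:322] = 1                           # a thin vertical lane, 4 px wide
print(lane_mask_to_grid_labels(mask).sum())    # number of positive grid cells
```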

HuifangZJU commented 6 years ago

@SeokjuLee Hi, what are the indices of the 17 meaningful channels in the 64-channel output of the multi-label task? Are they the first 17 channels, corresponding one-to-one with your Table 2? I printed out the probabilities, and most grid cells belong to the first class, so I suppose the first channel is some background class? Or am I wrong?