MaybeShewill-CV / lanenet-lane-detection

Unofficial implementation of lanenet model for real time lane detection
Apache License 2.0

Some thoughts on training Lanenet with CUlane dataset #183

Closed: guozixunnicolas closed this issue 2 years ago

guozixunnicolas commented 5 years ago

Hi everyone, I have been trying to train LaneNet on the CULane dataset for the past 2 weeks, and I want to share some thoughts on the results.

First, to get training started, a few things need to be modified:

  1. Remove any empty images (i.e. binary labels that are completely black); otherwise the loss becomes NaN.
  2. Convert all the jpg files to png (I don't know why jpg won't work for LaneNet; if you have any idea, please comment below).
  3. Change the L2 norm to the L1 norm in the discriminative loss file by simply adding the argument ord=1; otherwise the gradient becomes infinite, which makes training unstable (the loss is NaN).
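
A small numeric illustration of why point 3 can help. It assumes the NaN comes from taking the L2 norm of an exactly-zero vector; the thread does not confirm where such a zero arises. The function names are made up for this sketch, and it uses NumPy rather than the repo's TensorFlow code. The epsilon variant is a common workaround that is not mentioned in this thread.

```python
import numpy as np


def l2_norm_grad(x):
    """Analytic gradient of ||x||_2, i.e. x / ||x||_2 (0/0 at the origin)."""
    x = np.asarray(x, dtype=float)
    with np.errstate(invalid="ignore", divide="ignore"):
        return x / np.sqrt(np.sum(x * x))


def l1_norm_grad(x):
    """Subgradient of ||x||_1: sign(x), which is finite everywhere."""
    return np.sign(np.asarray(x, dtype=float))


def safe_l2_norm_grad(x, eps=1e-12):
    """Gradient of sqrt(sum(x^2) + eps); eps keeps the denominator non-zero."""
    x = np.asarray(x, dtype=float)
    return x / np.sqrt(np.sum(x * x) + eps)


zero = np.zeros(4)
print(l2_norm_grad(zero))       # every entry is nan (0/0)
print(l1_norm_grad(zero))       # every entry is 0
print(safe_l2_norm_grad(zero))  # every entry is 0
```

If this is indeed the cause, keeping the L2 norm and adding a small epsilon under the square root might be an alternative to switching norms, but that is untested here.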

Without these changes, training behaves unpredictably (for example, it stops halfway or cannot start at all).
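
A minimal sketch of the first modification (dropping samples whose binary label is completely black). It assumes the labels are already loaded as NumPy arrays; the helper names and the sample names in the demo are hypothetical, and loading the files is left to whatever image library you use.

```python
import numpy as np


def has_lane_pixels(binary_label):
    """Return True if the binary label has at least one non-black pixel."""
    return bool(np.any(np.asarray(binary_label) > 0))


def split_samples(samples):
    """Split (name, binary_label) pairs into kept and dropped sample names."""
    kept, dropped = [], []
    for name, label in samples:
        if has_lane_pixels(label):
            kept.append(name)
        else:
            dropped.append(name)
    return kept, dropped


# Tiny demo with synthetic labels (hypothetical sample names).
empty = np.zeros((4, 6), dtype=np.uint8)
with_lane = empty.copy()
with_lane[2, 1:5] = 255
kept, dropped = split_samples([("a.png", empty), ("b.png", with_lane)])
print(kept, dropped)
```

Only the kept names would then be written to the training index file.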

To fine-tune the model, I tried decreasing the learning rate. The original learning rate is 0.0005. Values of 0.0004 or 0.00045 let the model converge in a reasonably short time. When I lowered it to 0.0003 or 0.0002, the model converged far too slowly; it may take days. In my few tries, the accuracy on the val set converged to about 60%. Below is my training summary:

[Screenshot from 2019-03-11 10-03-25]

At first I thought the model might be stuck at a local minimum, so I increased the momentum from 0.9 to 0.95. The result was roughly the same.

Then I realized that it has nothing to do with local minima. Looking through the CULane dataset, I think the reason for the low accuracy (correct me if I'm wrong) is that LaneNet does not perform well under occlusion or other harsh road conditions. In some pictures all lanes are completely occluded by cars, but the lane labels are still present in the ground truth. That is why the message passing module in SCNN can be effective. It may also be why the binary loss has such a high variance in the TensorBoard summary above: sometimes the model cannot predict that a lane is there.

In summary, LaneNet doesn't perform as well as SCNN on the CULane dataset, even though it can converge. Please correct me if I am wrong anywhere above (I am just a student, so don't judge me, haha). Also, if you have any idea (1) why the accuracy cannot get above 60% or so, or (2) why the binary loss has a high variance, please comment below.

Thanks to the author for the great work and instant replies, and thanks to @cardwing and @PenghuahuaPeng for their contributions as well.

Best regards,

ZX(Nicolas)

MaybeShewill-CV commented 5 years ago

@zguo008 jpg may be unsuitable because JPEG compression is lossy, so the actual pixel values of a jpg image can be modified when the file is generated (JPEG and PNG use different compression methods). You may want to check whether the jpg labels' values are correct :)
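
One way to run that check, sketched under the assumption that a binary label should contain only the values 0 and 255 and has already been loaded as a NumPy array. The function names and the threshold are illustrative and do not come from the repo.

```python
import numpy as np


def unexpected_label_values(label, allowed=(0, 255)):
    """List pixel values in the label that are not in the allowed set."""
    return [int(v) for v in np.unique(label) if int(v) not in allowed]


def binarize_label(label, threshold=127):
    """Snap a noisy binary label back to exactly 0 or 255."""
    return np.where(np.asarray(label) > threshold, 255, 0).astype(np.uint8)


# Synthetic label with the kind of near-0 / near-255 noise lossy compression can add.
noisy = np.array([[0, 3], [252, 255]], dtype=np.uint8)
print(unexpected_label_values(noisy))
print(binarize_label(noisy))
```

Thresholding like this only makes sense for the binary labels. Instance labels carry one gray level per lane, so a lossy round trip cannot be undone with a single threshold, which is one more reason to store labels as png.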

guozixunnicolas commented 5 years ago

Thanks for the reply:)

CHYangzzz commented 5 years ago

@zguo008 Hi, I tried to train LaneNet with CULane, but the accuracy stays at a very low value.

[WeChat screenshot 2019-04-11 10:28:22]

I followed your tips 1 & 2 and added ord=1 to every norm in "lanenet_discriminative_loss.py". Could you please give me some suggestions?

zacario-li commented 5 years ago

@CHYangzzz Maybe you can train the binary seg branch first. Once the predicted binary_seg_img looks good enough, continue training the whole network.
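
A rough sketch of what such a two-stage schedule could look like, written as plain Python so it stays independent of the TensorFlow version. The function name and the default step count are hypothetical, and the repo's actual total loss may contain further terms (for example regularization) that are not shown here.

```python
def staged_loss(step, binary_loss, instance_loss, binary_only_steps=2000):
    """Pick the loss to optimize at a given training step.

    Stage 1 (step < binary_only_steps): binary segmentation loss only.
    Stage 2 (afterwards): binary loss plus instance (discriminative) loss.
    """
    if step < binary_only_steps:
        # The instance loss is ignored entirely, so a NaN there cannot leak in.
        return binary_loss
    return binary_loss + instance_loss


print(staged_loss(0, 0.5, 2.0))     # stage 1: binary loss only
print(staged_loss(2000, 0.5, 2.0))  # stage 2: sum of both losses
```

In a graph-based setup the same idea would probably be expressed as two training ops, one minimizing only the binary loss and one minimizing the total loss, with the training loop running the first op until the binary predictions look reasonable.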

guozixunnicolas commented 5 years ago

@CHYangzzz @zacario-li's answer makes sense to me. The low accuracy is because the model cannot detect some lanes, mainly due to the binary branch. You can give it a try :)

CHYangzzz commented 5 years ago

@zacario-li @zguo008 Thanks for your advice

yinhai86924 commented 5 years ago

Hello! How do I convert the CULane dataset into the LaneNet dataset format? It seems CULane can provide binary image labels but no instance segmentation labels. What tools did you use for this? Thank you!
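
One possible approach, sketched under two assumptions that should be checked against your copy of the data and the repo's dataset script: (a) the CULane segmentation label pngs store 0 for background and 1 to 4 for the lane index, and (b) LaneNet expects a 0/255 binary mask plus an instance mask with one distinct gray level per lane. The function name and the gray levels are arbitrary choices for this illustration.

```python
import numpy as np


def culane_seg_to_lanenet_labels(seg, gray_step=50, gray_offset=20):
    """Derive a binary mask and an instance mask from a lane-index label.

    seg: 2-D integer array, 0 = background, k > 0 = pixels of lane k (assumed).
    Returns (binary, instance) as uint8 arrays of the same shape.
    """
    seg = np.asarray(seg).astype(np.int32)
    binary = np.where(seg > 0, 255, 0).astype(np.uint8)
    instance = np.where(seg > 0, seg * gray_step + gray_offset, 0)
    if instance.max() > 255:
        raise ValueError("too many lanes for the chosen gray levels")
    return binary, instance.astype(np.uint8)


seg = np.array([[0, 1], [2, 4]])
binary, instance = culane_seg_to_lanenet_labels(seg)
print(binary)
print(instance)
```

Both outputs should be saved as png so that the gray levels survive unchanged.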

sravanje commented 5 years ago

For me, the loss becomes NaN unless I resize the CULane images to 720x1280. Do the dimensions matter for training with LaneNet? Side note: even when I resize them and train, the accuracy is still very bad.
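
One thing worth checking when resizing: label images should be resized with nearest-neighbour sampling, because interpolating between label values creates pixel values that belong to no class. Whether this explains the bad accuracy here is only a guess. Below is a NumPy-only sketch with a hypothetical function name; the 590x1640 input size is the native CULane resolution as far as I know.

```python
import numpy as np


def resize_nearest(label, out_h, out_w):
    """Nearest-neighbour resize that never introduces new pixel values."""
    label = np.asarray(label)
    in_h, in_w = label.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return label[rows[:, None], cols]


label = np.zeros((590, 1640), dtype=np.uint8)
label[300:590, 800:816] = 255  # synthetic lane stripe
resized = resize_nearest(label, 720, 1280)
print(resized.shape, np.unique(resized))
```

The RGB input image can still be resized with a smoother interpolation; only the binary and instance labels need this treatment.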

TGLTommy commented 5 years ago

@zguo008 Sorry to ask, but how did you set up training for the CULane dataset? How do I convert the CULane format into the TuSimple format? I only started studying lane detection recently.
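
A possible starting point for the format conversion, not a tested pipeline. It assumes each line of a CULane `.lines.txt` annotation is one lane given as alternating `x y` values, and that a TuSimple label is a JSON object with `lanes`, `h_samples` and `raw_file`, using -2 where a lane has no point at a given height. The function names, the sample heights and the file path in the demo are made up, and the different image resolutions of the two datasets are not handled.

```python
import json

import numpy as np


def parse_culane_lines(text):
    """Parse annotation text into a list of lanes, each a list of (x, y)."""
    lanes = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split()]
        points = list(zip(values[0::2], values[1::2]))
        if len(points) >= 2:
            lanes.append(points)
    return lanes


def lanes_to_tusimple(lanes, h_samples, raw_file):
    """Sample each lane's x at fixed heights; -2 marks heights without a point."""
    out = []
    for points in lanes:
        pts = sorted(points, key=lambda p: p[1])  # np.interp needs ascending y
        xs = np.array([p[0] for p in pts])
        ys = np.array([p[1] for p in pts])
        row = []
        for h in h_samples:
            if ys[0] <= h <= ys[-1]:
                row.append(int(round(float(np.interp(h, ys, xs)))))
            else:
                row.append(-2)
        out.append(row)
    return {"lanes": out, "h_samples": list(h_samples), "raw_file": raw_file}


record = lanes_to_tusimple(
    parse_culane_lines("100 500 200 400\n"),
    h_samples=[400, 450, 500, 550],
    raw_file="example/00000.jpg",  # hypothetical path
)
print(json.dumps(record))
```

Each record would be written as one JSON line per image. For training LaneNet itself, the binary and instance label images still have to be generated from these lanes or from the CULane segmentation labels.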

TGLTommy commented 5 years ago

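@zacario-li @CHYangzzz Hello to you both. I'm a beginner and would like to ask two questions: 1. How do I convert the CULane format to the TuSimple format? 2. How do I compute accuracy, FP, and FN? If you have source code, could you share it? I've been stuck on this for several days; I wrote some code myself, but it keeps failing. I'd appreciate any pointers.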
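
For the second question, here is a simplified sketch in the spirit of the TuSimple-style metric. It is not the official evaluation script: it uses a fixed pixel threshold instead of a threshold scaled by lane angle, and the function names and default values are my own assumptions. Lanes are lists of x coordinates at shared sample heights, with negative values meaning "no point".

```python
import numpy as np


def line_accuracy(pred, gt, thresh=20.0):
    """Fraction of sample heights where pred is within thresh pixels of gt."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Map "no point" to a common sentinel so absent-vs-absent counts as correct.
    pred = np.where(pred >= 0, pred, -100.0)
    gt = np.where(gt >= 0, gt, -100.0)
    return float(np.mean(np.abs(pred - gt) < thresh))


def evaluate_image(pred_lanes, gt_lanes, thresh=20.0, match_acc=0.85):
    """Return (accuracy, fp_rate, fn_rate) for one image."""
    matched = 0
    acc_sum = 0.0
    for gt in gt_lanes:
        best = max((line_accuracy(p, gt, thresh) for p in pred_lanes), default=0.0)
        acc_sum += best
        if best >= match_acc:
            matched += 1
    accuracy = acc_sum / len(gt_lanes) if gt_lanes else 0.0
    fp = max(len(pred_lanes) - matched, 0) / len(pred_lanes) if pred_lanes else 0.0
    fn = (len(gt_lanes) - matched) / len(gt_lanes) if gt_lanes else 0.0
    return accuracy, fp, fn


gt_lanes = [[100, 110, 120, -2], [300, 310, 320, 330]]
pred_lanes = [[102, 111, 121, -2], [600, 600, 600, 600]]
print(evaluate_image(pred_lanes, gt_lanes))
```

In the demo, the first ground-truth lane is matched and the second is not, so half of the predictions count as false positives and half of the ground-truth lanes as false negatives. Dataset-level numbers would be averages of these per-image values.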

BurkeyLai commented 4 years ago

@zacario-li, @zguo008, @CHYangzzz: How do I train the binary seg branch first? Should I modify the network? Could you please give me some suggestions? Many thanks! In train_lanenet.py:

_, c, train_accuracy, train_summary, binary_loss, instance_loss, embedding, binary_seg_img = \
    sess.run([optimizer, total_loss,
              accuracy,
              train_merge_summary_op,
              binary_seg_loss,
              disc_loss,
              pix_embedding,
              out_logits_out],
             feed_dict={input_tensor: gt_imgs,
                        binary_label_tensor: binary_gt_labels,
                        instance_label_tensor: instance_gt_labels,
                        phase: phase_train})

shijia-web commented 4 years ago

@TGLTommy Hello, have you found a way to convert the CULane dataset to the TuSimple dataset format? I have also just started learning lane detection and wanted to ask whether you found a solution. Also, I have some source code for computing accuracy, FP, and FN, if you still need it.

dpramirez commented 4 years ago

@zguo008 Could someone share a dataset or the ckpt files with me? I have had many problems trying to train with my own data, so if someone has already trained on a larger amount of data, I would be happy to receive it. Thanks.

vanilla000 commented 4 years ago

Why is it that, after training on the TuSimple data, the results are so poor when I test on my own data? Can anyone explain the reason?

vanilla000 commented 4 years ago

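Hi, I'd like to ask whether tusimple_ipm_remap.yml converts the whole current frame into a bird's-eye view. If so, when this is used for autonomous driving, do the parameters in this yml need to be updated in real time?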

AndrewJSong commented 4 years ago

@shijia-web Hello, could you share the source code for computing accuracy, FP, and FN with me? I would be very grateful! 40592080@qq.com