voldemortX / pytorch-auto-drive

PytorchAutoDrive: Segmentation models (ERFNet, ENet, DeepLab, FCN...) and Lane detection models (SCNN, RESA, LSTR, LaneATT, BézierLaneNet...) based on PyTorch with fast training, visualization, benchmarking & deployment help
BSD 3-Clause "New" or "Revised" License

Questions about the number of lanes #36

Closed August-en closed 2 years ago

August-en commented 3 years ago

In open-source datasets, there are limits on the maximum number of lanes (TuSimple is 5, CULane is 4, ...).

I want to train a lane detection algorithm on my own dataset, and I have preprocessed my annotations into the same format as TuSimple, but my maximum number of lanes is larger than 5 (maybe 8 or 9). What will happen if I directly use your codebase to train on my dataset without any modification?

If I need to modify some code to fit my number of lanes, can you tell me where I should modify it?

Thanks a lot :)

voldemortX commented 3 years ago

Args related to TuSimple max lane numbers are in config.yaml, from L121-L126. Some methods have other parameters that may be affected; for instance,

LSTR restricts the max number of predictions to 7: https://github.com/voldemortX/pytorch-auto-drive/blob/4c7e1b9eec592f0bd0637133cd34e9969319df18/torchvision_models/lane_detection/lstr.py#L32

And the bottleneck in the segmentation lane existence branch (5 in this case) should not fall too far from MAX_LANE: https://github.com/voldemortX/pytorch-auto-drive/blob/4c7e1b9eec592f0bd0637133cd34e9969319df18/torchvision_models/common_models.py#L277

voldemortX commented 3 years ago

That's about all I can recall that depends on max lane numbers. One thing is worth noting, though: TuSimple has 5 lanes, but segmentation methods must see it as 6 (left and right).

August-en commented 3 years ago

Thank you very much for your reply.

Actually, I want to train LSTR on my dataset. As mentioned above, I have preprocessed my annotations into the same format as the TuSimple JSON files. But I found there are two more annotation files in your codebase: segmentation labels and list6_train.txt.

Your two extra annotation files seem to contain annotations about which lane is which from left to right. But I do not have this annotation; I only know how many lanes there are in an image. If I just give the lanes indexes in range(0, max number of lanes in the image) from left to right, and then generate the segmentation label map, will everything run normally?

Furthermore, what's the difference between the annotations 0 1 1 1 1 0 0 and 0 0 1 1 1 1 0 in LSTR?

voldemortX commented 3 years ago

I don't think you need segmentation labels for LSTR?

The 0/1 flags can be anything, I think. The LSTR loader wouldn't load them.

dreamlover0 commented 3 years ago

Thank you very much for your reply.

Actually, I want to train LSTR on my dataset. As mentioned above, I have preprocessed my annotations into the same format as the TuSimple files. But I found there are two more annotation files in your codebase: segmentation labels and list6_train.txt.

Your two extra annotation files seem to contain annotations about which lane is which from left to right. But I do not have this annotation; I only know how many lanes there are in an image. If I just give the lanes indexes in range(0, max number of lanes in the image) from left to right, and then generate the segmentation label map, will everything run normally?

Furthermore, what's the difference between the annotations 0 1 1 1 1 0 0 and 0 0 1 1 1 1 0 in LSTR?

Hello, can I ask how you labeled your TuSimple-format dataset? What tools did you use? How did you generate the JSON file required by TuSimple, since it requires equally spaced annotation points? Thank you for your reply!

voldemortX commented 2 years ago

@August-en Hi! Was your LSTR training problem resolved? In fact, the current PytorchAutoDrive provides a config-based framework, and it should be much easier to integrate a new dataset. Btw, could you give some insight on dataset labeling for @dreamlover0? Thank you very much.

August-en commented 2 years ago

@August-en Hi! Was your LSTR training problem resolved? In fact, the current PytorchAutoDrive provides a config-based framework, and it should be much easier to integrate a new dataset. Btw, could you give some insight on dataset labeling for @dreamlover0? Thank you very much.

Thanks for your reply. In the end, I trained the LSTR algorithm with the code from the paper. But I'm now following your latest paper, BézierLaneNet.

August-en commented 2 years ago

Thank you very much for your reply. Actually, I want to train LSTR on my dataset. As mentioned above, I have preprocessed my annotations into the same format as the TuSimple files. But I found there are two more annotation files in your codebase: segmentation labels and list6_train.txt. Your two extra annotation files seem to contain annotations about which lane is which from left to right. But I do not have this annotation; I only know how many lanes there are in an image. If I just give the lanes indexes in range(0, max number of lanes in the image) from left to right, and then generate the segmentation label map, will everything run normally? Furthermore, what's the difference between the annotations 0 1 1 1 1 0 0 and 0 0 1 1 1 1 0 in LSTR?

Hello, can I ask how you labeled your TuSimple-format dataset? What tools did you use? How did you generate the JSON file required by TuSimple, since it requires equally spaced annotation points? Thank you for your reply!

Sorry for the late reply. I annotate each lane line as a polygon, the same as in instance segmentation. Then I fit each polygon to a curve using least-squares fitting. Finally, I sample points on the curve at equal y-axis intervals.
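The pipeline described above (polygon → least-squares curve fit → equal-y sampling) could be sketched roughly like this. The polygon vertices and TuSimple-style h_samples below are made-up examples, and a real pipeline would likely extract a centerline from the polygon before fitting:

```python
import numpy as np

def polygon_to_tusimple_points(polygon_xy, h_samples, degree=3):
    """Fit a least-squares polynomial x = f(y) to a lane polygon's vertices,
    then resample x at the equally spaced y rows TuSimple expects.
    Returns one x per h_sample, or -2 where the lane does not reach that row."""
    polygon_xy = np.asarray(polygon_xy, dtype=np.float64)
    xs, ys = polygon_xy[:, 0], polygon_xy[:, 1]
    coeffs = np.polyfit(ys, xs, degree)  # least-squares fit of x against y
    y_min, y_max = ys.min(), ys.max()
    out = []
    for y in h_samples:
        if y_min <= y <= y_max:
            out.append(float(np.polyval(coeffs, y)))
        else:
            out.append(-2)  # TuSimple convention: no lane point at this row
    return out

h_samples = list(range(160, 720, 10))  # typical TuSimple sample rows
polygon = [(300, 700), (320, 600), (350, 500), (390, 400), (440, 300)]
points = polygon_to_tusimple_points(polygon, h_samples)
```

Rows above the highest annotated vertex get -2, matching TuSimple's "lane absent at this row" marker.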

voldemortX commented 2 years ago

@August-en Thank you for sharing your labeling strategies!

voldemortX commented 2 years ago

Thank you very much for your reply. Actually, I want to train LSTR on my dataset. As mentioned above, I have preprocessed my annotations into the same format as the TuSimple files. But I found there are two more annotation files in your codebase: segmentation labels and list6_train.txt. Your two extra annotation files seem to contain annotations about which lane is which from left to right. But I do not have this annotation; I only know how many lanes there are in an image. If I just give the lanes indexes in range(0, max number of lanes in the image) from left to right, and then generate the segmentation label map, will everything run normally? Furthermore, what's the difference between the annotations 0 1 1 1 1 0 0 and 0 0 1 1 1 1 0 in LSTR?

Hello, can I ask how you labeled your TuSimple-format dataset? What tools did you use? How did you generate the JSON file required by TuSimple, since it requires equally spaced annotation points? Thank you for your reply!

Sorry for the late reply. I annotate each lane line as a polygon, the same as in instance segmentation. Then I fit each polygon to a curve using least-squares fitting. Finally, I sample points on the curve at equal y-axis intervals.

@dreamlover0 Here is the answer.

aspandey2021 commented 2 years ago

Hi @voldemortX, I am currently trying to train your BézierLaneNet model on my own dataset. I am initially using the TuSimple data format, and I have a two-part question for you, namely:

1) Could you please explain how the numbering of lane ids in e.g. 'list6_train.txt' is done? For my dataset there are supposedly at most 5 lanes (6 lane markings); if in the current frame I see only 3 lanes and my ego vehicle is in the leftmost lane, how should I number the ids?

Should it be like: 1 1 1 1 0 0 (3 lanes, ego vehicle in the leftmost lane) and 0 0 1 1 1 1 (3 lanes, ego in the rightmost lane)?

This TuSimple format seems a bit unclear to me compared to the CULane format, where it is clear that the ids are simply lane markings from left to right.

2) The CULane *.txt format for the above lane markings is easier to understand and use in training. But I am unclear about the license for the CULane config and dataset format. Can I use the CULane dataset format to train your BézierLaneNet model on a new dataset for commercial purposes? And also, do I need semantic labels besides Bézier labels for training if I use the CULane dataset format and config files?

voldemortX commented 2 years ago

@aspandey2021 For your first question: it is common practice to number your ego lanes as 1 and 2, the immediate left and right lanes as 3 and 4, and so on. If you are using a dataset with at most 4 lanes, 1 2 3 4 from left to right may also work. This is flexible, as long as your network can learn the distinctions.
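A toy sketch of that numbering convention (a hypothetical helper, not code from this repo; it locates lanes by their x position at the image bottom and assumes `ego_x` is the vehicle's center column):

```python
def number_lanes(bottom_xs, ego_x):
    """Assign ids following the convention above: ego lanes are 1 and 2,
    the next lanes outward are 3 and 4, and so on."""
    left = sorted((x for x in bottom_xs if x < ego_x), reverse=True)  # nearest first
    right = sorted(x for x in bottom_xs if x >= ego_x)                # nearest first
    ids = {}
    for i, x in enumerate(left):
        ids[x] = 1 + 2 * i  # 1, 3, 5, ... going outward to the left
    for i, x in enumerate(right):
        ids[x] = 2 + 2 * i  # 2, 4, 6, ... going outward to the right
    return ids

ids = number_lanes([100, 400, 800, 1100], ego_x=600)
# ego-left lane (x=400) -> 1, ego-right (x=800) -> 2, outer lanes -> 3 and 4
```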

For your second question: the CULane dataset has been out there for almost 5 years, so it should be okay to use it commercially; if you are unsure, you can check the official CULane repo and raise an issue there.

As for training BézierLaneNet, everything in this repo and related to the paper is considered a free open-source technique and can be used for commercial purposes. If you are training BézierLaneNet, you don't need the semantic classification of lanes into different classes. However, you will find that all the txts in this repo are labeled with lane existence for consistency. You can simply append all 0s or all 1s for your own dataset, or not read them at all if you implement a custom Dataset class for your dataset.

BézierLaneNet does, however, require a binary segmentation on CULane as auxiliary supervision. You can just assign all lane areas to the same class in the segmentation label. Without this auxiliary loss, performance can be notably worse on CULane, while on easier datasets such as TuSimple/LLAMAS you can remove the auxiliary loss without hurting performance much.
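For example, a list6-style line with dummy existence flags could be generated like this (a hypothetical helper; the path is an illustrative TuSimple-style clip name, and the flags are all 1 only because, as noted above, BézierLaneNet never reads them):

```python
def make_list6_line(image_path, num_flags=6, flag=1):
    """Build one txt line: image path followed by dummy lane-existence flags."""
    return image_path + " " + " ".join(str(flag) for _ in range(num_flags))

line = make_list6_line("clips/0601/1494452577507766671/20.jpg")
# "clips/0601/1494452577507766671/20.jpg 1 1 1 1 1 1"
```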

aspandey2021 commented 2 years ago

Thank you so much for your prompt reply @voldemortX! Regarding my plan to use a custom dataset for training, and regarding your answers to my questions, I would like to summarize a few points:

  1. I understood the numbering strategy for the lane ids, as you mentioned it is flexible. Thank you for that.

  2. You mentioned that one doesn't need semantic classification for lane existence in the txt file for BézierLaneNet; however, I got a file-not-found error when I didn't include the 1/0 values in the txts. So I think I would have to use these values anyhow (or is there a way in the code to completely ignore list6.txt for training?). My concern is to also label and recognize highway exits and ramps, and my dataset will have a variable number of lanes (going up to 7-8 sometimes). So do you think that using a variable number of '1's in the txt file, one for each line present in a ground-truth frame, would be a good strategy to train the model?

  3. Regarding your point that BézierLaneNet requires binary auxiliary segmentation only for CULane: I tried without it for TuSimple (without the segGT6 folder in ROOT) and it gave me an error (that masks could not be loaded). So I think this is also needed for a new custom dataset? (Please correct me if I am wrong.)

  4. Finally, for inference I tried input images in 2 formats: format 1 (TuSimple, 1280x720) and format 2 (1920x1440). The predicted lane lines match quite well for the TuSimple resolution, but for the higher resolution they appear very small in the image (due to the higher resolution, I guess). Do you think this will improve when I train the model on format 2, since I want the model to be able to predict on different incoming resolutions in the end? Or is there a way to control this through the config file?

I am sorry if I asked too much; thank you for your patience in advance :)!

voldemortX commented 2 years ago

@aspandey2021 Thanks for your interest in our work.

  2. You mentioned that one doesn't need semantic classification for lane existence in the txt file for BézierLaneNet; however, I got a file-not-found error when I didn't include the 1/0 values in the txts. So I think I would have to use these values anyhow (or is there a way in the code to completely ignore list6.txt for training?).

The dataset loading scheme is unfortunately hard-coded. You will need to adjust that yourself. The easiest way, without touching too much code, would be to change train.txt. For instance, we split off the image file name by space, so you could add a space to every line, or modify the code: https://github.com/voldemortX/pytorch-auto-drive/blob/9c62c0a721d11c8ea6bf312ecf1c7b238a54dcda/utils/datasets/lane_as_bezier.py#L101

Frankly speaking, for non-segmentation methods you only need the filenames in train.txt; the 0 0 0 1 1 1 flags can be removed.
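In other words, for non-segmentation methods a loader only needs something like this when reading train.txt (a sketch with made-up paths, not the repo's actual parsing code):

```python
def read_filenames(lines):
    """Keep only the image path (the token before the first space);
    trailing existence flags, if present, are ignored."""
    return [line.strip().split(' ')[0] for line in lines if line.strip()]

lines = [
    "clips/0313-1/6040/20.jpg 1 1 1 1 0 0",
    "clips/0313-2/150/20.jpg",  # flags removed: still parses fine
]
filenames = read_filenames(lines)
```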

My concern is to also label and recognize highway exits and ramps, and my dataset will have a variable number of lanes (going up to 7-8 sometimes). So do you think that using a variable number of '1's in the txt file, one for each line present in a ground-truth frame, would be a good strategy to train the model?

If you are using BézierLaneNet, you only have to adjust max_lane in the configs. It treats all lanes as the same class. To improve performance you might want to add a classification of lane type, but that will require some effort & coding.

  3. Regarding your point that BézierLaneNet requires binary auxiliary segmentation only for CULane: I tried without it for TuSimple (without the segGT6 folder in ROOT) and it gave me an error (that masks could not be loaded). So I think this is also needed for a new custom dataset? (Please correct me if I am wrong.)

Yes, you will need a new Dataset class coded for this, one that does not load seg labels. A new loss class that does not compute the seg loss will also be needed. And you might also want to remove the seg branch in the model config to save some training time. A simpler hack, though, would be to provide all-0 seg maps and set the seg loss weight to 0.
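The all-0 seg map hack could look roughly like this (a sketch; the 720x1280 size is an assumption and should match your own label resolution):

```python
import numpy as np

def make_blank_seg_map(height=720, width=1280):
    """An all-background segmentation mask, to be saved once per training
    image so the existing loader finds a file; combined with a seg loss
    weight of 0, the auxiliary branch then has no effect on training."""
    return np.zeros((height, width), dtype=np.uint8)

mask = make_blank_seg_map()
```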

  4. Finally, for inference I tried input images in 2 formats: format 1 (TuSimple, 1280x720) and format 2 (1920x1440). The predicted lane lines match quite well for the TuSimple resolution, but for the higher resolution they appear very small in the image (due to the higher resolution, I guess). Do you think this will improve when I train the model on format 2, since I want the model to be able to predict on different incoming resolutions in the end? Or is there a way to control this through the config file?

The TuSimple models in this repo are all trained at 640x360, will run inference at 640x360, and then rescale the results to your desired resolution. There is multi-scale augmentation in training for BézierLaneNet, so you don't really need to worry about this: https://github.com/voldemortX/pytorch-auto-drive/blob/9c62c0a721d11c8ea6bf312ecf1c7b238a54dcda/configs/lane_detection/common/datasets/train_level1b_360.py#L8

The reason for a smaller-looking result is probably that in your config the desired resolution is set to 1280x720 here: https://github.com/voldemortX/pytorch-auto-drive/blob/9c62c0a721d11c8ea6bf312ecf1c7b238a54dcda/configs/lane_detection/bezierlanenet/resnet18_tusimple_aug1b.py#L51

If the resolution is correct (set via the above, or via --height/--width), then it could be a distribution shift, for instance lanes taking up different portions of the image. In that case you might want to crop the image, so the vanishing point's coordinate appears at a similar proportion in the train & test images.
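A rough sketch of such a crop (a hypothetical helper; `vp_y` is the vanishing point row in your images and `target_vp_ratio` is the proportion observed in the training data, both values you would measure yourself):

```python
import numpy as np

def crop_top_to_match_vanishing_point(img, vp_y, target_vp_ratio):
    """Cut t rows off the top so that (vp_y - t) / (h - t) == target_vp_ratio,
    placing the vanishing point at the same relative height as in training."""
    h = img.shape[0]
    # solve (vp_y - t) / (h - t) = r for the top offset t
    t = int(round((vp_y - target_vp_ratio * h) / (1.0 - target_vp_ratio)))
    t = max(t, 0)  # never pad, only crop
    return img[t:]

frame = np.zeros((1440, 1920, 3), dtype=np.uint8)
cropped = crop_top_to_match_vanishing_point(frame, vp_y=600, target_vp_ratio=0.3)
# 240 rows are removed: (600 - 240) / (1440 - 240) = 0.3
```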