TuSimple / TuSimple-DUC

Understanding Convolution for Semantic Segmentation
https://arxiv.org/abs/1702.08502
Apache License 2.0

Training data list for Cityscapes #6

Closed chienyiwang closed 6 years ago

chienyiwang commented 6 years ago

Hi,

Thanks for making the training code publicly available. I would like to try training the Cityscapes model from scratch with your code. I saw in the paper that you use a data augmentation trick to enlarge the number of training images. Could you provide the augmented data you used, along with some instructions for generating the data list? Thank you very much!

GrassSunFlower commented 6 years ago

see #2.

chienyiwang commented 6 years ago

Hi @GrassSunFlower, thanks for the reply. However, the list only provides the image/annotation filenames, one pair per line. Would you be able to provide a download link for the actual augmented training data, for easier training? Thank you!

GrassSunFlower commented 6 years ago

Maybe you should read the code more carefully: we assign different anchors in data_prep/get_cityscapes_list.py:

for i in range(1, 8):
    train_lst.append([str(index), p, l, "512", str(256 * i)])

which works together with `get_single_image_duc` in TuSimple-DUC/tusimple_duc/core/utils.py:

def get_single_image_duc(item, input_args):

That's exactly how we do data augmentation.
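A minimal sketch of the anchor generation described above (the function name, file names, and list layout here are illustrative assumptions, not the repo's exact code):

```python
# Sketch of the 7-fold augmentation list generation: for each image,
# emit seven crop anchors with x fixed at 512 and y = 256, 512, ..., 1792.
# Filenames below are placeholders, not actual Cityscapes paths.
def build_train_list(pairs):
    """pairs: list of (image_path, label_path) tuples."""
    train_lst = []
    index = 0
    for p, l in pairs:
        for i in range(1, 8):  # seven anchors per image
            train_lst.append([str(index), p, l, "512", str(256 * i)])
            index += 1
    return train_lst

lst = build_train_list([("img_0.png", "lbl_0.png")])
print(len(lst))  # 7 entries per image
```

The cropping itself then happens at load time, using the anchor stored in each list entry, rather than by materializing patch files on disk.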

chienyiwang commented 6 years ago

Hi @GrassSunFlower, I see. Sorry, I presumed you were extracting the 35700 images into a separate folder, following the usual segmentation data pipeline. Thanks for the explanation!

chienyiwang commented 6 years ago

Hi @GrassSunFlower, could I ask a couple of questions about the gen_cityscapes_list.py code?

  1. What is the purpose of the following code segment?

    if index % sample_rate != 2:
        continue

  2. Why is the same line printed twice (lines 28-30)?

    for line in train_lst:
        print >> train_out, '\t'.join(line)
        print >> train_out, '\t'.join(line)

Thank you very much!

GrassSunFlower commented 6 years ago

Those two statements are indeed bugs. Thanks for your patience; I'll fix them shortly.
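For reference, a corrected version of that output loop might look like the following (a sketch in Python 3 syntax; the original file uses Python 2's `print >>` redirection, and the sample data here is made up):

```python
# Corrected output loop: each list entry is written exactly once.
# The entry format mirrors the list built in get_cityscapes_list.py;
# the filenames are placeholders for illustration.
train_lst = [["0", "img.png", "lbl.png", "512", "256"]]

with open("train.lst", "w") as train_out:
    for line in train_lst:
        print("\t".join(line), file=train_out)  # written once, not twice
```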

GrassSunFlower commented 6 years ago

@chienyiwang fixed by #7

shipengai commented 6 years ago

Hi @chienyiwang, have you trained the model on Cityscapes? What performance does your trained model get on the val dataset?

shipengai commented 6 years ago

Hi @GrassSunFlower @wpqmanu, is there an error in data_prep/get_cityscapes_list.py? The train list has only 20825 lines, not 35700, because range(1, 8) = [1, 2, 3, 4, 5, 6, 7].

GrassSunFlower commented 6 years ago

Nope. The 35700 comes from our paper, which says:

> Since the image size in the Cityscapes dataset is 1024 × 2048, which is too big to fit in the GPU memory, we partition each image into twelve 800 × 800 patches with partial overlapping, thus augmenting the training set to have 35700 images.

which is in the 'baseline model' section.

And the augmentation we actually use is described in 'Bigger Patch Size', which says:

> Since the patch size exceeds the maximum dimension (800 × 800) in the previous 12-fold data augmentation framework, we adopt a new 7-fold data augmentation strategy: seven center locations with x = 512, y = {256, 512, ..., 1792} are set in the original image;

which gives 7 × 2975. Hope this helps you out. @shipeng-uestc

shipengai commented 6 years ago

@GrassSunFlower Thanks for your reply, I figured it out. I have another question: can I directly use init.param to train on the Cityscapes fine-annotation dataset without changing hyperparameters, i.e., batch size, learning rate, and so on?

GrassSunFlower commented 6 years ago

Sure you can.