sampepose / flownet2-tf

FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
MIT License

Problem with data augmentation #14

Open junfanlin opened 6 years ago

junfanlin commented 6 years ago

Hi, Sampepose, Thanks for your great job. While I running the code, I found that it's very difficult for the model with augmentation process to converge. The training loss and test loss is very large(about 50). And if I block out the preprocess code, the model converge very fast, the training loss is about 2. Then I realize that if I set the 'scale' option in dataset_config to be True, the tensorboard can show correct image, while if I set it to be False, the tensorboard failed to show correct image. image

What's more, if I only apply the 'translate', 'rotate' and 'zoom' operations, then no matter how the 'scale' option is set, TensorBoard shows the correct image.

So I'm wondering whether there is anything about the augmentation process I should pay attention to. Also, would you mind posting your convergence curve here, so I can make sure my training matches yours?

My system is Ubuntu 14.04, my GPU is a Titan Xp, and I compiled your code with GPU compute capability sm_61.

Looking forward to your reply. Thanks in advance!

13331151 commented 6 years ago

Also, would you mind posting your data-generation code? I found that with your provided tfrecord ('sample'), TensorBoard shows a normal image, but with my own generated tfrecord it does not.

Below is my code:

    import cv2
    import numpy as np
    import tensorflow as tf

    # Write (image_a, image_b, flow) triplets into a ZLIB-compressed tfrecord.
    compression = tf.python_io.TFRecordCompressionType.ZLIB
    writer = tf.python_io.TFRecordWriter(filename, options=tf.python_io.TFRecordOptions(compression))

    for index in range(len(image1s)):
        if index % 1000 == 0:
            print(index)
        image1 = cv2.imread(image1s[index])
        image2 = cv2.imread(image2s[index])
        _, label = parsingFlo(labels[index])  # my own .flo parser
        # Images are stored as raw float64 bytes, flow as raw float32 bytes.
        image_a = np.float64(image1).tobytes()
        image_b = np.float64(image2).tobytes()
        flow = np.float32(label).tobytes()
        example = tf.train.Example(features=tf.train.Features(feature={
            'image_a': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_a])),
            'image_b': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_b])),
            'flow': tf.train.Feature(bytes_list=tf.train.BytesList(value=[flow])),
        }))
        writer.write(example.SerializeToString())
    writer.close()
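
In case it helps with debugging, here is a minimal read-back sketch I put together (my own addition, assuming the same ZLIB compression and the float64 image / float32 flow dtypes above) to check that the stored bytes decode to the expected element counts:

    import numpy as np
    import tensorflow as tf

    # Read back the first example and confirm the byte buffers decode to the
    # expected sizes (H*W*3 float64 values per image, H*W*2 float32 for flow).
    options = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.ZLIB)
    for record in tf.python_io.tf_record_iterator(filename, options=options):
        example = tf.train.Example()
        example.ParseFromString(record)
        feats = example.features.feature
        image_a = np.frombuffer(feats['image_a'].bytes_list.value[0], dtype=np.float64)
        flow = np.frombuffer(feats['flow'].bytes_list.value[0], dtype=np.float32)
        print(image_a.size, flow.size)
        break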

@sampepose THX!

yinjunbo commented 6 years ago

@junfanlin Hello! I met the same problem as you. It seems that with the augmentation process the model can only learn the background information. The loss of flownet_s is about 50. May I ask how you solved this problem? Thank you~

yeshenlin commented 6 years ago

@junfanlin Hi, I meet the same problem; the loss is about 40. Have you solved it? Thank you.

junfanlin commented 6 years ago

@yeshenlin Hi, I didn't solve it. If you are familiar with PyTorch, I suggest you take a look at NVIDIA's implementation and refer to its data augmentation part. Best regards.

dokhanh commented 5 years ago

Hello,

The "Scale" option in dataset_configs needs to be True if in the preprocessing phase you need to scale from pixel values (an integer between 0 and 255) to [0, 1] (so basically image = image/255.0). This operation has been done in the script that converts and puts data images to tfrecord files (see scripts/convert_fc_to_tfrecords.py). But if you haven't done this way, you need to put scale = True to do it while training.