alexlee-gk / video_prediction

Stochastic Adversarial Video Prediction
https://alexlee-gk.github.io/video_prediction/
MIT License

Cannot restore model when attempting to run inference on different image sizes #25

Open elmanvon opened 5 years ago

elmanvon commented 5 years ago

Hi Alex, thank you so much for the code release.

I am trying to run inference at a larger image size, e.g. 128x128, while the model was trained on images of size 64x64. But the model cannot be restored.

I have modified scripts/generate.py with one more args flag, infer_read_pics, which changes the placeholder shape to 128x128 for inference. The data fed into the placeholder is also resized to 128x128:

... omitted code ...

    if args.infer_read_pics:
        inputs = None
        input_phs = {'images': tf.placeholder(dtype=tf.float32, shape=[1, model.hparams.sequence_length, IMG_H, IMG_W, 1], name='images_ph')}
    else:
        inputs = dataset.make_batch(args.batch_size)
        input_phs = {k: tf.placeholder(v.dtype, v.shape, '%s_ph' % k) for k, v in inputs.items()}
    with tf.variable_scope(''):
        model.build_graph(input_phs)

... omitted code ...

    while True:
        if args.num_samples and sample_ind >= args.num_samples:
            break

        try:
            if sample_ind > 0:
                break

            if args.infer_read_pics:
                glob_pattern = '/home/von/repo/video_prediction/data/kth/infer_dir' + '/context_image_*.png'
                img_paths = glob.glob(glob_pattern, recursive=True)
                ipaths = sorted(img_paths)
                imgs = skimage.io.imread_collection(ipaths)
                # cv2.resize expects dsize as (width, height); it also drops a
                # trailing singleton channel axis, so restore it after resizing.
                imgs = [cv2.resize(img[..., 0], dsize=(IMG_W, IMG_H), interpolation=cv2.INTER_CUBIC)[..., None] for img in imgs]
                imgs = np.expand_dims(np.stack(imgs), axis=0)  # (1, sequence_length, IMG_H, IMG_W, 1)
                od = OrderedDict()
                od['images'] = imgs / 255.0
                input_results = od
            else:
                input_results = sess.run(inputs)
        except tf.errors.OutOfRangeError:
            break

I have also commented out the code below in video_prediction/models/savp_model.py, to make sure the model architecture is the same as the model trained on 64x64 images:

        elif scale_size >= 128:
            self.encoder_layer_specs = [
                (self.hparams.ngf, False),
                (self.hparams.ngf * 2, True),
                (self.hparams.ngf * 4, True),
                (self.hparams.ngf * 8, True),
            ]
            self.decoder_layer_specs = [
                (self.hparams.ngf * 8, True),
                (self.hparams.ngf * 4, True),
                (self.hparams.ngf * 2, False),
                (self.hparams.ngf, False),
            ]

But the model cannot be restored due to the error below:

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [32768,100] rhs shape= [8192,100]
         [[node save/Assign_15 (defined at /home/von/repo/video_prediction/video_prediction/utils/tf_utils.py:542)  = Assign[T=DT_FLOAT, _grappler_relax_allocator_constraints=true, use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](generator/rnn/savp_cell/cdna_kernels/dense/kernel, save/RestoreV2/_15)]]

Based on the error, I have traced it to this line in video_prediction/ops.py: kernel_shape = [input_shape[1], units].
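For what it's worth, the reported shapes are consistent with that dense layer receiving a flattened spatial feature map: 32768 / 8192 = 4 = (128/64)², so doubling each image dimension quadruples the flattened input, and the checkpoint's [8192, 100] kernel can no longer fit the [32768, 100] variable in the new graph. The sketch below illustrates this; the downsample factor of 8 and channel count of 128 are assumptions chosen to reproduce the reported numbers, not values read from the actual code:

```python
# Hypothetical illustration (not the project's actual code): if the CDNA
# kernel branch flattens the spatial feature map before a dense layer, the
# dense kernel's first dimension grows with the input resolution, so the
# checkpoint variable and the graph variable no longer match.
def dense_kernel_shape(img_size, downsample_factor=8, channels=128, units=100):
    # Spatial extent after the encoder's assumed stride-2 downsampling.
    h = w = img_size // downsample_factor
    flat = h * w * channels          # flattened feature size feeding the dense layer
    return [flat, units]             # analogous to kernel_shape = [input_shape[1], units]

print(dense_kernel_shape(64))    # [8192, 100]  -- matches the checkpoint (rhs in the error)
print(dense_kernel_shape(128))   # [32768, 100] -- matches the new graph (lhs in the error)
```

If this is the cause, commenting out the scale_size >= 128 layer specs cannot help, since the mismatch comes from the flatten-then-dense step rather than from the convolutional encoder/decoder, whose weights are resolution-independent.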

Do you have any suggestions about how to make it work for running inference on arbitrary image sizes?

Thanks for reading my question!

malalejandra commented 5 years ago

Hi @elmanvon , did you manage to solve it?