pbaylies / stylegan-encoder

StyleGAN Encoder - converts real images to latent space

Runs out of memory and crashes when encoding an image sequence. #6

Open sam598 opened 5 years ago

sam598 commented 5 years ago

When encoding a large number of images, the encoding time slowly increases until it is 2x-3x the time it took to encode the first image, and then encode_images.py crashes. On my system it always crashes on the 56th image.

The culprit appears to be these lines in perceptual_model.py

    self.sess.run(tf.assign(self.features_weight, weight_mask))
    self.sess.run(tf.assign(self.ref_img_features, image_features))
    self.sess.run(tf.assign(self.ref_weight, image_mask))
    self.sess.run(tf.assign(self.ref_img, loaded_image))
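
Each of those tf.assign(...) calls builds a new op in the default graph every time an image is processed, so the graph keeps growing and session.run keeps getting slower until memory runs out. The usual fix is to create the assign ops once with placeholders and only feed data at run time; a rough sketch of that pattern (attribute names here are just for illustration, not the repo's actual code):

    import tensorflow as tf

    class ReferenceSetter:
        def __init__(self, sess, ref_img_features):
            self.sess = sess
            # placeholder + assign op created a single time, at graph-construction time
            self.features_ph = tf.placeholder(tf.float32, ref_img_features.shape)
            self.assign_features = tf.assign(ref_img_features, self.features_ph)

        def set_reference(self, image_features):
            # only data is fed here, so the graph never grows
            self.sess.run(self.assign_features, {self.features_ph: image_features})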

I posted a pull request on Puzer's original stylegan-encoder: https://github.com/Puzer/stylegan-encoder/pull/4

but I'm not familiar enough with your changes to know how to fix it. There is more information here: https://github.com/Puzer/stylegan-encoder/issues/3

The changes you have made and collected are a fantastic step forward and actually make frame to frame stylegan animations possible. A fix for this bug would go a long way to helping encode image sequences.

pbaylies commented 5 years ago

Thanks @sam598 -- interesting use case, I've never tried to do that many images at once before! Have you tried linearly interpolating between some of your images to cut down on the number of frames, or encoding a keyframe first and then copying that dlatent to use as an initial value for the rest of the frames?
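
For the interpolation idea, a rough numpy sketch, assuming the encoder has already saved the keyframe dlatents as .npy files (paths and shapes here are just placeholders):

    import numpy as np

    # dlatents for two encoded keyframes, e.g. (18, 512) arrays saved by the encoder
    key_a = np.load('latent_representations/frame_0000.npy')
    key_b = np.load('latent_representations/frame_0010.npy')

    # linearly interpolate the in-between frames instead of encoding every one of them
    for i, t in enumerate(np.linspace(0.0, 1.0, num=11)):
        np.save('latent_representations/interp_%04d.npy' % i, (1.0 - t) * key_a + t * key_b)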

Oh, also, the above commit should fix your issue!

sam598 commented 5 years ago

Thanks for adding that so fast @pbaylies!

It seems there is still a memory leak somewhere: 100 iterations go from 24s to 90s, and it crashes around the 70th image. Is there anything else that could use placeholders, or anything else that could leak? Performance is definitely better, though.

I have tried several experiments with encoding initial keyframes, as well as keeping the previous frame as the initial value for the next one. Something very interesting happens where facial features begin to be "burned in". Even if the first frame has 500 iterations and every subsequent frame is only 50, the head pose and facial structure begin to get stuck, and higher level features like reflections and hair begin to "seep" down to lower layers and affect the structure.

The best results I have gotten so far have been from encoding each frame from scratch, and then temporally smoothing them. I really need to do a writeup of these tests.
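
(By "temporally smoothing" I mean just a moving average over the per-frame dlatents; a rough sketch, with the window size and file layout as placeholders:)

    import numpy as np

    # load the independently encoded dlatents in frame order, shape (n_frames, 18, 512)
    stack = np.stack([np.load('latent_representations/frame_%04d.npy' % i) for i in range(100)])

    window = 5  # smoothing window in frames, chosen by eye
    smoothed = np.copy(stack)
    for i in range(len(stack)):
        lo, hi = max(0, i - window // 2), min(len(stack), i + window // 2 + 1)
        smoothed[i] = stack[lo:hi].mean(axis=0)

    for i, dlatent in enumerate(smoothed):
        np.save('latent_representations/smoothed_%04d.npy' % i, dlatent)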

pbaylies commented 5 years ago

I've added enough features that I'm sure there are some leaks; the whole thing is due for a more careful rewrite at this point. There are also a bunch of parameters you can tweak, and the different parts of the loss function can be tuned or turned off. I'll see what I can do; and patches are welcome, of course!

SystemErrorWang commented 5 years ago

The same problem happens in my case when encoding a large number of images at 1024x1024 resolution. It even runs out of memory on a 16 GB Tesla V100 GPU with batch size 1. I tried this (https://github.com/Puzer/stylegan-encoder/pull/4) but the problem remains.

pbaylies commented 5 years ago

@SystemErrorWang yes, this still needs improvement. Feel free to use larger batch sizes if you like, but see if you can work around this by running the tool itself on smaller batches of images at a time.
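
One way to do that, as a rough sketch: split the aligned images into chunks and call encode_images.py once per chunk in a fresh process, so TensorFlow memory is released between chunks (the directory names and positional arguments here are assumptions; check encode_images.py --help for the real interface):

    import os, shutil, subprocess

    src = 'aligned_images'
    files = sorted(os.listdir(src))
    chunk_size = 20  # small enough that a single run never leaks too far

    for n, start in enumerate(range(0, len(files), chunk_size)):
        chunk_dir = 'chunks/chunk_%03d' % n
        os.makedirs(chunk_dir, exist_ok=True)
        for f in files[start:start + chunk_size]:
            shutil.copy(os.path.join(src, f), chunk_dir)
        # each chunk runs in its own process, so graph/session memory is freed afterwards
        subprocess.run(['python', 'encode_images.py', chunk_dir,
                        'generated_images', 'latent_representations'], check=True)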

minha12 commented 4 years ago

Hello, is there any way to fix this bug?

ChengBinJin commented 4 years ago

@minha12 @pbaylies The original optimize function in perceptual_model.py should be split into two steps, init_optimizer and run_optimizer, so that only one optimizer graph is built and reused for all images:

    def init_optimizer(self, vars_to_optimize, iterations=200):
        # Called once: builds the optimizer graph a single time instead of per image.
        self.vars_to_optimize = vars_to_optimize if isinstance(vars_to_optimize, list) else [vars_to_optimize]

        if self.use_optimizer == 'lbfgs':
            self.optimizer = tf.contrib.opt.ScipyOptimizerInterface(
                self.loss, var_list=self.vars_to_optimize, method='L-BFGS-B', options={'maxiter': iterations})
        else:
            if self.use_optimizer == 'ggt':
                self.optimizer = tf.contrib.opt.GGTOptimizer(learning_rate=self.learning_rate)
            else:
                self.optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate)
            min_op = self.optimizer.minimize(self.loss, var_list=self.vars_to_optimize)
            self.sess.run(tf.variables_initializer(self.optimizer.variables()))
            self.fetch_ops = [min_op, self.loss, self.learning_rate]

        # _reset_global_step is an op defined elsewhere in the class that zeroes the step counter.
        self.sess.run(self._reset_global_step)

    def run_optimizer(self, iterations=200):
        # Called per image: only runs the already-built ops, so the graph no longer grows.
        for _ in range(iterations):
            if self.use_optimizer == 'lbfgs':
                self.optimizer.minimize(self.sess, fetches=[self.vars_to_optimize, self.loss])
                yield {"loss": self.sess.run(self.loss)}
            else:
                _, loss, lr = self.sess.run(self.fetch_ops)
                yield {"loss": loss, "lr": lr}

Akila-Ayanthi commented 3 years ago

Hello, is there a way to fix this bug?