Open sam598 opened 5 years ago
Thanks @sam598 -- interesting use case, I've never tried to do that many images at once before! Have you tried linearly interpolating between some of your images to cut down on the number of frames, or encoding a keyframe first and then copying that dlatent to use as an initial value for the rest of the frames?
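For reference, the interpolation idea could look roughly like the sketch below. It assumes each encoded keyframe is a NumPy dlatent array of shape (18, 512), the usual shape for a 1024x1024 StyleGAN generator. The function name `interpolate_dlatents` and the stand-in arrays are hypothetical and not part of the encoder.

```python
import numpy as np

def interpolate_dlatents(key_a, key_b, num_inbetween):
    """Return `num_inbetween` dlatents linearly spaced strictly between two keyframes."""
    # Drop t=0 and t=1: those are the encoded keyframes themselves.
    ts = np.linspace(0.0, 1.0, num_inbetween + 2)[1:-1]
    return [(1.0 - t) * key_a + t * key_b for t in ts]

# Hypothetical stand-ins for two encoded keyframes.
key_a = np.zeros((18, 512))
key_b = np.ones((18, 512))
frames = interpolate_dlatents(key_a, key_b, num_inbetween=3)
```

Only the keyframes need to go through the encoder in this scheme; the in-between frames are generated directly from the blended dlatents.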
Oh, also, the above commit should fix your issue!
Thanks for adding that so fast @pbaylies!
It seems that there is still a memory leak somewhere: 100 iterations go from 24s to 90s, and it crashes around the 70th image. Is there anything else that could use placeholders? Or anything else that could leak? Performance is definitely better though.
I have tried several experiments with encoding initial keyframes, as well as keeping the previous frame as the initial value for the next one. Something very interesting happens where facial features begin to be "burned in". Even if the first frame has 500 iterations and every subsequent frame is only 50, the head pose and facial structure begin to get stuck, and higher level features like reflections and hair begin to "seep" down to lower layers and affect the structure.
The best results I have gotten so far have been from encoding each frame from scratch, and then temporally smoothing them. I really need to do a writeup of these tests.
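The smoothing method is not specified above. As one possible reading, a centered moving average over the per-frame dlatents might look like the sketch below. The function name `smooth_dlatents`, the `radius` parameter, and the (frames, 18, 512) layout are assumptions made for illustration.

```python
import numpy as np

def smooth_dlatents(dlatents, radius=2):
    """Centered moving average over the frame axis of a (frames, 18, 512) array."""
    dlatents = np.asarray(dlatents, dtype=np.float64)
    num_frames = dlatents.shape[0]
    smoothed = np.empty_like(dlatents)
    for i in range(num_frames):
        # Clamp the window at the ends, so the first and last frames
        # are averaged over fewer neighbours.
        lo = max(0, i - radius)
        hi = min(num_frames, i + radius + 1)
        smoothed[i] = dlatents[lo:hi].mean(axis=0)
    return smoothed
```

Because every frame is still encoded from scratch, nothing is carried over from one optimization to the next; the smoothing only acts on the finished dlatents.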
I've added enough features that I'm sure there are some leaks; the whole thing is due for a more careful rewrite at this point. There are also a bunch of parameters you can tweak, and the different parts of the loss function can be tuned or turned off. I will see what I can do; and, patches welcome, of course!
The same problem also happened in my case when trying to encode a large number of images at 1024x1024 resolution. It even runs out of memory on a 16 GB Tesla V100 GPU with batch size 1. I tried this (https://github.com/Puzer/stylegan-encoder/pull/4), but the problem remains.
@SystemErrorWang yes, this still needs improvement; currently feel free to use larger batch sizes if you like, but see if you can work around this by running the tool itself on smaller batches of images at a time.
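A minimal sketch of that workaround: split the frame list into groups small enough to finish before the slowdown sets in, and run the tool once per group. The helper name `chunked`, the file names, and the group size of 50 are illustrative assumptions. How each group is staged for a separate run of encode_images.py is left open here.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` containing at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical frame list; 50 per group stays below the crash points reported above.
frame_paths = ["frame_%04d.png" % i for i in range(120)]
groups = list(chunked(frame_paths, 50))
```

Each run starts a fresh process, so whatever accumulated during the previous group is released when that process exits.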
Hello, is there any way to fix this bug?
@minha12 @pbaylies
The original function, optimize, in perceptual_model.py should be split into two steps, init_optimizer and run_optimizer. That way just one optimizer graph is initialized for all images.
    def init_optimizer(self, vars_to_optimize, iterations=200):
        # Build the optimizer ops once; they are then reused for every image.
        self.vars_to_optimize = vars_to_optimize if isinstance(vars_to_optimize, list) else [vars_to_optimize]
        if self.use_optimizer == 'lbfgs':
            self.optimizer = tf.contrib.opt.ScipyOptimizerInterface(
                self.loss, var_list=self.vars_to_optimize, method='L-BFGS-B', options={'maxiter': iterations})
        else:
            if self.use_optimizer == 'ggt':
                self.optimizer = tf.contrib.opt.GGTOptimizer(learning_rate=self.learning_rate)
            else:
                self.optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate)
            # self.vars_to_optimize is already a list, so pass it directly.
            min_op = self.optimizer.minimize(self.loss, var_list=self.vars_to_optimize)
            self.sess.run(tf.variables_initializer(self.optimizer.variables()))
            self.fetch_ops = [min_op, self.loss, self.learning_rate]
        self.sess.run(self._reset_global_step)

    def run_optimizer(self, iterations=200):
        # Run the ops that init_optimizer built.
        for _ in range(iterations):
            if self.use_optimizer == 'lbfgs':
                self.optimizer.minimize(self.sess, fetches=[self.vars_to_optimize, self.loss])
                yield {"loss": self.loss.eval()}
            else:
                _, loss, lr = self.sess.run(self.fetch_ops)
                yield {"loss": loss, "lr": lr}
Hello, is there a way to fix this bug?
When encoding a large number of images, the encoding time slowly increases until it reaches 2x-3x the time it took to encode the first image, and then the script encode_images.py crashes. On my system it always crashes on the 56th image.
The culprit appears to be these lines in perceptual_model.py
    self.sess.run(tf.assign(self.features_weight, weight_mask))
    self.sess.run(tf.assign(self.ref_img_features, image_features))
    self.sess.run(tf.assign(self.ref_weight, image_mask))
    self.sess.run(tf.assign(self.ref_img, loaded_image))
I posted a pull request on Puzer's original stylegan-encoder (https://github.com/Puzer/stylegan-encoder/pull/4), but I'm not familiar enough with your changes to know how to fix it. There is more information here: https://github.com/Puzer/stylegan-encoder/issues/3
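To illustrate why those tf.assign calls would slow things down: each call made inside the per-image loop adds new ops to the TensorFlow graph, while a placeholder plus a single assign op built up front is reused. The sketch below is a toy reproduction rather than the encoder's code. It assumes TensorFlow is installed and uses the tensorflow.compat.v1 API; the function name `count_ops_after_assigns` and the 4x4 variable are hypothetical.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF1-style graph API

tf.disable_eager_execution()

def count_ops_after_assigns(use_placeholder, num_images=5):
    """Assign `num_images` values to one variable; return (graph op count, final value)."""
    graph = tf.Graph()
    with graph.as_default():
        ref_img = tf.Variable(np.zeros((4, 4), dtype=np.float32), name="ref_img")
        if use_placeholder:
            # Build the placeholder and the assign op once, before the loop.
            img_in = tf.placeholder(tf.float32, shape=(4, 4), name="img_in")
            assign_op = tf.assign(ref_img, img_in)
        with tf.Session(graph=graph) as sess:
            sess.run(tf.variables_initializer([ref_img]))
            for i in range(num_images):
                image = np.full((4, 4), float(i), dtype=np.float32)
                if use_placeholder:
                    # Reuses the same op, so the graph does not grow.
                    sess.run(assign_op, feed_dict={img_in: image})
                else:
                    # Adds a new constant and a new assign op on every call.
                    sess.run(tf.assign(ref_img, image))
            final_value = sess.run(ref_img)
        return len(graph.get_operations()), final_value

leaky_ops, _ = count_ops_after_assigns(use_placeholder=False, num_images=10)
fixed_ops, _ = count_ops_after_assigns(use_placeholder=True, num_images=10)
print("ops without placeholder:", leaky_ops, "with placeholder:", fixed_ops)
```

Calling `graph.finalize()` once the graph is built makes any later attempt to add an op raise an error, which is a quick way to find other spots that grow the graph per image.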
The changes you have made and collected are a fantastic step forward and actually make frame to frame stylegan animations possible. A fix for this bug would go a long way to helping encode image sequences.