jcjohnson / fast-neural-style

Feedforward style transfer

Would it be feasible to implement the -seed arg in fast_neural_style? #84

Open · tkoham opened 7 years ago

tkoham commented 7 years ago

I'm working on video processing with fast_neural_style, and I've explored various methods of reducing noise and variance from frame to frame, with mixed results. The biggest problem with using an external solution is that it slows batch rendering to a crawl, since the process has to be restarted for each frame. The -seed argument used in slow_neural_style seems like a good solution, but before I try to implement that function in the fast version, I'd like to know whether it's a wild goose chase, as I only have a basic working knowledge of Lua.

PS: It also occurs to me that I could write a quick-and-dirty script using Pillow, G'MIC, or similar that would suspend the process after each frame, blend the output, and resume, but fast_neural_style doesn't do batch processing in a linear, counting-up order. If I could force that ordering, it would achieve a similar result, but again, I'm not sure where to start.

jcjohnson commented 7 years ago

The fast and slow methods work in very different ways, and there is no way to use the -seed option to reduce noise in the fast version.

In the slow version, the image is initialized with random noise, and updated iteratively in order to produce the final stylized image. In this case, using the same random seed for different frames means that the same random noise is used to initialize each frame.

In the fast version, the content image is passed through a feedforward network which directly outputs the stylized image. There is no randomness in the process, so setting the random seed will not affect the output of the network.
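
To make the distinction concrete, here is a minimal Python/NumPy sketch (purely illustrative; the repository's actual code is Lua/Torch, and these function names are made up):

```python
import numpy as np

# Slow method: each frame starts from random noise, which is then
# iteratively optimized into the stylized image. Fixing the seed makes
# that starting noise identical for every frame.
def slow_stylize(content, seed=1234):
    rng = np.random.default_rng(seed)     # same seed -> same initial noise
    img = rng.normal(size=content.shape)  # random initialization
    # ... iterative optimization of img against content/style losses ...
    return img

# Fast method: one deterministic feedforward pass. No random numbers
# are drawn anywhere, so a seed has nothing to influence.
def fast_stylize(content, net):
    return net(content)  # output depends only on the input image
```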

Finding the best way to eliminate noise and frame-to-frame variance in fast style transfer is a bit of an open research problem at this point; some kind of post-processing solution may be your best option right now.
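
One simple post-processing option along these lines (a sketch only, not part of the repo; assumes Pillow and same-sized frames) is to blend each stylized frame with the previous, already-smoothed one, i.e. an exponential moving average over the output sequence:

```python
from pathlib import Path
from PIL import Image

def smooth_frames(out_dir, alpha=0.6):
    """Damp frame-to-frame flicker by blending each stylized frame with
    the previous (already smoothed) frame; alpha weights the current frame."""
    prev = None
    for path in sorted(Path(out_dir).glob("*.png")):
        cur = Image.open(path).convert("RGB")
        if prev is not None:
            cur = Image.blend(prev, cur, alpha)  # prev*(1-alpha) + cur*alpha
        cur.save(path)
        prev = cur
```

A higher alpha keeps more per-frame detail; a lower alpha smooths harder at the cost of ghosting on fast motion.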

tkoham commented 7 years ago

I figured as much. What about forcing the batch processing to work in alphabetical/numerical order? Would that be relatively uncomplicated? If I can get that done, I can pause the process and keep everything in memory while I do a blend pass on each frame. Otherwise, it seems my only option would be to incorporate the image processing into the style transfer script itself.

For example, I'm currently using a non-interactive Python script that counts the frames in a directory and then runs fast_neural_style on each frame individually, doing a blending pass with the next input frame in the sequence. This gets rid of noise and variance quite nicely, but because I'm forced to restart the process for each frame, it increases the processing time between frames dramatically.
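
For reference, that restart-per-frame pipeline might look roughly like the sketch below (the model path, directories, and 0.5 blend weight are placeholders; the -input_image/-output_image flags are from the repo's single-image mode):

```python
import subprocess
from pathlib import Path
from PIL import Image

frames = sorted(Path("frames").glob("*.png"))
for i, frame in enumerate(frames):
    # One full process launch per frame: the network is reloaded onto
    # the GPU every time, which is what makes this approach so slow.
    subprocess.run(
        ["th", "fast_neural_style.lua", "-model", "models/model.t7",
         "-input_image", str(frame), "-output_image", f"out/{frame.name}"],
        check=True)
    out = Image.open(f"out/{frame.name}").convert("RGB")
    if i + 1 < len(frames):
        # Blend the stylized output with the next raw input frame.
        nxt = Image.open(frames[i + 1]).convert("RGB").resize(out.size)
        Image.blend(out, nxt, 0.5).save(f"out/{frame.name}")
```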

If I could rely on the batch processing to handle each image counting up from 0, then I could watch for the output "Writing output image to /home/Git/fast-neural-style/XYZ/00000001.png", suspend the process, blend the next input frame with the output, and then resume, keeping the style transfer model in GPU memory. Because it just sort of shotgun-processes the images in a directory at random, though, I can't take this approach.
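
If the batch pass could be forced to emit frames in sorted order, a pause-and-blend wrapper might be sketched as below. Everything here is assumption-laden: POSIX-only signals, stdbuf to keep the Lua side's stdout line-buffered, the repo's -input_dir/-output_dir batch flags, a placeholder 0.5 blend weight, and a crude poll because the quoted log line is printed before the file is actually saved:

```python
import os, re, signal, subprocess, time
from pathlib import Path
from PIL import Image

frames = sorted(Path("frames").glob("*.png"))  # assumes in-order processing
cmd = ["stdbuf", "-oL",  # line-buffer stdout so log lines arrive promptly
       "th", "fast_neural_style.lua", "-model", "models/model.t7",
       "-input_dir", "frames/", "-output_dir", "out/"]
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True, bufsize=1)

pattern = re.compile(r"Writing output image to (\S+)")
i = 0
for line in proc.stdout:
    m = pattern.search(line)
    if not m:
        continue
    out_path = Path(m.group(1))
    while not out_path.exists():        # the log line precedes the save
        time.sleep(0.05)
    os.kill(proc.pid, signal.SIGSTOP)   # pause; model stays in GPU memory
    if i + 1 < len(frames):
        # Blend the stylized output with the next raw input frame.
        out = Image.open(out_path).convert("RGB")
        nxt = Image.open(frames[i + 1]).convert("RGB").resize(out.size)
        Image.blend(out, nxt, 0.5).save(out_path)
    os.kill(proc.pid, signal.SIGCONT)   # resume the Torch process
    i += 1
proc.wait()
```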