CompVis / adaptive-style-transfer

source code for the ECCV18 paper A Style-Aware Content Loss for Real-time HD Style Transfer
https://compvis.github.io/adaptive-style-transfer/
GNU General Public License v3.0

When will video feature work? #7

Closed Aeroxander closed 5 years ago

Aeroxander commented 6 years ago

Don't know if I'm allowed to make an issue out of this, but I'd like to know if the video feature will work anytime soon. If there is anything (simple) I can help with, I'd be glad to!

asanakoy commented 6 years ago

You can easily implement it yourself. Just process each frame of the video independently and then stitch everything back to a video. If you want to make a pull request - you are welcome!
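A minimal sketch of that frame-by-frame route, assuming OpenCV for the video I/O; stylize_frame is a placeholder (not part of this repo) standing in for whatever inference call you use:

import cv2

def stylize_frame(frame):
    # Placeholder: replace this with a call to the trained stylization network.
    # Returning the frame unchanged keeps the sketch runnable end to end.
    return frame

cap = cv2.VideoCapture("input.mp4")  # assumed input path
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter("stylized.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    out.write(stylize_frame(frame))  # each frame is processed independently

cap.release()
out.release()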

Aeroxander commented 6 years ago

Ah ok :) I doubt it will look smooth; I'll probably need to add optical flow, but I'll try the stitching first!

asanakoy commented 6 years ago

We didn't use any temporal smoothing in the paper. Each frame was processed independently.

Aeroxander commented 6 years ago

I could try to implement this: https://medium.com/element-ai-research-lab/stabilizing-neural-style-transfer-for-video-62675e203e42. But that stabilization would have to be implemented at training time.

I think it would be nicer to do it while stitching the frames together, using the optical flow approach described here: https://link.springer.com/article/10.1007%2Fs11263-018-1089-z and used in https://github.com/manuelruder/fast-artistic-videos.

The first option could technically degrade image quality even when you are only stylizing a single photo, while the second could degrade the quality of a video; but since the video is moving anyway, that is probably the lesser problem, so I'll go the optical flow route.

Just saw that the video in your paper looks pretty smooth (at least not flickering), so I don't know if optical flow would make THAT much of a difference, but I'm a perfectionist so I'll go for it anyway :)
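A rough sketch of the flow-based blending idea, assuming OpenCV's Farneback flow as a stand-in for the flow estimation in Ruder et al. (only an approximation of that approach, without the occlusion handling):

import cv2
import numpy as np

def temporally_blend(prev_orig, cur_orig, prev_stylized, cur_stylized, alpha=0.4):
    # Estimate flow from the current frame back to the previous one on the
    # original (non-stylized) frames.
    prev_gray = cv2.cvtColor(prev_orig, cv2.COLOR_BGR2GRAY)
    cur_gray = cv2.cvtColor(cur_orig, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(cur_gray, prev_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)

    # Warp the previous stylized frame into the current frame's coordinates.
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    warped_prev = cv2.remap(prev_stylized, map_x, map_y, cv2.INTER_LINEAR)

    # Mix the warped previous stylization into the current one to damp flicker.
    return cv2.addWeighted(warped_prev, alpha, cur_stylized, 1.0 - alpha, 0)

As far as I understand, Ruder et al. additionally mask out occluded regions using forward/backward flow consistency, so a plain blend like this will still leave some ghosting around fast motion.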

edeetee commented 6 years ago

Trying to do this kinda thing for a uni project. When I get it working I'll see if I can make it pushable :)

kaisark commented 5 years ago

@Aeroxander I think a short-term workaround is to use ffmpeg to split the video into image frames, run inference on the directory of frames, and then reassemble the output into a video with ffmpeg again.

split: ffmpeg -i kktie.mp4 -r 25 -f image2 image-%04d.png

stylize:

CUDA_VISIBLE_DEVICES=0 python main.py \
  --model_name=model_van-gogh \
  --phase=inference \
  --image_size=1280 \
  --ii_dir input/ \
  --save_dir=output/

reassemble: ffmpeg -i image-%04d_stylized.jpg kktie-out.mp4

(GIF preview: model_van-gogh stylization of kktie.mp4)

Here is the output with model_munch (the Edvard Munch model must be a grouping and not just "Scream"): (GIF preview: model_munch stylization)
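To script the three steps end to end, something like the wrapper below could work; the paths, model name, and the stylized filename pattern are assumptions that just mirror the commands above, so check where main.py actually writes its output:

import os
import subprocess

env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")

def stylize_video(video="kktie.mp4", out="kktie-out.mp4", fps=25):
    # 1. Split the video into frames (the input/ directory is assumed to exist).
    subprocess.run(["ffmpeg", "-i", video, "-r", str(fps), "-f", "image2",
                    "input/image-%04d.png"], check=True)
    # 2. Stylize every frame with the repo's inference script.
    subprocess.run(["python", "main.py", "--model_name=model_van-gogh",
                    "--phase=inference", "--image_size=1280",
                    "--ii_dir", "input/", "--save_dir=output/"],
                   check=True, env=env)
    # 3. Reassemble the stylized frames into a video (output filename pattern assumed).
    subprocess.run(["ffmpeg", "-framerate", str(fps),
                    "-i", "output/image-%04d_stylized.jpg", out], check=True)

stylize_video()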

asanakoy commented 5 years ago

Yes, Munch is not just "Scream". It's a collection with "Scream" as a query image.

asanakoy commented 5 years ago

I have added steps to generate a video to the README.