dthpham / sminterpolate

Make motion interpolated and fluid slow motion videos from the command line.
MIT License

Butterflow adds duplicate frames between video segments #13

Open dandelany opened 8 years ago

dandelany commented 8 years ago

Hi dthpham,

First off, thanks for making Butterflow - it's pretty awesome. I've been using it on satellite photos from Himawari 8 with great results. The only other open source tool I've found that is similar is slowMoVideo, and your Farneback algorithm seems to perform a lot better than their Kanade-Lucas algorithm, at least for my purposes - here's a test I did to compare them.

Anyway, I've run into a bit of a problem: as you can see a few times in that example, there are a few missing images which cause some distracting time shifts. Normally the satellite takes images once every 10 minutes, but it misses a few observations every day in order to do 'housekeeping' - i.e. checking itself out, reporting diagnostic data to Earth, making sure it is in the right orbit, etc. Here's another example in gif form, the original images at 2fps:

original images at 2 FPS

It's not super obvious here, but there are 4 missing frames in this sequence - a set of 2 near the beginning, then 1, then another near the end. Here's the result when I Butterflow interpolate it to 30fps. Everything looks great, but the missing frames become more apparent, since the now-smooth video appears to speed up during these segments:

butterflow interpolation of original, not adjusted for missing frames

OK, no problem, I thought. Butterflow has an option to have different segments interpolated at different speeds, so I wrote a little script that parses through the image timestamps and searches for gaps, then generates a butterflow command to interpolate them at the right speeds. So if we are normally interpolating to 0.05x speed but there is one image missing, that segment would get interpolated at 0.025x instead, or more generally at (0.05/(n+1))x when n images are missing (so 0.05/3, about 0.0167x, for two missing images). Cool, it works, and gives me something like the following:

butterflow -l -r 30 -o time-adjusted.mp4 -s \
a=0,b=0.36666666666666664,spd=0.05:\
a=0.36666666666666664,b=0.4,spd=0.016666666666666666:\
a=0.4,b=0.5,spd=0.05:\
a=0.5,b=0.5333333333333333,spd=0.025:\
a=0.5333333333333333,b=0.8666666666666667,spd=0.05:\
a=0.8666666666666667,b=0.9,spd=0.025:\
a=0.9,b=end,spd=0.05 \
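
A minimal sketch of the kind of gap-detection script described above, written in Python. The function names, the 10-minute interval default, and the assumption that each image occupies one frame of a 30 fps source video are illustrative and not taken from the actual script; only the `a=...,b=...,spd=...` segment syntax comes from the command above.

```python
def find_segments(timestamps_min, interval_min=10):
    """Group consecutive image pairs by how many capture intervals separate them.

    Returns (start_frame, end_frame, steps) tuples, where steps == 1 means
    no image is missing and steps == n + 1 means n images are missing.
    """
    segments = []
    for i in range(len(timestamps_min) - 1):
        gap = timestamps_min[i + 1] - timestamps_min[i]
        steps = max(1, round(gap / interval_min))
        if segments and segments[-1][2] == steps:
            segments[-1][1] = i + 1  # extend the current run of equal gaps
        else:
            segments.append([i, i + 1, steps])
    return [tuple(s) for s in segments]


def segment_argument(segments, base_speed=0.05, src_fps=30):
    """Format segments as a value for butterflow's -s option."""
    parts = []
    for n, (start, end, steps) in enumerate(segments):
        last = n == len(segments) - 1
        b = "end" if last else repr(end / src_fps)
        parts.append(f"a={start / src_fps!r},b={b},spd={base_speed / steps!r}")
    return ":".join(parts)


# Five images (times in minutes), with one observation missing
# between the third and fourth image.
times = [0, 10, 20, 40, 50]
print(segment_argument(find_segments(times)))
```

The segment covering the 20-minute gap gets half the base speed, and the final segment uses `b=end` like the command above.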

However, here's the problem - when I actually run this command, Butterflow seems to create some duplicate frames between the segments which cause the video to pause and be even less smooth than before. Here's the result:

butterflow interpolation with time adjustment, butterflow added duplicate frames

I've tried it a few different ways but I always seem to get these pauses, which seems to be a bug. However, I watched the preview window closely during the video creation and it does seem to correctly apply the time shifts at the right parts, they are just broken up by the pauses. This allowed me to find a workaround which seems to work for now to remove the duplicate frames:

If I do all of that, I get a nice beautiful result which nearly matches what I'd expect (although the shifts are still seen a little bit):

butterflow interpolation with time adjustment and duplicate frames removed

However, it would be nice not to have to do this in my pipeline if Butterflow handled it better.

I've just noticed something else which may be the root cause of this bug - Butterflow seems to add extra frames at the end of all videos to satisfy the "speed" parameter, which I think is incorrect. For example, let's say I have a 1 second video which I want to butterflow to 0.1x speed - I would naively expect the result to be 10 seconds long, and indeed, that's what Butterflow gives me. But let's say I have a video only 3 frames long. Butterflow returns 30 frames - but the last 9 frames are just duplicates of the last real frame. I think the correct result is actually to return a video with 21 frames - 1 real, 9 interpolated, 1 real, 9 interpolated, 1 real - even though 21 is not truly 3/0.1.
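
The frame counts in this example can be checked with a short calculation. This is only a sketch: the function names are made up for illustration, and it assumes the slowdown factor 1/speed is a whole number and the frame rate is unchanged. The itemized sequence (1 real, 9 interpolated, 1 real, 9 interpolated, 1 real) sums to 21 frames.

```python
def padded_frame_count(source_frames, speed):
    """Frames needed to fill the full target duration (source_frames / speed)."""
    return round(source_frames / speed)


def unpadded_frame_count(source_frames, speed):
    """Real frames plus the frames interpolated between each adjacent pair."""
    per_pair = round(1 / speed) - 1  # interpolated frames between two real frames
    return source_frames + (source_frames - 1) * per_pair


print(padded_frame_count(3, 0.1))    # 30
print(unpadded_frame_count(3, 0.1))  # 21
```

The difference of 9 frames is exactly the padding after the last real frame.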

Anyway, apologies for the giant ticket, and thanks again :) Hope it's helpful.

-d

dandelany commented 8 years ago

Oh, and if you want to try it for yourself, here are higher quality versions of the images I used to make the videos above.

dthpham commented 8 years ago

Thanks for sharing this. It's really cool to see what people are using Butterflow for.

OK, so why does Butterflow dupe frames at the end of videos and is it "correct"?

First you have to consider that Butterflow only supports constant frame rate encoding (for the time being). This means that FFmpeg expects a certain number of frames every second when it's encoding the video. Dropping and adding expected frames would cause the video to change in speed and duration.

When dealing with video it's reasonable to assume that audio will also be present. When there is the expectation that the original audio will be muxed with the interpolated video, having the correct speed and duration is central to ensure that sound is synchronized with the frames in the video.

Padding with duplicate frames at the end of videos is necessary to make sure that any audio that is being remixed into the final video isn't unexpectedly cut off.

For example, let's assume that we have a video of 3 frames A, B, C playing at 1 fps with an audio channel where a continuous tone can be heard for the entire 3 seconds of the video.

If we have Butterflow increase the frame rate to 30 fps while maintaining duration (this is key), 87 frames will be interpolated from 3 source frames, resulting in a 3 second video with these frames:

A, AB1, AB2, AB3 ... AB29, B, BC1, BC2, BC3 ... BC29, C, C1, C2, C3 ... C29

If frames aren't duped at the end (i.e. C1 to C29 are completely discarded) there would only be 61 frames. When encoded at a constant frame rate of 30 fps, the duration of this video would be 61 frames / 30 fps = 2.03 seconds. Roughly 1 second (0.97 seconds) of audio would be cut off from the end of the video when muxed with the original 3 second audio.
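
The arithmetic in this example can be reproduced with a few lines of Python. The helper below is a hypothetical illustration, not Butterflow code; it builds the frame labels used above and compares the video duration with and without the trailing padding.

```python
def frame_labels(sources, factor, pad=True):
    """List output frames when each source frame expands into `factor` frames."""
    labels = []
    for i, name in enumerate(sources):
        labels.append(name)
        is_last = i == len(sources) - 1
        if is_last and not pad:
            break
        # Between two real frames these are interpolated (AB1, AB2, ...);
        # after the last real frame they are duplicates (C1, C2, ...).
        prefix = name if is_last else name + sources[i + 1]
        labels.extend(f"{prefix}{k}" for k in range(1, factor))
    return labels


fps_out, audio_seconds = 30, 3.0
padded = frame_labels("ABC", 30)
unpadded = frame_labels("ABC", 30, pad=False)

print(len(padded), len(padded) / fps_out)        # 90 frames, 3.0 seconds
print(len(unpadded), len(unpadded) / fps_out)    # 61 frames, about 2.03 seconds
print(audio_seconds - len(unpadded) / fps_out)   # about 0.97 seconds of audio lost
```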

To a user working with videos w/ sound, abrupt audio cutoff like that, where 1/3 of the audio is missing, is unexpected behavior.

To me, being stringent with speed and making sure that all resulting videos have the correct (naively assumed) duration is a good general policy.

That's not to say it's the best for all cases. When users are only working with videos w/o audio and don't care if the speed or duration is altered and when they are only concerned with producing the smoothest videos possible, excluding duplicate frames from output videos isn't a bad idea.

So anyways, I just wanted to put this out here first because I didn't want to leave you hanging.

I'm still giving this whole thing some thought. I will try to get back to you on the topic of duplicate frames between video segments very soon.

dthpham commented 8 years ago

By the way, the duplicated frames at the end of video segments are a legitimate issue. The problem is that Butterflow treats each segment as if it was a self contained video. A segment's last frame is duplicated to satisfy the expected speed and duration but instead of padding with duplicated frames of the current segment's last real frame, Butterflow should be interpolating the last frame with the first frame of the next segment if it exists.

This would make one segment flow/merge into the next segment more smoothly.

So in all, I think these are some good changes to adopt that would fix / handle concerns in regards to this whole issue with dupes, expected duration, and A/V synchronization:

-st, --smooth-transition "Set to optimize for smooth transitions between video sub regions. This may shift the starting and ending points between each region by one or more frames."

-npad, --no-padding "Set to discard duplicate frames that are padded to the end of the output video."

-m, --mux "Set to mux original audio and subtitles with the output video. Audio and subtitles may be truncated or may not be in sync with the video because of potential differences in duration."

-dt, --detelecine "Set to perform a basic inverse telecine on the input video"

dthpham commented 8 years ago

I'm reneging on this:

The problem is that Butterflow treats each segment as if it was a self contained video. A segment's last frame is duplicated to satisfy the expected speed and duration but instead of padding with duplicated frames of the current segment's last real frame, Butterflow should be interpolating the last frame with the first frame of the next segment if it exists.

I was assuming that the last frame of every segment is being duped because it didn't have another frame to be interpolated with but it's not true for every case. I have to look into this further.

I'm going to keep things the way they are for now but will still add the -npad, --no-padding option, which will drop duplicate frames that are padded at the end of all video segments.

dthpham commented 8 years ago

This is what I'm getting when dupes padded to the end of segments are removed using a new -npad, --no-padding option. The pull request: https://github.com/dthpham/butterflow/pull/14. Mine is on the left and yours is on the right:

This might stutter in your browser. You can watch the full sized video here.

dandelany commented 8 years ago

Thanks very much for the detailed follow ups! I am a bit slow to respond because I'm traveling this week, but definitely still reading all your comments.

I hadn't considered the audio issue - makes sense to me now why --no-padding would not be the default. All of your recommended changes seem totally reasonable to me, as does the pull request.

The video you posted with --no-padding looks great & quite similar to my best result. It is interesting that in both cases the time shift is still a bit noticeable if you're looking for it. Maybe this is just due to the fact that the quality of the additional interpolated frames is noticeably different. Or is the last frame of each segment skipping to the first frame of the next without interpolation between segments? I'm not able to do a full analysis at the moment to check.

dthpham commented 8 years ago

I think the problem is that too many frames were being dropped. It's that large change in speed in a short period of time that makes shifting noticeable.

One thing that I just tried was setting the variable that controls how many frames are interpolated between pairs to a high enough value to interpolate as many frames as possible (slightly more than what is expected in the segment).

In theory, over compensating this way should mean that:

  1. Butterflow won't have to pad with dupes of real frames because it can just use the extra interpolated frames in their place
  2. It will continue to drop extra frames but will still have enough for what is expected in every segment

This should ease up rapid shifting and stuttering while still creating enough frames to reach the target duration.
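
One way to picture the over-compensation idea is to generate more frames than a segment needs and then keep an evenly spaced subset. The sketch below is an illustrative assumption about how such a selection could work, not Butterflow's actual dropping logic, and the function name is hypothetical.

```python
def evenly_spaced_keep(made, needed):
    """Indices of the frames to keep when `made` frames exist but only `needed` fit."""
    if needed > made:
        raise ValueError("not enough frames; padding with duplicates would be required")
    if needed == 1:
        return [0]
    # Spread the kept indices across the whole range so the first and last
    # frames survive and the drops are distributed instead of clustered.
    return [round(i * (made - 1) / (needed - 1)) for i in range(needed)]


# 12 frames were generated for a segment that only has room for 10.
keep = evenly_spaced_keep(made=12, needed=10)
print(keep)
```

Because the surplus is small and spread out, no duplicate frames are needed and the two dropped frames are not adjacent.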

dthpham commented 8 years ago

I'm getting really good results now after implementing what I said in my last post. Squashed a lot of bugs in the process while debugging this thing. The cool thing is that I didn't have to use the -npad option for it:

You can see the full sized video here (compressed).

By the way, about your question on whether the last frame of each segment was skipping to the first frame of the next without interpolation between segments: this is what I thought was happening before. But as long as there is more than 1 frame in a segment, every frame, except for the last frame in a video, gets interpolated with the following frame regardless of whether they are in the same segment or not.

For example, this is how Butterflow is segmenting the video using the command from the first post:

Rendering sequence:
subregion: 1,11
subregion: 11,12
subregion: 12,15
subregion: 15,16
subregion: 16,26
subregion: 26,27
subregion: 27,33

Butterflow was interpolating frames using pairs of frames from 1 to 11 (1,2, 2,3, 3,4 ... 9,10, 10,11) in segment 1, then from 11 to 12 (11,12) in segment 2, then 12 to 15 (12,13, 13,14, 14,15) in segment 3 and so on.

So I'm pretty sure that no frame is being skipped without interpolating with another. At least this is how it is with my new code.
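
The claim can be checked directly against the rendering sequence listed above. This is a small sketch with a hypothetical helper name; it expands each subregion into the consecutive frame pairs it covers and confirms that the pairs form an unbroken chain.

```python
def frame_pairs(subregions):
    """Expand (first, last) subregions into the consecutive frame pairs they cover."""
    pairs = []
    for first, last in subregions:
        pairs.extend((n, n + 1) for n in range(first, last))
    return pairs


subregions = [(1, 11), (11, 12), (12, 15), (15, 16), (16, 26), (26, 27), (27, 33)]
pairs = frame_pairs(subregions)

# Every adjacent pair from frame 1 to frame 33 appears exactly once, so no
# frame is left without a neighbor to interpolate with.
assert pairs == [(n, n + 1) for n in range(1, 33)]
print(len(pairs))  # 32
```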

dandelany commented 8 years ago

Wow, that is a great result - the segments aren't noticeable at all anymore! And what you are saying about subregions makes sense and matches what I'd expect. I noticed you pulled the fixes into master already - let me know when you put a new release on homebrew, I'm excited to try it out.

A few sidenotes:

Thanks as always!

dthpham commented 8 years ago

I pushed some commits to master this morning that fix bugs that may have been preventing you from compiling and I updated the Install From Source Guide with better instructions for OS X - so I hope that helps.

Getting Butterflow to build on Ubuntu has been on my TODO list for a while now. I'm going to give it another go sometime this evening.

dandelany commented 8 years ago

Thanks for the extra instructions - I was able to get it to build on my OS X machine with no problems, and successfully replicated the excellent result from your previous comment.

FWIW I've decided to port my scripts to shelljs to make them platform-independent. So either an Ubuntu or Windows build would be equally awesome.

dandelany commented 8 years ago

I've been getting my scripts for making videos working a lot better, and rendered several videos last night with the new Butterflow code. Here are the results from a cropped view of 5 days' worth of data - make sure you set it to 1080p quality. I'm really impressed with the quality and smoothness. Each of those 90 second videos takes about an hour and a half to render (at 60fps) on my MacBook Pro - hopefully my beefier desktop machine will be significantly faster once I get the code running on there.

dandelany commented 8 years ago

[ edit: unrelated issue moved to #16 ]

dthpham commented 8 years ago

Those videos look pretty good!

I'm a little concerned that it's not the best it could be because I've been noticing in a lot of videos that I work with that Butterflow will drop real frames instead of interpolated frames when it has to (when drp_every is greater than 0). You might notice real frames being dropped in the verbose output whenever you see something like drp: A,B,1. Butterflow could be optimized to keep source frames instead and that's something that I'm looking to fix very soon.

I also wonder if the fading in and out of black at the start and end of videos could be a little smoother. I don't know if you've noticed but it's kinda jumpy there.

Anyways, I've had some success on getting Butterflow to work on Ubuntu. See my latest post here: #6. You will see a big improvement in rendering speed if you are able to get BF working on a desktop with a decent GPU.

Lastly, can you open a new ticket for that time skipping issue? I'm going to check it out over the weekend. Thanks.

dandelany commented 8 years ago

Thanks! I have seen the dropped frames before, but interestingly I haven't seen them much in my latest batch of renders. I decided on rendering a 2fps original video to 60fps at 0.6667x speed (+ multiples of that in the segments accounting for missing frames), an arbitrary choice based on the fact that the 0.5x ones I put on Youtube felt a little too slow. For whatever reason, this speed seems to work out nicely and I don't see too many drp messages in the verbose logs. Here's some verbose output from a recent render, if it's useful.

The fading in and out part is the terminator (line between day and night). These times (local morning and evening) are inherently the hardest because the lighting is changing rapidly, both in brightness and in sun angle. The low sun angle makes small elevation details in clouds and land suddenly pop out due to the long shadows they cast. So I'm certainly not surprised that this is when most of the artifacts appear, and I specifically chose Butterflow because it already seems to be way better with these details than anything else I tried :)

That said, you're probably right that there's some improvement to be had. In particular, the jumpy chopped-up-cloud-bits artifacts are always worst in the last moments of the video when the last few lit details at the edges don't have any surrounding context to sample. Currently, I'm cropping the images first, then making the video - but the holy grail would be to make the video first and Butterflow it at full (5500x5500) resolution, then crop the video into smaller videos. This would presumably improve the interpolation near the edges, because the optical flow calculation could take features into account that are beyond the edge of the final, cropped video. Not sure how big of a difference this would make, but I haven't dared try it yet on my current setup, because I know it would take all night and then some :) Maybe a good compromise would be to crop the images a few hundred pixels larger than the intended final size, then interpolate, then slice off the extra borders.
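
The compromise at the end of that paragraph comes down to a little box arithmetic. The sketch below is hypothetical (the function name, the example coordinates, and the 300 pixel margin are made up); it computes the enlarged crop to feed into interpolation and the offset at which to slice the final video back out.

```python
def padded_crop(x, y, width, height, margin, full_size=5500):
    """Return the enlarged crop box and the offset of the target inside it.

    The box is clamped to the image, so the margin can be smaller than
    requested on sides that touch the image edge.
    """
    left = max(0, x - margin)
    top = max(0, y - margin)
    right = min(full_size, x + width + margin)
    bottom = min(full_size, y + height + margin)
    enlarged = (left, top, right - left, bottom - top)
    offset = (x - left, y - top)  # where the final crop starts in the enlarged video
    return enlarged, offset


enlarged, offset = padded_crop(x=100, y=2000, width=1920, height=1080, margin=300)
print(enlarged)  # (0, 1700, 2320, 1680)
print(offset)    # (100, 300)
```

After interpolating the enlarged video, cropping a 1920x1080 window at the returned offset recovers the intended final frame.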

There's also another type of artifact with the terminator, that is more noticeable at lower latitudes where the apparent terminator is wider and therefore more of a subtle gradient. The fading looks pretty good on any given frame, but in the video, the original frames are clearly noticeable and the terminator gradient seems to move in segments. I'll post another video near the equator soon that shows this more clearly. It would be awesome to improve this, but I honestly doubt it's possible, since IIUC the optical flow algorithm only takes two frames into account, and I have a hunch you'd need to use data from more surrounding frames to improve this gradual fading. But I'd be happy to be proven wrong.

Great to hear that Ubuntu progress has been made, I'm planning to try it this weekend. I got a chance to try a render on my work laptop, which is a 2015 Macbook Pro (vs. my home 2013 MBP), and was impressed to see a 3x improvement in rendering speed. I also realized I've been running it on files which are on an external HD when I have a local SSD. Not sure how often I'm I/O bound, but I'll probably try copying it to the SSD before Butterflowing, it may be faster.

Opened #16 for the time skip issue. Thanks as always.

dandelany commented 8 years ago

So I just finished a batch of renders I started yesterday and they look pretty darn good... But I'm guessing 183c843e means I should try them again with the new code? :) Let me know when it's stable-ish enough to test again, can't wait to try it out.

dandelany commented 8 years ago

Well I couldn't help myself - I got impatient last night and tried a render with the latest master :smile: Results look really good but the runtime was more than 10x longer than before (2 hours vs 11 minutes) with a 70 second 1920x1080x60fps video on Ubuntu. You're probably aware of this issue already, but if not, let me know and I can provide more details. Thanks! [edit: opened #17 ]

dthpham commented 8 years ago

Please open a new issue for the performance regression. I've never noticed it myself but it's been on my radar ever since you mentioned it. Thanks!