dthpham / sminterpolate

Make motion interpolated and fluid slow motion videos from the command line.
MIT License

I don't understand the advanced options and don't know where to look for information #59

Open JobLeonard opened 7 years ago

JobLeonard commented 7 years ago

Hey, first: thanks for this tool, I'm having a ton of fun with it! But I would like to have more control over the settings to fine-tune them, and for that I need to understand what they do first.

I mean these ones:

  --fast-pyr            Set to use fast pyramids
  --pyr-scale PYR_SCALE
                        Specify pyramid scale factor, (default: 0.5)
  --levels LEVELS       Specify number of pyramid layers, (default: 3)
  --winsize WINSIZE     Specify averaging window size, (default: 25)
  --iters ITERS         Specify number of iterations at each pyramid level,
                        (default: 3)
  --poly-n {5,7}        Specify size of pixel neighborhood, (default: 5)
  --poly-s POLY_S       Specify standard deviation to smooth derivatives,
                        (default: 1.1)
  -ff {box,gaussian}, --flow-filter {box,gaussian}
                        Specify which filter to use for optical flow
                        estimation, (default: box)

Now, I completely understand that the command line is not the best place to explain the fine details of a motion interpolation algorithm, but even after discovering in the docs that the interpolation algorithm is by Farneback, searching the internet is not helping much.

Pyramids of what? Scale does what? Window size for what? Blending? Looking for where the pixel moved to?

Actually, as I typed this I finally found a blog post that gives some idea of what these parameters do. And even then it has to be inferred:

Result is computed in flowUmat, which has the same size as the inputs but format CV_32FC2

0.4: image pyramid or simple image scale.
1: number of pyramid layers; 1 means that flow is calculated only from the previous image.
12: window size; flow is computed over the window, and a larger value is more robust to noise.
2: number of iterations of the algorithm.
8: polynomial degree expansion; recommended values are 5 to 7.
1.2: standard deviation used to smooth the derivatives; recommended values are from 1.1 to 1.5.
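To make the inference less painful: the values in that quote appear to be the positional arguments of OpenCV's `calcOpticalFlowFarneback`, whose documented order is `pyr_scale, levels, winsize, iterations, poly_n, poly_sigma`. The sketch below just labels them. The pairing with the butterflow flags is my own assumption based on the help text above, not something confirmed from butterflow's source, and `label_positional` is a hypothetical helper.

```python
# Parameter order as documented for OpenCV's calcOpticalFlowFarneback:
# (prev, next, flow, pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags)
OPENCV_ORDER = ["pyr_scale", "levels", "winsize", "iterations", "poly_n", "poly_sigma"]

# ASSUMPTION: correspondence inferred from the help text only.
ASSUMED_FLAG = {
    "pyr_scale": "--pyr-scale",
    "levels": "--levels",
    "winsize": "--winsize",
    "iterations": "--iters",
    "poly_n": "--poly-n",
    "poly_sigma": "--poly-s",
}


def label_positional(values):
    """Pair positional values with their OpenCV name and the assumed flag."""
    if len(values) != len(OPENCV_ORDER):
        raise ValueError("expected one value per Farneback parameter")
    return [(name, ASSUMED_FLAG[name], value)
            for name, value in zip(OPENCV_ORDER, values)]


if __name__ == "__main__":
    # Values taken from the blog post quoted above.
    for name, flag, value in label_positional([0.4, 1, 12, 2, 8, 1.2]):
        print(f"{name:<11} {flag:<12} {value}")
```

Note that the help text above only accepts 5 or 7 for `--poly-n`, so the blog's value of 8 would not carry over as-is.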

It would be nice if there was a bit more explanation in the butterflow docs (or alternatively, a link to another page describing the algorithm) that helps users understand what these parameters mean and what kind of effects they have.

Subash-Chandra commented 5 years ago

I took a 1-minute clip, ran it through with a bunch of different options, and compared some 25-30 of the resulting clips to figure out what these changes do. The biggest difference comes from switching the flow filter from box to gaussian.

Increasing the number of pyramid layers helps a lot for live-action footage or drone footage with long shots and few quick cuts. It seems to sort of suck for animated footage though.
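A rough way to picture why more layers might help with smooth camera motion: in a coarse-to-fine scheme each extra layer is a downscaled copy of the frame, so a large displacement shrinks to something a fixed-size window can cover. The sketch below is only back-of-the-envelope arithmetic under that assumption. It is not butterflow's or OpenCV's actual code, and both function names are made up for illustration.

```python
def pyramid_sizes(width, height, pyr_scale=0.5, levels=3):
    """Approximate frame size at each pyramid layer (layer 0 is full size)."""
    sizes = []
    for level in range(levels):
        factor = pyr_scale ** level
        sizes.append((max(1, round(width * factor)),
                      max(1, round(height * factor))))
    return sizes


def motion_at_coarsest(displacement_px, pyr_scale=0.5, levels=3):
    """How large a full-resolution displacement looks at the coarsest layer."""
    return displacement_px * pyr_scale ** (levels - 1)


if __name__ == "__main__":
    for levels in (1, 3, 5):
        print(levels,
              pyramid_sizes(1920, 1080, levels=levels)[-1],
              motion_at_coarsest(40, levels=levels))
```

With the defaults (`--pyr-scale 0.5`, `--levels 3`), a 1920x1080 frame would be about 480x270 at the coarsest layer, and a 40-pixel move would look like roughly 10 pixels there. Hard cuts and the flat regions typical of animation probably don't benefit in the same way, but that part is a guess.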

I still don't understand how the standard deviation works. Can't see much of a difference in my clip between 1.1 and 1.5 in the drone footage.
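As far as I can tell from the OpenCV docs, this standard deviation belongs to the Gaussian used to smooth derivatives over the small `--poly-n` neighborhood. The toy sketch below compares normalized 1-D Gaussian weights across a 5-pixel neighborhood for 1.1 and 1.5. It assumes a plain Gaussian over offsets -2..2, which may not match the real implementation exactly, and `gaussian_weights` is a hypothetical helper.

```python
import math


def gaussian_weights(radius, sigma):
    """Normalized 1-D Gaussian weights for offsets -radius..radius."""
    raw = [math.exp(-(x * x) / (2.0 * sigma * sigma))
           for x in range(-radius, radius + 1)]
    total = sum(raw)
    return [w / total for w in raw]


if __name__ == "__main__":
    # radius=2 gives a 5-pixel neighborhood, as with --poly-n 5 (assumed).
    for sigma in (1.1, 1.5):
        weights = gaussian_weights(2, sigma)
        print(sigma, [round(w, 3) for w in weights])
```

Over such a small neighborhood the larger value only shifts a little weight from the center pixel to its neighbors, which might be part of why 1.1 and 1.5 look so similar in practice.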

I still haven't found a way to remove audio sync issues though.