jliljebl / flowblade

Video Editor for Linux
GNU General Public License v3.0

Improvement to the Perspective filter #1145

Open schauveau opened 10 months ago

schauveau commented 10 months ago

I was trying to apply a perspective effect to a video clip and noticed that the Perspective filter is currently quite limited.
After reading the MLT and ffmpeg documentation, I came up with a few simple changes that make it a lot more useful.

https://www.mltframework.org/plugins/FilterAvfilter-perspective/ https://ffmpeg.org/ffmpeg-filters.html#perspective

The most important change is adding the 'sense' argument. The default 'sense=source' is suitable for perspective correction, but not for applying a new perspective to a clip or image. For that, you need 'sense=destination'.

<property name="av.sense" args="editor=combobox exptype=default cbopts=Source:0,Destination:1 displayname=Sense">1</property>

With 'sense=destination', it becomes possible to move the 4 corners of a clip or image to any position.

However, there are two issues when using Perspective with 'sense=destination'.

The first issue is that the pixels on the screen border are repeated infinitely in each direction. This is a known issue with the ffmpeg perspective filter (https://trac.ffmpeg.org/ticket/8124), but I figured out that it can be solved by applying a crop filter before the perspective filter to remove one pixel on each side.

The second issue is that the 4 points (x0,y0), (x1,y1), (x2,y2), (x3,y3) control the displacement of the 4 corners of the screen, which often differ from the 4 corners of the image or clip (when the aspect ratio does not match exactly). The solution is to insert a Position Scale filter before the Crop and Perspective filters so that the clip fills the whole screen. This is not strictly needed, but it helps greatly if you want to position the 4 points precisely.
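In plain ffmpeg terms, the crop workaround could be sketched as the filtergraph string built below. This is only an illustration: the helper name `perspective_chain` is hypothetical, and this is not how Flowblade or MLT wires its filters. The `crop` and `perspective` option names are taken from the ffmpeg filter documentation, and the corner coordinates are assumed to be relative to the cropped frame.

```python
def perspective_chain(corners, border=1):
    """Build an ffmpeg filtergraph string: crop first, then perspective.

    corners: four (x, y) tuples in ffmpeg order:
             top-left, top-right, bottom-left, bottom-right.
    border:  pixels removed from each side before the perspective filter.
    """
    if len(corners) != 4:
        raise ValueError("expected exactly 4 corner points")
    # Remove `border` pixels on every side so the edge pixels are not smeared.
    crop = "crop=w=iw-{0}:h=ih-{0}:x={1}:y={1}".format(2 * border, border)
    # x0=..:y0=..:x1=..:y1=.. and so on for the four corners.
    points = ":".join(
        "x{0}={1}:y{0}={2}".format(i, x, y) for i, (x, y) in enumerate(corners)
    )
    return "{0},perspective={1}:sense=destination".format(crop, points)


chain = perspective_chain([(100, 50), (1800, 0), (0, 1080), (1920, 1000)])
print(chain)
```

The resulting string could be passed to `ffmpeg -vf` to try the effect outside Flowblade.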

Here is a short video showing how to use Perspective with sense=destination

https://www.youtube.com/watch?v=XYsXt3-BJvQ

jliljebl commented 10 months ago

Thanks, agreed that adding the av.sense property is useful. With 'sense=destination' it really only works as a user would expect when the crop filter is added, so I kept the previous 'sense=source' as the default because that looks more 'non-buggy'. The workaround of adding a crop filter before the Perspective filter is such a vital piece of information that I'll add it to the docs somehow.

I pushed the change into master.

schauveau commented 10 months ago

You are fast to close :-) so I'm not sure that you will read this.

I was thinking that it could be a good idea to create a new filter instead of adding a sense argument. The old filter with sense=source could be renamed to "Perspective Correct" while the new one with sense=destination could be "Perspective Add".

Also, there are no practical reasons to limit the point to the screen dimension especially when using sense=destination. And finally, using screen coordinates is not necessarily the best approach. During my experiments, I found that using coordinates relative to each corner makes a lot of sense, and having the positive direction point toward the center is even better.

I made multiple versions with different input and output ranges. The 5th is the one I like best. Its coordinates are relative to each corner, so setting all points to (0,0) resets the transformation.

    <!-- for some reason avfilter.perspective gives errors:
            [filter avfilter.perspective] Unexpected return format
            [filter avfilter.perspective] Cannot get frame from buffer sink
        if initialized with SCREENSIZE_WIDTH and SCREENSIZE_HEIGHT.

        However, if set to these values later it works.

        Workaround is that we initialize it with hard coded values
    -->    
    <filter id="avfilter.perspective">
        <name>Perspective</name>
        <group>Distort</group>
        <property name="av.sense" args="editor=combobox exptype=default cbopts=Source:0,Destination:1 displayname=Sense">0</property>
        <property name="av.interpolation" args="editor=combobox exptype=default cbopts=Linear:linear,Cubic:cubic displayname=Interpolation">linear</property>
        <property name="av.x0" args="range_in=0,SCREENSIZE_WIDTH range_out=0,SCREENSIZE_WIDTH displayname=x0">10</property>
        <property name="av.y0" args="range_in=0,SCREENSIZE_HEIGHT range_out=0,SCREENSIZE_HEIGHT displayname=y0">10</property>
        <property name="av.x1" args="range_in=0,SCREENSIZE_WIDTH range_out=0,SCREENSIZE_WIDTH displayname=x1">500</property>
        <property name="av.y1" args="range_in=0,SCREENSIZE_HEIGHT range_out=0,SCREENSIZE_HEIGHT displayname=y1">0</property>
        <property name="av.x2" args="range_in=0,SCREENSIZE_WIDTH range_out=0,SCREENSIZE_WIDTH displayname=x2">0</property>
        <property name="av.y2" args="range_in=0,SCREENSIZE_HEIGHT range_out=0,SCREENSIZE_HEIGHT displayname=y2">400</property>
        <property name="av.x3" args="range_in=0,SCREENSIZE_WIDTH range_out=0,SCREENSIZE_WIDTH displayname=x3">550</property>
        <property name="av.y3" args="range_in=0,SCREENSIZE_HEIGHT range_out=0,SCREENSIZE_HEIGHT displayname=y3">420</property>
    </filter>

    <!-- Perspective 2 : 
       - the default sense is 'Destination' 
       - the input screen size is normalized 1000x1000 
       - all input ranges are 'extended' by 1000 in all directions, thus allowing
         the 4 points to be located outside the screen
       - consequently, the default values (without any visible perspective)
         should be  
          x0=0
          y0=0
          x1=1000
          y1=0
          x2=0
          y2=1000
          x3=1000
          y3=1000   
      --> 
    <filter id="avfilter.perspective">
        <name>Perspective 2</name>
        <group>Distort</group>
        <property name="av.sense" args="editor=combobox exptype=default cbopts=Source:0,Destination:1 displayname=Sense">1</property>
        <property name="av.interpolation" args="editor=combobox exptype=default cbopts=Linear:linear,Cubic:cubic displayname=Interpolation">linear</property>
        <property name="av.x0" args="range_in=-1000,2000 range_out=SCREEN33_XMIN,SCREEN33_XMAX displayname=x0">0</property>
        <property name="av.y0" args="range_in=-1000,2000 range_out=SCREEN33_YMIN,SCREEN33_YMAX displayname=y0">0</property>
        <property name="av.x1" args="range_in=-1000,2000 range_out=SCREEN33_XMIN,SCREEN33_XMAX displayname=x1">500</property>
        <property name="av.y1" args="range_in=-1000,2000 range_out=SCREEN33_YMIN,SCREEN33_YMAX displayname=y1">0</property>
        <property name="av.x2" args="range_in=-1000,2000 range_out=SCREEN33_XMIN,SCREEN33_XMAX displayname=x2">0</property>
        <property name="av.y2" args="range_in=-1000,2000 range_out=SCREEN33_YMIN,SCREEN33_YMAX displayname=y2">300</property>
        <property name="av.x3" args="range_in=-1000,2000 range_out=SCREEN33_XMIN,SCREEN33_XMAX displayname=x3">500</property>
        <property name="av.y3" args="range_in=-1000,2000 range_out=SCREEN33_YMIN,SCREEN33_YMAX displayname=y3">300</property>
    </filter>

    <!-- Perspective 3 is similar to Perspective 2 with the following changes:
       - the input and output ranges are defined such that the 'default' location of each point corresponds to input (0,0)
       - the x,y coordinates of each point are now relative to the corresponding corner point.
       - the displaynames are changed to dx0, dy0, ... to reflect that.
       - the input ranges are all set to -1000,1000 so the 'default' 0 is nicely centered.
       - in practice, that means that the 4 points cannot cross the two opposite sides of the screen.     
      --> 
    <filter id="avfilter.perspective">
        <name>Perspective 3</name>
        <group>Distort</group>
        <property name="av.sense" args="editor=combobox exptype=default cbopts=Source:0,Destination:1 displayname=Sense">1</property>
        <property name="av.interpolation" args="editor=combobox exptype=default cbopts=Linear:linear,Cubic:cubic displayname=Interpolation">linear</property>
        <property name="av.x0" args="range_in=-1000,1000 range_out=SCREEN33_XMIN,SCREENSIZE_WIDTH displayname=dx0">10</property>
        <property name="av.y0" args="range_in=-1000,1000 range_out=SCREEN33_YMIN,SCREENSIZE_HEIGHT displayname=dy0">10</property>
        <property name="av.x1" args="range_in=-1000,1000 range_out=0,SCREEN33_XMAX displayname=dx1">500</property>
        <property name="av.y1" args="range_in=-1000,1000 range_out=SCREEN33_YMIN,SCREENSIZE_HEIGHT displayname=dy1">0</property>
        <property name="av.x2" args="range_in=-1000,1000 range_out=SCREEN33_XMIN,SCREENSIZE_WIDTH displayname=dx2">20</property>
        <property name="av.y2" args="range_in=-1000,1000 range_out=0,SCREEN33_YMAX displayname=dy2">350</property>
        <property name="av.x3" args="range_in=-1000,1000 range_out=0,SCREEN33_XMAX displayname=dx3">400</property>
        <property name="av.y3" args="range_in=-1000,1000 range_out=0,SCREEN33_YMAX displayname=dy3">400</property>
    </filter>

    <!-- Perspective 4 is similar to Perspective 3 with the following changes:
     - the input ranges are now using screen coordinates but are still relative to the 4 corner points.
     - so setting a point to (0,0) moves it to its corner
      -->
    <filter id="avfilter.perspective">
        <name>Perspective 4</name>
        <group>Distort</group>
        <property name="av.sense" args="editor=combobox exptype=default cbopts=Source:0,Destination:1 displayname=Sense">1</property>
        <property name="av.interpolation" args="editor=combobox exptype=default cbopts=Linear:linear,Cubic:cubic displayname=Interpolation">linear</property>
        <property name="av.x0" args="range_in=SCREEN33_XMIN,SCREENSIZE_WIDTH range_out=SCREEN33_XMIN,SCREENSIZE_WIDTH displayname=dx0">0</property>
        <property name="av.y0" args="range_in=SCREEN33_YMIN,SCREENSIZE_HEIGHT range_out=SCREEN33_YMIN,SCREENSIZE_HEIGHT displayname=dy0">0</property>
        <property name="av.x1" args="range_in=SCREEN33_XMIN,SCREENSIZE_WIDTH range_out=0,SCREEN33_XMAX displayname=dx1">500</property>
        <property name="av.y1" args="range_in=SCREEN33_YMIN,SCREENSIZE_HEIGHT range_out=SCREEN33_YMIN,SCREENSIZE_HEIGHT displayname=dy1">0</property>
        <property name="av.x2" args="range_in=SCREEN33_XMIN,SCREENSIZE_WIDTH range_out=SCREEN33_XMIN,SCREENSIZE_WIDTH displayname=dx2">0</property>
        <property name="av.y2" args="range_in=SCREEN33_YMIN,SCREENSIZE_HEIGHT range_out=0,SCREEN33_YMAX displayname=dy2">350</property>
        <property name="av.x3" args="range_in=SCREEN33_XMIN,SCREENSIZE_WIDTH range_out=0,SCREEN33_XMAX displayname=dx3">400</property>
        <property name="av.y3" args="range_in=SCREEN33_YMIN,SCREENSIZE_HEIGHT range_out=0,SCREEN33_YMAX displayname=dy3">400</property>
    </filter>

    <!-- Perspective 5 is similar to Perspective 4 with the following changes:
     - The coordinates are still given relative to each corner but with the positive direction toward the center.
     - The valid range is extended so that all points can be moved anywhere in the 'SCREEN33' area.  
      -->
    <filter id="avfilter.perspective">
        <name>Perspective 5</name>
        <group>Distort</group>
        <property name="av.sense" args="editor=combobox exptype=default cbopts=Source:0,Destination:1 displayname=Sense">1</property>
        <property name="av.interpolation" args="editor=combobox exptype=default cbopts=Linear:linear,Cubic:cubic displayname=Interpolation">linear</property>
        <property name="av.x0" args="range_in=SCREEN33_XMIN,SCREEN33_XMAX range_out=SCREEN33_XMIN,SCREEN33_XMAX displayname=dx0">0</property>
        <property name="av.y0" args="range_in=SCREEN33_YMIN,SCREEN33_YMAX range_out=SCREEN33_YMIN,SCREEN33_YMAX displayname=dy0">0</property>
        <property name="av.x1" args="range_in=SCREEN33_XMIN,SCREEN33_XMAX range_out=SCREEN33_XMAX,SCREEN33_XMIN displayname=dx1">500</property>
        <property name="av.y1" args="range_in=SCREEN33_YMIN,SCREEN33_YMAX range_out=SCREEN33_YMIN,SCREEN33_YMAX displayname=dy1">0</property>
        <property name="av.x2" args="range_in=SCREEN33_XMIN,SCREEN33_XMAX range_out=SCREEN33_XMIN,SCREEN33_XMAX displayname=dx2">0</property>
        <property name="av.y2" args="range_in=SCREEN33_YMIN,SCREEN33_YMAX range_out=SCREEN33_YMAX,SCREEN33_YMIN displayname=dy2">350</property>
        <property name="av.x3" args="range_in=SCREEN33_XMIN,SCREEN33_XMAX range_out=SCREEN33_XMAX,SCREEN33_XMIN displayname=dx3">400</property>
        <property name="av.y3" args="range_in=SCREEN33_YMIN,SCREEN33_YMAX range_out=SCREEN33_YMAX,SCREEN33_YMIN displayname=dy3">400</property>
    </filter>
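
To see why Perspective 5 resets at (0,0), it helps to work through the range mapping. The sketch below assumes that an editor value is mapped linearly from range_in to range_out, and that SCREEN33 spans -w to 2*w horizontally and -h to 2*h vertically. `map_range` is a hypothetical helper written for this illustration, not Flowblade's actual code.

```python
def map_range(value, range_in, range_out):
    """Linearly map value from range_in to range_out (assumed behavior)."""
    in_lo, in_hi = range_in
    out_lo, out_hi = range_out
    return out_lo + (value - in_lo) * (out_hi - out_lo) / (in_hi - in_lo)


w, h = 1920, 1080            # example profile size
x_min, x_max = -w, 2 * w     # SCREEN33_XMIN, SCREEN33_XMAX
y_min, y_max = -h, 2 * h     # SCREEN33_YMIN, SCREEN33_YMAX

# dx0 (top-left x): output range is not inverted, so 0 stays at x=0.
reset_x0 = map_range(0, (x_min, x_max), (x_min, x_max))
# dx1 (top-right x): output range is inverted, so 0 lands on x=w.
reset_x1 = map_range(0, (x_min, x_max), (x_max, x_min))
# A positive dx1 moves the corner left, i.e. toward the center.
moved_x1 = map_range(100, (x_min, x_max), (x_max, x_min))
# dy2 (bottom-left y): inverted as well, so 0 lands on y=h.
reset_y2 = map_range(0, (y_min, y_max), (y_max, y_min))

print(reset_x0, reset_x1, moved_x1, reset_y2)
```

With a 1920x1080 profile, dx1=0 maps to x=1920 and dx1=100 maps to x=1820, which is 100 pixels toward the center.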

The values SCREEN33_XMIN ... SCREEN33_YMAX are defined as follows in propertyparse.py:

def get_args_num_value(val_str):
    """
    Returns numerical value for expression in property
    args. 
    """
    try: # attempt int
        return int(val_str)
    except:
        try:# attempt float
            return float(val_str)
        except:
            w = current_sequence().profile.width()
            h = current_sequence().profile.height()
            # attempt expression
            if val_str == SCREENSIZE_WIDTH:
                return w
            elif val_str == SCREENSIZE_HEIGHT:
                return h
            # SCREEN33 is a virtual 3x3 screen centered around the image.
            # Used for control points outside the visible image in Perspective
            elif val_str == "SCREEN33_XMIN":
                return -w
            elif val_str == "SCREEN33_XMAX":
                return 2*w
            elif val_str == "SCREEN33_YMIN":
                return -h
            elif val_str == "SCREEN33_YMAX":
                return 2*h
    return None

jliljebl commented 10 months ago

Ok, I'll reopen. Maybe do a pull request, so I get all the proposed changes in together. I think I can copy-paste from here, but a pull request leaves you as the author in the git log, and this helps me remember to give attribution when writing release notes and adding you as a contributor.

"Also, there are no practical reasons to limit the point to the screen dimension especially when using sense=destination."

"# SCREEN33 is a virtual 3x3 screen centered around the image."

Ok, I get it. I did a similar trick for cairoaffineblend when I contributed it to frei0r, as frei0r requires all params to be in the range 0.0-1.0. Letting the user place image corners outside of the screen area is indeed useful new functionality, so if Perspective 5 + the propertyparse.py change does that, let's put that in. If you prefer that I do it, I can.

schauveau commented 10 months ago

And of course, there is still the problem of setting proper default values for the 4 points. If I understand correctly, the issue is mostly that symbols such as SCREENSIZE_WIDTH and SCREENSIZE_HEIGHT are not interpreted in the value field (only in XML attributes). I can look into it.
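
One possible direction, sketched under assumptions, is to resolve the same symbols when a property's default value is read. `resolve_default` and its symbol table are hypothetical and not part of Flowblade; a real fix would belong wherever the filter XML values are parsed, and it would not by itself address the initialization errors mentioned in the XML comment above.

```python
def resolve_default(value_text, width, height):
    """Return a numeric default from a plain number or a known symbol."""
    # Same symbols as the range arguments, resolved against the profile size.
    symbols = {
        "SCREENSIZE_WIDTH": width,
        "SCREENSIZE_HEIGHT": height,
        "SCREEN33_XMIN": -width,
        "SCREEN33_XMAX": 2 * width,
        "SCREEN33_YMIN": -height,
        "SCREEN33_YMAX": 2 * height,
    }
    text = value_text.strip()
    if text in symbols:
        return symbols[text]
    try:
        return int(text)
    except ValueError:
        # Falls back to float; unknown text still raises ValueError.
        return float(text)


defaults = [
    resolve_default(v, 1920, 1080)
    for v in ("0", "SCREENSIZE_WIDTH", "SCREENSIZE_HEIGHT", "12.5")
]
print(defaults)
```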

jliljebl commented 10 months ago

Ok, I'll wait for more input on setting proper default values for the 4 corner points.

schauveau commented 10 months ago

I could do a pull request, but since I have multiple versions with different behaviors, I think this is more about figuring out which one is best. I should probably do a proper fork so that others can try it.