Knot vectors synthesized for curve fitting are biased toward zero

GarthSnyder commented 3 years ago

I've been noticing that some of the splines produced by fitting.approximate_curve seem to have a little hump at one end. Here, for dramatic effect, is a particularly striking example. This is what the input points actually look like:

Screen Shot 2021-02-02 at 5 19 49 PM

And this is the fit:

Screen Shot 2021-02-02 at 5 23 00 PM

I'm showing it in Fusion 360 because it gives a nice view of the control point cage (the dashed orange lines), but you should be able to reproduce it in any visualizer. (The built-in visualizer for NURBS-Python picks a viewpoint that doesn't really show off the issue.)

import geomdl
import geomdl.fitting as fit

points = [
  [-13.132, 21.389, -2.51],
  [-13.065, 21.306, -2.539],
  [-13, 21.225, -2.577],
  [-12.941, 21.151, -2.633],
  [-12.889, 21.088, -2.707],
  [-12.848, 21.038, -2.794],
  [-12.821, 21.001, -2.892],
  [-12.814, 20.992, -2.946],
  [-12.812, 20.989, -3],
  [-12.809, 20.986, -3.055],
  [-12.802, 20.977, -3.11],
  [-12.774, 20.939, -3.21],
  [-12.731, 20.887, -3.298],
  [-12.678, 20.822, -3.371],
  [-12.617, 20.747, -3.428],
  [-12.551, 20.664, -3.465],
  [-12.481, 20.58, -3.492]  
]

curve = fit.approximate_curve(points, degree=3)

The artifacts are reminiscent of the curve degree being too high for the context, but even degree=2 shows the same problem. (degree=1 shows a proper piecewise linear fit.)

I cross-checked this against the fitting of Splipy, and given the same data points and knot vector, Splipy comes up with identical control points. That makes me suspect the knot vector generated by geomdl. I would cross-check to see what knots Splipy comes up with, but as far as I can tell, Splipy wants you to supply the knots.

After looking into this a bit, I think there may be a problem with the way compute_knot_vector2 is interpolating the u-sub-k parameters (that is, the relative positions between 0 and 1 of the actual data points along the hypothetical curve). Here's the original source code for reference:

def compute_knot_vector2(degree, num_dpts, num_cpts, params):
    """ Computes a knot vector ensuring that every knot span has at least one :math:`\\overline{u}_{k}`.

    Please refer to the Equations 9.68 and 9.69 on The NURBS Book (2nd Edition), p.412 for details.

    :param degree: degree
    :type degree: int
    :param num_dpts: number of data points
    :type num_dpts: int
    :param num_cpts: number of control points
    :type num_cpts: int
    :param params: list of parameters, :math:`\\overline{u}_{k}`
    :type params: list, tuple
    :return: knot vector
    :rtype: list
    """
    # Start knot vector
    kv = [0.0 for _ in range(degree + 1)]

    # Compute "d" value - Eqn 9.68
    d = float(num_dpts) / float(num_cpts - degree)
    # Find internal knots
    for j in range(1, num_cpts - degree):
        i = int(j * d)
        alpha = (j * d) - i
        temp_kv = ((1.0 - alpha) * params[i - 1]) + (alpha * params[i])
        kv.append(temp_kv)

    # End knot vector
    kv += [1.0 for _ in range(degree + 1)]

    return kv

Thanks to your wonderful and thorough references, I was able to look up the relevant section of The NURBS Book 2E for comparison. The code does not follow the book; however, that seems understandable because I'm not sure I believe the book on this, starting with the mysterious statement that "we need a total of n + p + 2 knots" with code to match. Is the correct number not n + p + 1? Evidently, you or whoever wrote the original original contribution didn't believe the book either.

But I think the code may be wrong, too. It biases the knots toward zero, resulting in an artificially small first interval. I think that's what's causing the contortions in the fitted splines.

OK, a thought experiment. (My apologies -- this is going to sound like I'm lecturing the author of a NURBS package about basic spline arithmetic, but these details are new to me and I just want to write them down to be sure I'm grasping them.)

Suppose we have 11 data points that are completely evenly spaced and we want 10 control points for a degree 3 spline. There are 7 knot spans and 14 entries in the knot vector, of which 8 are 0s or 1s. We need to generate 6 intermediate knots because the 0s already initiate the first knot span. (The 1s terminate; they don't introduce a new span.) The u-sub-ks are:

>>> params = np.linspace(0, 1, 11)
array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])

In the current code, num_dpts is 11, num_cpts is 10, and degree is 3, so d = 11 / 7 = 1.57. When j = 1, i = 1 and alpha = 0.57. The start of the second knot span is computed as (1 - 0.57) * params[0] + 0.57 * params[1] = 0.43 * 0 + 0.57 * 0.1 = 0.057. Not obviously implausible, right?

But let's look at the last knot when j = num_cpts - degree - 1 = 6. Then d = 9.42 and the last knot span starts at (1 - 0.42) * 0.8 + 0.42 * 0.9 = 0.842. So the last knot span is almost three times larger than the first, even though the params vector is uniform.

I think there are two issues. First, I don't understand why The NURBS Book uses params[i-1] and params[i] rather than params[i] and params[i+1], a convention the code also follows. 1.57 is an address between the second and third data points, not between the first and second, no? Clearly, it should include at least one full data point. However, fixing this just shifts the squeezed area from the front of the knot vector to the end, so there's a scaling issue in there somewhere, too.

I suspect that the numerator of d should be not num_dpts but num_dpts - 1. A set of 11 numbers has only an interval of 10 to distribute. The name of the 11th element is not params[11] but params[10]. Think of it as working in list/array addressing space.

Together, these two changes generate a nice, symmetrical knot vector from the nice, symmetrical params:

num_dpts = 11
num_cpts = 10
degree = 3
params = np.linspace(0, 1, 11)

d = (num_dpts - 1) / (num_cpts - degree)

def knot(d, j):
    i = int(j * d)
    alpha = (j * d) - i
    return (1 - alpha) * params[i] + alpha * params[i + 1]

[knot(d, j) for j in range(1, num_cpts - degree)]

[0.14285714285714288,
 0.28571428571428575,
 0.42857142857142855,
 0.5714285714285715,
 0.7142857142857144,
 0.8571428571428572]

I will do some additional tests tomorrow. If they look reasonable, would you be interested in a PR for this?

orbingol commented 3 years ago

The approximation algorithm in the NURBS Book requires the number of control points to be generated. My personal approach was to make it simple by setting the number of control points = data points - 1. It doesn't have to be correct all the time.

Having said that, setting ctrlpts_size keyword argument to 9 seems to fix the issue.

curve = fit.approximate_curve(points, degree=3, ctrlpts_size=9)
curve.vis = vis.VisCurve3D()
curve.render()

GarthSnyder commented 3 years ago

Having said that, setting ctrlpts_size keyword argument to 9 seems to fix the issue.

Sure, you can disguise the distortion by lowering the number of control points. That widens the knot spans and reduces the ringing introduced by the polynomial fit.

However, my point isn't that the distortion is visually displeasing; it's that the curve fit itself is incorrect. And the reason it's incorrect is that the knot vector doesn't actually conform to The NURBS Book p412.

Here's a more direct way to think about this: NURBS are completely symmetric, in the sense that reversing the parametric direction is a trivial operation. Right? Neither the head nor the tail of the curve is privileged in any way. Similarly, the curve fit operation should be absolutely symmetric. If you fit a set of points and then fit the points in reversed order, you should get an identical (but reversed) spline. Right?

But you don't:

import numpy as np
import geomdl
import geomdl.fitting as fit
import geomdl.visualization.VisMPL as vis
import math

points2D = [[x, math.exp(x) - 2 * math.sin(x) * math.cos(x)] for x in np.linspace(0, 1, 6)]

curve = fit.approximate_curve(points2D, degree=3)
revCurve = fit.approximate_curve(list(reversed(points2D)), degree=3)

# Sorry, wasn't sure how to do an overlay here, just forced it
class MyVisCurve2D(vis.VisCurve2D):
    def clear(self):
        pass

vc = vis.VisConfig(figure_size=(6.0, 8.0), axes_equal=True)
v = MyVisCurve2D(vc)
curve.vis = v
revCurve.vis = v
curve.render()
revCurve.render()

CurveOverlay

I believe the reason for this is that the knot vectors are not inverses of each other.

In some sense, the knot vector is an input to the fitting process. The least-squares calculation is performed relative to the given knot vector, and there isn't a single "right" knot vector. (Or at least, it takes quite a bit of computation to discover the optimal vector.)

Nevertheless, as p412 states, there needs to be at least one data point introduced in each knot span to ensure that the least-squares problem is solvable and numerically stable. The current code doesn't guarantee this.

I believe this is actually the underlying reason for the zero-division error reported in #119. There is a zero element along the diagonal of the upper-triangular matrix because the input matrix from which it's derived does not have full rank. If you look at the knot vector, you can see why:

> # points are as defined in issue #119 
> params = fit.compute_params_curve(points)
> params[:5]
[0.0,
 0.016836418434463478,
 0.03294468431943961,
 0.04836032059528571,
 0.0631164214829355]

> kv = fit.compute_knot_vector2(5, len(params), len(params) - 1, params)
> kv[:10]
[0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0012319330561802555,
 0.019193725637142913,
 0.03632860447755217,
 0.05267917939167102]

There is no data point introduced between knots 0.0 and 0.00123. [EDIT: I originally wrote "between parameters..."; not what I meant.]

Fixing the code as suggested above (that is, setting d = float(num_dpts - 1) / float(num_cpts - degree) and setting temp_kv = ((1.0 - alpha) * params[i]) + (alpha * params[i+1])) fixes the fitting problem in the original report and also seems to fix #119.

These are just two garden-variety off-by-one errors.

orbingol commented 3 years ago

For the symmetric example, setting ctrlpts_size seems to fix the issue. I haven't checked the knot vector directly.

import numpy as np
import geomdl
import geomdl.fitting as fit
import geomdl.visualization.VisMPL as vis
import math

points2D = [[x, math.exp(x) - 2 * math.sin(x) * math.cos(x)] for x in np.linspace(0, 1, 6)]

curve = fit.approximate_curve(points2D, degree=3, ctrlpts_size=len(points2D) - 2)
revCurve = fit.approximate_curve(list(reversed(points2D)), degree=3, ctrlpts_size=len(points2D) - 2)

class MyVisCurve2D(vis.VisCurve2D):
    def clear(self):
        pass

vc = vis.VisConfig(figure_size=(6.0, 8.0), axes_equal=True)
v = MyVisCurve2D(vc)
curve.vis = v
revCurve.vis = v
curve.render()
revCurve.render()

Figure_1

I am not sure about the problem related to the knot vector. As you stated, the knot vector is also an input to the approximation. Therefore, it might be a good idea to let the user use any knot vector.

orbingol commented 3 years ago

Ah! When the number of control points is degree - 1, then it is a Bezier curve. My mistake...

The problem reappears when I increase the number of data points.

points2D = [[x, math.exp(x) - 2 * math.sin(x) * math.cos(x)] for x in np.linspace(0, 1, 8)]

Setting centripetal=True also makes some difference for this case.

GarthSnyder commented 3 years ago

Submitted #121 if you would like to take a closer look at this. It includes fairly extensive tests. The existing code fails the symmetry tests, but if those tests are removed, it also fails the interleave tests in some circumstances. I also included the scenario from #119 as a separate test.

orbingol / NURBS-Python

Knot vectors synthesized for curve fitting are biased toward zero #120