Open MitchellAcoustics opened 3 years ago
Hi @MitchellAcoustics, thanks for checking out the repo. The x and y values are interpolated and normalized before they are transformed, so it should handle the issue of unevenly spaced x value input. Does this example help answer any questions?
import numpy as np
import matplotlib.pyplot as plt
from kneed import KneeLocator
n = 50
x = np.geomspace(1, 256, num=n)
y = np.linspace(0, 1, num=n)
plt.scatter(x, y)
Kneed seems to find a reasonable knee point.
kneed = KneeLocator(x, y, curve='concave', direction='increasing')
print(kneed.knee)
46.883891195871875
kneed.plot_knee()
Hi Thanks for this great algorithm. However I am struggling to get it work when x becomes log(x). For example for your same above example, if I did below, the found knee point is at end point which is not correct.
import numpy as np import matplotlib.pyplot as plt
from kneed import KneeLocator
n = 50 x = np.linspace(1, 256, num=n) x1=np.log(x) y = np.linspace(0, 1, num=n)
kneed = KneeLocator(x1, y, curve='convex', direction='increasing')
Hi
I keep playing with interpolation method, polynomial degree and sensitivity parameters and I found the results change a lot by changing these parameters. and the best result I got is still not optimal. Now my question is how do we select the parameters automatically? I am using this algorithm to detect elbow point automatically for any curve. Could you please help ? Thank you so much.
import numpy as np import matplotlib.pyplot as plt
from kneed import KneeLocator
n = 50
x = np.linspace(1, 256, num=n) x1=np.log(x) y = np.linspace(0, 1, num=n) kneed = KneeLocator(x1, y, S=1,curve='convex', direction='increasing',online=True,interp_method="polynomial",polynomial_degree=3)
print(kneed.knee) kneed.plot_knee_normalized() kneed.plot_knee()
If I may chime in, I think the issue is that only the y value is transformed. In knee_locator.py, there is a transform_y function that handles the entire transform. I think this could be fixed by adding a transform_x function that handles an x transformation.
def transform_x(x: Iterable[float], direction: str, curve: str) -> float:
"""transform x to concave, increasing based on given direction and curve"""
# convert elbows to knees
if direction == "decreasing" and curve == "concave":
x = x.max() - x
elif direction == "increasing" and curve == "convex":
x = x.max() - x
return x
def transform_y(y: Iterable[float], direction: str, curve: str) -> float:
"""transform y to concave, increasing based on given direction and curve"""
# convert elbows to knees
if curve == "convex":
y = y.max() - y
return y
Then changing this part of the code
# Step 2: normalize values
self.x_normalized = self.__normalize(self.x)
self.y_normalized = self.__normalize(self.Ds_y)
# Step 3: Calculate the Difference curve
self.y_normalized = self.transform_y(
self.y_normalized, self.direction, self.curve
)
self.x_normalized = self.transform_x(
self.x_normalized, self.direction, self.curve
)
# normalized difference curve
https://github.com/arvkevi/kneed/blob/174225bd407ebc661a6d583c68b2a40aa4070a84/kneed/knee_locator.py#L221
The transformation done to turn the curve into concave/increasing from a concave/decreasing curve doesn't work when you have unevenly spaced x values. When the np.flip(y) function is called it will match up the y values with the x values according to index, but not according to the actual original values. This means, if you have x values which get increasingly close together in the original curve, the transformed curve will actually become a convex/increasing curve.
Unfortunately I can't include a reproducible example right now, but hopefully this will demonstrate what I mean.
As you can see, in the original curve, the x values start quite far apart, then get increasingly close together. The y_transform flips the y values, but does not preserve their proper spacing relative to the curve so the shape doesn't change as expected. We can also see that the original curve would have a knee at around 1.5, but the transformed curve would have an elbow at around 3.85.
I don't really know of a way to fix this, but I find it a bit strange to transform the curve this way. I haven't dug through all of the source code, but if I understand the original paper correctly, the difference curve is calculated according to the difference from a straight line. Why not just change what straight line is being compared against for the difference calculation for each of the four convex/concave and increasing/decreasing combinations?
In the case of this example (i.e. concave/decreasing), the normalized straight line would be y = -x + 1 and it looks like that would result in a reasonable difference curve:
With that difference curve, it then appears that the knee could be correctly identified at around 0.5 (normalized x values). Then, for an increasing curve, the line to compare against is y = x, and for the convex curves, we take the absolute value of the difference.