fhirschmann / rdp

Python/Numpy implementation of the Ramer-Douglas-Peucker algorithm
https://pypi.python.org/pypi/rdp
MIT License
244 stars 117 forks source link

Recursion bug #5

Closed FlorianWilhelm closed 7 years ago

FlorianWilhelm commented 8 years ago

Although I love recursion and Ramer-Douglas-Peucker is naturally defined as a recursion, this is quite inefficient in Python and leads often to errors like RecursionError: maximum recursion depth exceeded in comparison. In my case I used a GPS track from a GPS sport-watch which has about 12072 points. Python's default recursion depth is 1000 and it can be set with sys.setrecursionlimit but this should never be used actually since you are shifting the problem just a bit away. Luckily, RDP can also be defined iteratively as shown here. Since your RDP implementation seems to be the only one on PyPI it would be nice to reformulate it as an iterative algorithm.

sebacastroh commented 8 years ago

I had the same problem, so I translated the code in the previous link to Python. You are free to use it.

from math import sqrt
import numpy as np

def _DouglasPeucker(points, startIndex, lastIndex, epsilon):
    stk = []
    stk.append([startIndex, lastIndex])
    globalStartIndex = startIndex
    bitArray = np.ones(lastIndex-startIndex+1, dtype=bool)

    while len(stk) > 0:
        startIndex = stk[-1][0]
        lastIndex = stk[-1][1]
        stk.pop()

        dmax = 0.
        index = startIndex

        for i in xrange(index+1, lastIndex):
            if bitArray[i - globalStartIndex]:
                d = PointLineDistance(points[i], points[startIndex], points[lastIndex])
                if d > dmax:
                    index = i
                    dmax = d
        if dmax > epsilon:
            stk.append([startIndex, index])
            stk.append([index, lastIndex])
        else:
            for i in xrange(startIndex + 1, lastIndex):
                bitArray[i - globalStartIndex] = False
    return bitArray

def DouglasPeucker(points, epsilon):
    bitArray = _DouglasPeucker(points, 0, len(points)-1, epsilon)
    resList = []
    for i in xrange(len(points)):
        if bitArray[i]:
            resList.append(points[i])
    return np.array(resList)

def PointLineDistance(point, start, end):
    if np.all(np.equal(start, end)) :
        return np.linalg.norm(point, start)
    n = abs((end[0] - start[0]) * (start[1] - point[1]) - (start[0] - point[0]) * (end[1] - start[1]))
    d = sqrt((end[0] - start[0]) * (end[0] - start[0]) + (end[1] - start[1]) * (end[1] - start[1]))
    return n/d
FlorianWilhelm commented 8 years ago

@sebacastroh Wow, thanks, I just wanted to start with it but it's definitely better this way 👍 @fhirschmann Do you need help with integrating that code? Since your package seems to be the most popular Python package for RDP I guess it's good to include it. Maybe even someone wants to add a Cython version based on that code later on.

fhirschmann commented 8 years ago

Sorry for the long delay!

I agree that without tail call optimization in Python, an iterative version is better suited.

I will start integrating the code of @sebacastroh and also quickcheck it against the recursive version.

@Namek, are you fine with integrating the above code which is based on your code to the Python rdp library?

Namek commented 8 years ago

@Namek, are you fine with integrating the above code which is based on your code to the Python rdp library?

@fhirschmann Sure! It's open source and has no specific license. If it's properly done and works for you then just take it.

fhirschmann commented 7 years ago

Iterative version added in v0.7. Thanks!