mathandy / svgpathtools

A collection of tools for manipulating and analyzing SVG Path objects and Bezier curves.
MIT License
532 stars 134 forks

Path parsing floating point issue #206

Open BenVosper opened 1 year ago

BenVosper commented 1 year ago

Hi @mathandy. Thanks for your work on this library! We use it for all kinds of stuff.

Just wondering if you had any insight on this issue we're experiencing parsing some very simple path elements.

Parsing this path works fine:

from svgpathtools import parse_path

good = parse_path("m1400.52,3363.38H484.64v-259.71h915.88v259.71Z")

# Path(Line(start=(1400.52 + 3363.38j), end=(484.64 + 3363.38j)),
#      Line(start=(484.64 + 3363.38j), end=(484.64 + 3103.67j)),
#      Line(start=(484.64 + 3103.67j), end=(1400.52 + 3103.67j)),
#      Line(start=(1400.52 + 3103.67j), end=(1400.52 + 3363.38j)))

We correctly detect the four line segments with four unique points total representing a closed rectangle.

But this very similar path gives us a different result:

bad = parse_path("m1400.52,3408.97H484.64v-640.45h915.88v640.45Z")

# Path(Line(start=(1400.52 + 3408.97j), end=(484.64 + 3408.97j)),
#      Line(start=(484.64 + 3408.97j), end=(484.64 + 2768.5199999999995j)),
#      Line(start=(484.64 + 2768.5199999999995j), end=(1400.52 + 2768.5199999999995j)),
#      Line(start=(1400.52 + 2768.5199999999995j), end=(1400.52 + 3408.9699999999993j)),
#      Line(start=(1400.52 + 3408.9699999999993j), end=(1400.52 + 3408.97j)))

You can see here that at some point in the parsing we pick up a floating point error which results in the end of the fourth line not matching the starting point. We then seem to get a fifth additional line correcting the difference and closing the path due to the final Z command.
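The drift can be reproduced without svgpathtools at all. The following is a minimal sketch, assuming the parser simply adds each relative offset to the current position; the variable names are illustrative and not taken from the library.

```python
import math

# Y coordinates from the two paths above: start, then "v-<height>", then "v<height>".
good_y = 3363.38 - 259.71 + 259.71
bad_y = 3408.97 - 640.45 + 640.45

# The first sum lands back on the starting value; the second does not.
good_round_trips = good_y == 3363.38
bad_round_trips = bad_y == 3408.97

# The mismatch is tiny, so a tolerance-based comparison still treats it as closed.
drift = abs(bad_y - 3408.97)
close_enough = math.isclose(bad_y, 3408.97)

print(good_round_trips, bad_round_trips, drift, close_enough)
```

Whether a particular pair of numbers round-trips depends on how each value happens to round in binary, which is why two nearly identical paths can behave differently.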

Below is an example SVG showing that both paths render fine and are detected by tools like Inkscape as having only four points. The red path is bad from above and the green one is good.

Do you know where this discrepancy might be coming from? I wonder if it is potentially related to https://github.com/mathandy/svgpathtools/issues/198. But I've tried your latest commit which includes your fix and the result is the same.

For reference, this is running on Python 3.9.7 on Ubuntu.

mathandy commented 1 year ago

My guess is this comes from floating point error that gets picked up when figuring out the absolute coordinates of the points.
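One way to check that the error comes from the accumulation step rather than from the numbers in the path string is to redo the sums in exact decimal arithmetic. This is only an illustration using Python's standard decimal module; accumulate is a hypothetical helper, and svgpathtools itself works with floats.

```python
from decimal import Decimal

def accumulate(start, offsets):
    """Add relative offsets (given as strings) to a starting coordinate exactly."""
    position = Decimal(start)
    positions = [position]
    for offset in offsets:
        position += Decimal(offset)
        positions.append(position)
    return positions

# Same y sequence as the "bad" path: start, v-640.45, v640.45.
ys = accumulate("3408.97", ["-640.45", "640.45"])

# In decimal arithmetic the path returns exactly to its starting y value.
returns_to_start = ys[-1] == ys[0]
print(ys, returns_to_start)
```

Converting to float only after the absolute position is known avoids the drift here, at the cost of slower parsing.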

One workaround might be to remove the z from the end and do some hacky logic like:

from svgpathtools import parse_path, Line
import numpy as np

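def parse_path_hack(d_string):
    # Strip a trailing Z/z so the parser does not add its own closing segment.
    d_string = d_string.strip()
    ends_with_z = d_string.lower().endswith('z')
    _d_string = d_string[:-1] if ends_with_z else d_string
    path = parse_path(_d_string)
    if ends_with_z:
        path._closed = True
        if np.isclose(path[0].start, path[-1].end):
            # End is within float tolerance of the start: snap it shut.
            path[-1].end = path[0].start
        else:
            # Genuine gap: close the path with an explicit line.
            path._segments.append(Line(path[-1].end, path[0].start))
    return path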

I think I've seen this issue before, and there must be some problem with this type of logic, as I never fixed it. For your purposes, where all lines are vertical or horizontal, this should be easy to detect and fix (not just at the end but for all segments, to make sure they stay horizontal and vertical).
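A rough sketch of that kind of fix for axis-aligned paths, working on plain complex vertices rather than on svgpathtools objects. The helpers snap_coords and nonzero_edges and the tolerance value are assumptions made for illustration; they are not part of the library.

```python
def snap_coords(vertices, tol=1e-6):
    """Snap each x and y to the first earlier value within `tol` of it."""
    seen_x, seen_y = [], []

    def snap(value, seen):
        for earlier in seen:
            if abs(earlier - value) <= tol:
                return earlier
        seen.append(value)
        return value

    return [complex(snap(v.real, seen_x), snap(v.imag, seen_y)) for v in vertices]

def nonzero_edges(vertices):
    """Pair up consecutive vertices, dropping zero-length edges."""
    return [(a, b) for a, b in zip(vertices, vertices[1:]) if a != b]

# Vertices of the "bad" path, computed the way the drift arises.
bad_vertices = [
    complex(1400.52, 3408.97),
    complex(484.64, 3408.97),
    complex(484.64, 3408.97 - 640.45),
    complex(1400.52, 3408.97 - 640.45),
    complex(1400.52, 3408.97 - 640.45 + 640.45),
    complex(1400.52, 3408.97),  # closing point added by Z
]

snapped = snap_coords(bad_vertices)
edges = nonzero_edges(snapped)
print(len(nonzero_edges(bad_vertices)), len(edges))
```

Snapping x and y separately keeps horizontal and vertical edges exactly horizontal and vertical, and the tiny closing edge disappears because its endpoints become identical.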

I hope that helps. I liked your website by the way. Really cool projects.

BenVosper commented 1 year ago

@mathandy Thanks for looking into it. Your solution makes sense.

In our case we have arbitrary paths coming in (with any number of different commands and points), but we're really just looking to test "is this path rectangular?" to a certain tolerance. We don't really care how many points there are.

So I'm wondering if this would be better for our purposes:

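import numpy as np

def path_is_rectangular(path, tolerance=0.1):
    # Path.area() is signed (negative for clockwise paths), so compare magnitudes.
    area = abs(path.area())
    xmin, xmax, ymin, ymax = path.bbox()
    bbox_area = abs((xmax - xmin) * (ymax - ymin))
    return np.isclose(area, bbox_area, atol=tolerance)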

It seems to be working for all of our test cases. Just not sure if we're asking for trouble by hitting all the more complex bits of your path code (working out the area etc.). But maybe it's more resilient for us not to be worrying about the actual points. What do you think?

And thanks! I appreciate it :+1:

EDIT: I guess that's not a general solution, since you could have rectangular paths that are "rotated" with respect to their bounding box. Back to the old way I suppose!
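For what it's worth, a point-based check can be made independent of rotation and of the number of points. The following is a sketch only, assuming the path has already been reduced to a list of complex vertices; is_rectangle and its tol parameter are hypothetical names, not svgpathtools API.

```python
import cmath

def is_rectangle(vertices, tol=1e-6):
    """Return True if the closed polygon through `vertices` is a rectangle.

    `vertices` are complex numbers (x + y*1j) in drawing order. `tol` is used
    both as a distance for merging near-duplicate points and as a relative
    tolerance on the angle checks.
    """
    # Merge consecutive near-duplicates, including a repeated closing vertex.
    pts = []
    for v in vertices:
        if not pts or abs(v - pts[-1]) > tol:
            pts.append(v)
    if len(pts) > 1 and abs(pts[0] - pts[-1]) <= tol:
        pts.pop()

    # Keep only real corners: skip points lying on a straight run.
    corners = []
    n = len(pts)
    for i in range(n):
        e_in = pts[i] - pts[i - 1]
        e_out = pts[(i + 1) % n] - pts[i]
        cross = (e_in.conjugate() * e_out).imag
        if abs(cross) > tol * abs(e_in) * abs(e_out):
            corners.append(pts[i])
    if len(corners) != 4:
        return False

    # Four corners, each with a right angle (zero dot product), is a rectangle.
    for i in range(4):
        e_in = corners[i] - corners[i - 1]
        e_out = corners[(i + 1) % 4] - corners[i]
        dot = (e_in.conjugate() * e_out).real
        if abs(dot) > tol * abs(e_in) * abs(e_out):
            return False
    return True

square = [0j, 4 + 0j, 4 + 2j, 2j]
rotated = [p * cmath.exp(0.3j) for p in square]   # same shape, turned by 0.3 rad
skewed = [0j, 4 + 0j, 5 + 2j, 1 + 2j]             # parallelogram, not a rectangle
print(is_rectangle(rotated), is_rectangle(skewed))
```

For a path made only of Line segments, the vertex list could be built with something like [seg.start for seg in path]; curved segments would need separate handling.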