meerk40t / svgelements

SVG Parsing for Elements, Paths, and other SVG Objects.
MIT License
132 stars 29 forks source link

Bug: Line does not produce correct path string with .d() #130

Closed Esser50K closed 1 year ago

Esser50K commented 2 years ago

The Line produces the wrong path string.

l = Line(start=complex(9,0), end=(5,4))
l.d() -> "L 5,4"

The svgpathtools lib would produce -> M 9,0 L 5,4

I think that is probably correct since you need the start position to do anything

tatarize commented 2 years ago

Svgpathtools is uses implied move values. Svgelements does not. The value for the line there isn't in error since it's fundamentally a path fragment. Paths start with Move. If they don't then they don't have a real meaning. svgelements doesn't give the path as M9,0 L5,4 because that's specifically means Path(Move(end=Point(9,0)), Line(start=Point(9,0), end=Point(5,4))) The path you specified there doesn't have a Move which in svgelements and svg.path is an explicit segment type so it is not included. This is Move(complex(0,9)) + "L5,4" or Path("M0,9L5,4") regardless that the Line knows it's starting location that does not imply that it was a move to get there.

>>> c = (Move(complex(0,9)) + Line(end=Point(5,4)))
>>> c.validate_connections()
>>> Path(Move(end=Point(0,9)), Line(start=Point(0,9), end=Point(5,4)))
>>> c
Path(Move(end=Point(0,9)), Line(start=Point(0,9), end=Point(5,4)))

Would also work since you'd stick those fragments together and fix any connection problems they have. Though for a lot of fragments you can simply stick them together with +.

Since paths are required to begin with moves, path fragments are a little bit strange. But, it's actually a design choice to not claim there's a Move() in the path when there is not a move in the path.

(Move() + Path(Line(start=complex(9,0), end=(5,4)))).d() would very much give you what you're looking for there. svgpathtools is also based on svg.path versions 2+ code rather than svg.path version 4+ code which was generally before Move() was introduced. Basically without the segment type you can't do compound paths which is a deficit in svgpathtools.

tatarize commented 2 years ago

You do sort of need a start position to do things but you strictly speaking don't have the start position. But, you can still do more things with that segment than pretend its a full and correct path (Path("M0,0L20,20") + Line(start=complex(9,0), end=(5,4))) == Path(Move(end=Point(0,0)), Line(start=Point(0,0), end=Point(20,20)), Line(start=Point(20,20), end=Point(5,4)))

svgelements simply doesn't claim it has a Move() at the start of that path since it doesn't. You can append one and then it will give you the results you're expecting. But, it doesn't assume one. Since within svgelements a Move() is a real object rather than an implied object.

In svgpathtools: Path("M0,0ZM1,1Z") isn't a coherent path while it is certainly permitted by the SVG spec (1.1, 2.0). Svgpathtools has no Move() segment or Close() segment so complex paths with closed and unclosed portions basically have no meaning. It's intended to be much more specific and accurate. So yeah, it doesn't have a Move there because it doesn't have a move there. You can use the .as_subpath() values if you want the subpaths and feed them into the svgpathtools.Path() value and do the intersections or whatever you're mixing the code to accomplish there.

Esser50K commented 2 years ago

OK I understad. Sorry for posting all these issues, it seems like I'm trying to get something done with as little knowledge about svg as I can :P

Can you clarify how I could convert from svgelements.Path to svgpathtools.Path?

tatarize commented 2 years ago

It's absolutely fine. Getting things done without reading the spec multiple times (I have) should be considered a virtue.

svgpathtools.Path(svgelements_path.d()) seems like it would work. rather effectively. While there are slight differences between them svgpathtools parses most valid svg paths correctly. And if you wanted to convert between them using the path_d info is going to be the easiest method.

If you somehow have some compound paths in svgelements you would want to do: for p in path.as_subpath() -- and do each subpath individually. svgpathtools handles simple paths quite well. So getting the subpath.d() value and using that to initialize a svgpathtools.Path would be fine. Just always use the namespace and it would work perfectly fine.

tatarize commented 2 years ago

@Esser50K you may actually just be correct here and given #137 this is under review for a breaking change to change the path into inventing movetos where they would be needed for structurally valid Paths.

l = Line(start=complex(9,0), end=(5,4))
l.d() -> "L 5,4"

Would still work in that manner, but:

l = Path(Line(start=complex(9,0), end=(5,4)))
l.d() -> "M9,0L 5,4"

Would happen if we put that line into a Path(). The single Line pathsegment would still give it's solo value, but as a Path would actually make the moveto explicit. Still requires some thoughtwork and review but, it seems correct. This is mostly because fractional subpaths are actually a thing M9,0ZL5,4 should break into 2 subpaths of M9,0Z and L5,4 which is a problem for the L5,4 which is actually M9,0L5,4 with an implied move.

Sophist-UK commented 2 years ago

I am unsure here. It isn't helped by confusion between Line which is a PathSegment and SimpleLine which is a Shape.

I think that the issue here is that PathSegments should NOT have start points. They only have start points as a programming "convenience" (though personally I am not sure it is really a convenience given the amount of code needed to confirm that the start point of one PathSegment is the same as the end point of the previous one).

Sophist-UK commented 2 years ago

Consider the following piece of (valid) Python code:

l = Line(end=(10,10)
p1 = Path((Move("0,0"), l))
p2 = Path((Move("10,10"), l))

From a python perspective, I have quite reasonably used the same Line object as a sub-object in two Path objects, but because Path aligns the start points, something horrible happens:

>>> from svgelements import Move, Line, Path
>>> l = Line(end=(20,20))
>>> l.d()
'L 20,20'
>>> repr(l)
'Line(end=Point(20,20))'
>>> p1 = Path((Move(end=(0,0)),l))
>>> p1.d()
'M 0,0 L 20,20'
>>> repr(p1)
'Path(Move(end=Point(0,0)), Line(start=Point(0,0), end=Point(20,20)))'
>>> p2 = Path((Move(end=(10,10)),l))
>>> p2.d()
'M 10,10 L 20,20'
>>> repr(p2)
'Path(Move(end=Point(10,10)), Line(start=Point(10,10), end=Point(20,20)))'
>>> p1.d()
'M 0,0 L 20,20'
>>> repr(p1)
'Path(Move(end=Point(0,0)), Line(start=Point(10,10), end=Point(20,20)))'
tatarize commented 2 years ago

Hm. While that's certainly true, it's not exactly a fixable bug, rather it's a side effect of object manipulation and that sort of thing will always happen in most any program when you do things like that.

Sophist-UK commented 2 years ago

Only fixable by taking a copy of objects before adding them to the path. And not sure whether that is worthwhile.

I was only using this to illustrate that SVG PathSegments should not really have a Start point - but should take their start point every time from the previous segment. That does make coding more difficult, and it wouldn't be backwards compatible.

tatarize commented 2 years ago

There are clearly some instances of this stuff existing in the wild where the path goes from endpoint of previous link in the chain to the current one and falls into a proper path. The bigger issue is that the individual elements are meaningless without being in a path. In reality this decision was already made circa svg.path.

I'd maybe consider it as part of svgio there's a lot parts of this stuff to be considered and that are manifestly useful. The lack of validation for the paths (if they lack start points the end point and start point do not need to match) the lack of duplicated memory. The start and end points are made to equal each other but it doesn't reuse the same point because then matrix multiplication fails. Switching to a start-less segment sequence you could certainly avoid some of these pitfalls altogether.

But it's far too late and too breaky to ever consider those changes here.

tatarize commented 2 years ago

This wouldn't fix the start and end point thing though. You could insert the same segment in twice and it would still have that side effect thing if you did an affine transform. It's not as much a bug as a consequence of reused memory in a language.

Sophist-UK commented 2 years ago

I disagree. It is NOT reused memory - it is a deliberate consequence of objects. It is not reuse of memory - it is a link to the exact same object. And the fix is to copy the object when you add it to a path so that the two paths point to different objects rather than the same object.

But it's far too late and too breaky to ever consider those changes here.

Too late and breaky to remove the use of start= from svgelements. But copying the object when it is added to the Path should not be a breaking change.

So it is a bug, though as I said before unless someone complains about it in the wild probably not one worth fixing.

There are clearly some instances of this stuff existing in the wild

So we probably ought to fix it.

tatarize commented 2 years ago

Making pointless copies of things aren't needed. If they wanted to avoid that they should have added in a fresh copy or something. It's certainly a do not fix. This code is intended for coders not general end users. If you do weird things you'll get weird results. There would be noteworthy side effects changing that since the particular line segments would no longer be the same line segment. if path[1] is line: would start failing ect.

If any change is made it should be to the readme to say don't reuse the same pathsegment in a path more than once, and certainly not preventing that from happening.

Sophist-UK commented 2 years ago

I did say it was probably not worth fixing until you pointed out that these were problems cropping up in the wild.

Copying is unlikely to lead to more memory usage as in most cases the original object will have zero references after a copy is added to the path and be garbage collected.

But yes - one consequence would be if path[1] is line: would not work and you would have to use if path[1] == line: instead.

Sophist-UK commented 2 years ago
  1. If Line didn't have a start attribute then you wouldn't need to take a copy and you could use if path[1] is line: or if path[1] == line:. The downside is that you would have to refer to the previous PathSegment's end attribute to work out what the start point of the current PathSegment which will be a coding PITA.
  2. Otherwise, if you have a start attribute then you either take a copy when you add it to a Path (in which case only == will work) or you accept that if the PathSegment is reused, then if the caller ever wants to use the start attribute then it will be wrong some of the time.
tatarize commented 2 years ago

This is a judgement call and I can see doing other things. It would mean pathsegments would have little functionality outside of paths, which might be preferable, but you couldn't do things like know the length of a line, since the length of the line is not just its end point but also which point occurs before it in the path. This is a reasonable enough way to do this but it isn't the way svgpathelements and svg.path did things so this isn't the way this code did things, and it probably wouldn't be the way svgio would do things either.

But, the OP may be correct if we take a less fundamental shift that is more exclusive to svgelements and simply validate paths with moves at the start of them if there is a source location. It seems like there would be some edge cases but these could be mapped out and corrected for in consistent ways.

Sophist-UK commented 2 years ago

Also, we can probably get around the whole is vs. == as follows:

  1. PathSegment maintains a weak-reference list of path objects they are part of.
  2. When you refer to start, then if there is exactly one non-None object ref() in the weak-reference list, then we can determine the start attribute as the end attribute of path[path.index(ref()) - 1]. If there are two non-None objects in the weak-reference list, then the start point is indeterminate because the PathSegment is used multiple times and the start point cannot be determined without knowing the path you are referring to, and you raise an exception if you try to get the start attribute .

If we do this we can maintain backwards compatibility in use cases that do not currently break anyway.

We could also provide an alternative start_for_path(path) method which gets the start point when you explicitly provide the path.

Not sure whether this is worth the effort though.

tatarize commented 1 year ago

Original point was settled.