shapely / shapely

Manipulation and analysis of geometric objects
https://shapely.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License
3.9k stars 566 forks source link

First half of result from `split()` contains a duplicate splitter point #841

Closed rogeryen closed 2 years ago

rogeryen commented 4 years ago

Expected behavior and actual behavior.

Expected (result from 1.6.4.post2)

[(-94.47979092273567, 39.0521128083418), (-94.47984161377944, 39.05210062401601), (-94.4798583198375, 39.05210970887984), (-94.4798441824597, 39.05211270208803), (-94.47982657282334, 39.05211643045326), (-94.47981732439999, 39.0521183885573), (-94.47980377570408, 39.05212125645163), (-94.47979778892203, 39.05211732140793)]

Actual (result from 1.7.0)

[(-94.47979092273567, 39.0521128083418), (-94.47984161377944, 39.05210062401601), (-94.4798583198375, 39.05210970887984), (-94.4798441824597, 39.05211270208803), (-94.47982657282334, 39.05211643045326), (-94.47981732439999, 39.0521183885573), (-94.47980377570408, 39.05212125645163), (-94.47979778892203, 39.05211732140793), (-94.47979778892203, 39.05211732140793)]

You can see the end point (the splitter) is duplicated.

Looking at the source I am guessing there's either some small difference between how the distance of each segment is calculated vs project(), or that with the **0.5 there is some precision loss happening when there is a larger number of decimal places involved. Wonder if on top of comparing the accumulation of each segment distance, it would be acceptable to also do an equality check of the splitter coords against each point on the line?

Steps to reproduce the problem.

Run the following script

from shapely import ops
from shapely.geometry import LineString, Point

ls = LineString([[-94.47979092273567, 39.0521128083418], [-94.47984161377944, 39.05210062401601], [-94.4798583198375, 39.05210970887984], [-94.4798441824597, 39.05211270208803], [-94.47982657282334, 39.05211643045326], [-94.47981732439999, 39.0521183885573], [-94.47980377570408, 39.05212125645163], [-94.47979778892203, 39.05211732140793], [-94.47979092273567, 39.0521128083418]])

# pt is in ls
pt = Point([-94.47979778892203, 39.05211732140793])
lines = ops.split(ls, pt)
llines = list(lines)

print(list(lines[0].coords))
print(list(lines[1].coords))

Operating system

MacOS 10.15.2

Shapely version and provenance

Python 3.7.6 with Shapely 1.7.0 installed from pip

sgillies commented 4 years ago

@rogeryen we did change the logic of splitting a line by a point between 1.6.4.post2 and 1.7.0 (see https://github.com/Toblerity/Shapely/pull/624) so it is possible that we introduced a new bug. Thanks for the report. We'll look into it. Any help you can provide would be greatly appreciated.

snorfalorpagus commented 4 years ago

I can't reproduce this using Shapely 1.7.0 from PyPI on Windows.

print(list(lines[0].coords))
print(list(lines[1].coords))

assert list(lines[0].coords) == list(ls.coords[:8])
assert list(lines[1].coords) == list(ls.coords[-2:])
[(-94.47979092273567, 39.0521128083418), (-94.47984161377944, 39.05210062401601), (-94.4798583198375, 39.05210970887984), (-94.4798441824597, 39.05211270208803), (-94.47982657282334, 39.05211643045326), (-94.47981732439999, 39.0521183885573), (-94.47980377570408, 39.05212125645163), (-94.47979778892203, 39.05211732140793)]
[(-94.47979778892203, 39.05211732140793), (-94.47979092273567, 39.0521128083418)]
sgillies commented 4 years ago

Thanks @snorfalorpagus. I can't reproduce it with a 1.7.0 manylinux1 wheel on my thinkpad, either. But I can reproduce it on my macbook with a 1.7.0 macosx wheel. My hypothesis is we're arriving at https://github.com/Toblerity/Shapely/blob/master/shapely/ops.py#L408 due to an equality test at line 399 that fails due to platform specific numerical precision issues.

The good news is that from a geometry point of view the duplicate vertex doesn't change the length of the line or its other properties.

rcriii42 commented 2 years ago

I am seeing this same error pop up about 1 in 1000 runs of this code:

from itertools import chain
import math

from shapely.geometry import Point, LineString
from shapely.ops import nearest_points, split

...

    cl_pos = CL_shape.interpolate(sta - stations[0])
    # Split the line at cl_pos
    all_points_coords = chain(CL_shape.coords, cl_pos.coords)   # https://stackoverflow.com/a/56424736
    all_points = map(Point, all_points_coords)
    new_line = LineString(sorted(all_points, key=CL_shape.project))
    # Find the segment that ends at cl_pos
    L1, L2 = split(new_line, cl_pos)
    p1 = L1.coords[-2]
    p2 = L1.coords[-1]  # This is the same as cl_pos
    # Equation for the line perpendicular to the CL at cl_pos
    # https://stackoverflow.com/a/1811556
    try:
        m = (p1[1] - p2[1])/(p1[0] - p2[0])
    except ZeroDivisionError:
        print(f'station: {sta:0.1f} cl_pos: {cl_pos}')
        print(f'{new_line}')
        print(f'L1: {L1}')
        print(f'p1: {p1}  p2: {p2}')
        raise

Traceback:

station: 11324.2 cl_pos: POINT (3240280.062861705 13792447.723857)
LINESTRING (3231527.233 13791975.57, 3240280.062861705 13792447.723857, 3250940.319 13793022.77, 3252992.448 13793779.8)
L1: LINESTRING (3231527.233 13791975.57, 3240280.062861705 13792447.723857, 3240280.062861705 13792447.723857)
p1: (3240280.0628617047, 13792447.723857)  p2: (3240280.0628617047, 13792447.723857)
ERROR:tornado.application:Uncaught exception GET /ws (127.0.0.1)
HTTPServerRequest(protocol='http', host='127.0.0.1:8521', method='GET', uri='/ws', version='HTTP/1.1', remote_ip='127.0.0.1')
Traceback (most recent call last):
  File "/home/rcriii/workspace/Steelhead/venv/lib/python3.8/site-packages/tornado/websocket.py", line 647, in _run_callback
    result = callback(*args, **kwargs)
  File "/home/rcriii/workspace/Steelhead/venv/src/mesa-geo/mesa_geo/visualization/ModularVisualization.py", line 199, in on_message
    self.application.model.step()
  File "/home/rcriii/workspace/Steelhead/model.py", line 441, in step
    self.schedule.step()
  File "/home/rcriii/workspace/Steelhead/venv/lib/python3.8/site-packages/mesa/time.py", line 84, in step
    agent.step()
  File "/home/rcriii/workspace/Steelhead/model.py", line 251, in step
    self.station = self._station + self.calculate_travel(self.production_factor, TE_factor)
  File "/home/rcriii/workspace/Steelhead/model.py", line 218, in station
    dr_cl = find_pos_from_station(self.model.channel_sections[sec].shape,
  File "/home/rcriii/workspace/Steelhead/utils.py", line 32, in find_pos_from_station
    m = (p1[1] - p2[1])/(p1[0] - p2[0])
ZeroDivisionError: float division by zero

Environment:

$ pip show shapely
Name: Shapely
Version: 1.8.0
Summary: Geometric objects, predicates, and operations
Home-page: https://github.com/Toblerity/Shapely
Author: Sean Gillies
Author-email: sean.gillies@gmail.com
License: BSD
Location: /home/rcriii/workspace/Steelhead/venv/lib/python3.8/site-packages
Requires: 
Required-by: geopandas
Python 3.8.10 (default, Nov 26 2021, 20:14:08) 
[GCC 9.3.0] on linux

I guess a workaround will be to check and take L1.coords[-3].