urschrei / simplification

Very fast Python line simplification using either the RDP or Visvalingam-Whyatt algorithm implemented in Rust
Other
166 stars 18 forks source link

Version 0.6 seems to add 0/0 Points? #63

Closed Jens-81 closed 11 months ago

Jens-81 commented 1 year ago

I use the simplification package to reduce the point density of multipolygon structures. Until version 0.5.12 this works perfect for me. In version 0.6+ it stopped working for some polygons. It seems to insert some 0/0 points.

In version 0.5.12: grafik

In Version 0.6.11: grafik

There is no exception thrown. I have tried different shapefiles, but the result is always as shown above (the only difference is how many of these "outliers" appear).

urschrei commented 1 year ago

Please confirm that this issue occurs when using version 0.6.0 and if possible please provide a minimal example data set that can be used to try to reproduce the issue.

Jens-81 commented 1 year ago

I tried the versions that pip offers me: grafik 0.5.12 works for me. With the next version, which is 0.6.1, the issue occurs. With the newest version, 0.6.11, the issue still occurs.

The data I used is free and available for download: https://daten.gdz.bkg.bund.de/produkte/vg/vg250_ebenen_1231/2013/ -> vg250_31-12.gk3.shape.ebenen.zip There are multiple shapefiles within the zip file: I used the "VG250_GEM"

urschrei commented 1 year ago

Hi @Jens-81. Please provide me with a https://en.wikipedia.org/wiki/Minimal_reproducible_example of the problem. As soon as I have a set of points (not a Shapefile) and an epsilon value that exhibit the problem I can try to find the bug, but I don't have time to download and process Shapefiles, write code to extract points, and guess at epsilon values.

vjsrinivas commented 11 months ago

@urschrei I'm seeing a similar bug with v0.6.11 on Python 3.7.16. I've created a minimal example here: https://gist.github.com/vjsrinivas/f42d72ebbc50b52202cd8e58442597c1

I run simplify_coords_vw in a loop, and this (0,0) coordinate always appears after the first frame. Even if I run the same frame twice, I get the (0,0) on the second try. So it seems like there's some internal state being set to (0,0)... or something like that. Let me know if you have any questions.

urschrei commented 11 months ago

@vjsrinivas Thanks, confirmed that the bug is present, but only when processing numpy arrays: it's not triggered by e.g. lists of lists using your "bad" coords.

urschrei commented 11 months ago

@utapyngo This is related to your refactor of cutil.pyx to make Numpy optional. I can't see anything obviously wrong, but if I can't fix the problem I'll be reverting this feature completely, so please have a look. The failure can be recreated by making an editable install, ensuring you have an up-to-date librdp.dylib (get it from https://github.com/urschrei/rdp/releases) and running pytest . – it's the first test.

utapyngo commented 11 months ago

How to reproduce this consistently? It fails for the first time, all consequent runs end up with pass:

$ pytest .
======================================================================= test session starts
platform linux -- Python 3.10.11, pytest-7.4.2, pluggy-1.3.0
collected 11 items                                                                                                                                                

test/test_simplification.py .......F...  [100%]

============================================================================ FAILURES
_____________________________________________________________ PolylineTests.testNumpy_optional_error

self = <test_simplification.PolylineTests testMethod=testNumpy_optional_error>

    def testNumpy_optional_error(self):
        """This test only fails when called with numpy arrays"""
        result = csimplify_coords_vw(np.array(self.bad_coords), 30)
>       self.assertEqual(result[0][0], 939.0)
E       AssertionError: 6.9122405872648e-310 != 939.0

test/test_simplification.py:519: AssertionError
===================================================================== short test summary info
FAILED test/test_simplification.py::PolylineTests::testNumpy_optional_error - AssertionError: 6.9122405872648e-310 != 939.0
================================================================== 1 failed, 10 passed in 0.18s
$ pytest .
======================================================================= test session starts
platform linux -- Python 3.10.11, pytest-7.4.2, pluggy-1.3.0
collected 11 items                                                                                                                                                

test/test_simplification.py ........... [100%]

======================================================================= 11 passed in 0.15s
$ pip freeze
exceptiongroup==1.1.3
iniconfig==2.0.0
numpy==1.26.0
packaging==23.1
pluggy==1.3.0
pytest==7.4.2
-e git+ssh://git@github.com/urschrei/simplification.git@bffc60ab0ddd1e07ee5f9adbad1bf233a4864dff#egg=simplification
tomli==2.0.1
urschrei commented 11 months ago

I can reliably reproduce it:

Cython==0.29.35
iniconfig==2.0.0
numpy==1.26.0
packaging==23.1
pluggy==1.2.0
pytest==7.4.0

Python version: 3.11.4, macOS

Inserting a print(arr) statement after line 65 in cutil.pyx: arr = np.array(coords, dtype=np.float64) makes the failure non-deterministic (fails about one in 10 times). Removing the conditional completely (so assuming that numpy is installed) also makes the bug disappear, which is…bad.

urschrei commented 11 months ago

Note that I'm building locally with python setup.py build_ext --inplace

utapyngo commented 11 months ago

I'm on Ubuntu, it does not fail for me anymore.

urschrei commented 11 months ago

I'm on Ubuntu, it does not fail for me anymore.

You now can't get the test to fail at all? Without changing anything?

urschrei commented 11 months ago

That seems to be in line with the test harness: https://github.com/urschrei/simplification/actions/runs/6214452260

Failures occur on Windows and macOS x86_64, pass on Linux.

utapyngo commented 11 months ago

I don't have access to mac or windows machine now.

urschrei commented 11 months ago

I'm not optimistic that this can be easily fixed (neither of us know what's going on or have access to one of the major platforms on which the bug manifests). I don't really have time to write and test a minimal reproduction of the bug for the Cython bug tracker / mailing list, so I'll be merging #67, releasing as 0.7.0, and yanking the 0.6.x releases. Feel free to reopen / revisit if you need to and / or have the time.

urschrei commented 11 months ago

Fixed by v0.7.0, now available on PyPI.