oceanmodeling / xarray-selafin

An xarray engine for opening Selafin files (TELEMAC)
https://pypi.org/project/xarray-selafin/
The Unlicense
4 stars 2 forks source link

remove netCDF4 and Shapely from the dependencies? #26

Closed nicogodet closed 5 months ago

nicogodet commented 8 months ago

Title says it all :)

tomsail commented 8 months ago

indeed good point. We've implemented netcdf to test exports in io_test.py

tomsail commented 8 months ago

I might also add shapely because it is only used here:

from shapely.geometry import LinearRing
[...]
            # Determine if boundary is clock-wise
            if len(current_boundary_nodes) > 2:
                boundary_geom = LinearRing(
                    [
                        (self.x[n - 1], self.y[n - 1])
                        for n in current_boundary_nodes[:-1]
                    ]
                )
                ccw = boundary_geom.is_ccw
            else:
                ccw = True  # arbitrary value (order can not be defined!)

in Serafin.py

If we find another implementation we could also get rid of the dependency to shapely

pmav99 commented 6 months ago

The are other issues with the dependency metadata.

$ curl --silent https://pypi.org/pypi/xarray-selafin/json | jq '.info.requires_dist'
[
  "numpy",
  "pytest",
  "scipy",
  "shapely",
  "xarray",
  "netcdf4",
  "dask; extra == \"dask\"",
  "matplotlib; extra == \"dev\""
]

For instance:

  1. why is pytest getting installed when you install xarray-selafin?
  2. why do we install the dev "extra" for matplotlib
  3. dask doesn't have an "extra" named dask
  4. There is also #2 which is related.
tomsail commented 6 months ago

The pyproject.toml has a few inconsistencies indeed:

[tool.poetry.dependencies]
python = ">=3.9"
numpy = "*"
pytest = "*"
scipy = "*"
shapely = "*"
xarray = "*"
netcdf4 = "*"

[tool.poetry.group.dask.dependencies]
dask = "*"

[tool.poetry.plugins."xarray.backends"]
selafin = "xarray_selafin.xarray_backend:SelafinBackendEntrypoint"

[tool.poetry.group.dev.dependencies]
matplotlib = "*"

I see already a few fixes to address these issues:

tomsail commented 5 months ago

Tasks to close this issue:

tomsail commented 5 months ago

this simple function (method found on SO):

def signed_area(vertices):
    # Initialize area
    area = 0.0
    # Calculate the signed area
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        area += (x2 - x1)*(y2 + y1)
    return area / 2.0

does the job.

I have compared it this way:

import matplotlib.pyplot as plt
from shapely import LinearRing
from shapely.geometry import Polygon

def signed_area(vertices, ax):
    # Initialize area
    area = 0.0
    # Calculate the signed area
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        ax.plot([x1, x2], [y1, y2])
        ax.plot(*zip(*vertices))
        area += (x2 - x1)*(y2 + y1)
    return area / 2.0

def is_ccw(vertices, ax):
    # A positive signed area indicates a counter-clockwise orientation
    return signed_area(vertices, ax) < 0

# Example usage:
polygon =  [[[173.283, -39.474],[172.829, -39.507],[172.391, -39.607],[171.985, -39.769],[171.625, -39.988],
            [171.326, -40.255],[171.098, -40.562],[170.952, -40.896],[170.894, -41.245],[170.927, -41.597],
            [171.052, -41.936],[171.265, -42.251],[171.560, -42.528],[171.924, -42.756],[172.345, -42.926],
            [172.804, -43.031],[173.283, -43.067],[173.763, -43.031],[174.222, -42.926],[174.643, -42.756],
            [175.007, -42.528],[175.302, -42.251],[175.515, -41.936],[175.640, -41.597],[175.673, -41.245],
            [175.615, -40.896],[175.469, -40.562],[175.241, -40.255],[174.942, -39.988],[174.582, -39.769],
            [174.176, -39.607],[173.738, -39.507],[173.283, -39.474],]]

polygon1 = [[[-122.412, 37.804],[-122.507, 37.778],[-122.501, 37.733],[-122.412, 37.804]],
            [[-122.449, 37.782],[-122.465, 37.779],[-122.454, 37.788],[-122.449, 37.782]],
            [[-122.484, 37.771],[-122.481, 37.761],[-122.494, 37.758],[-122.496, 37.769],
            [-122.484, 37.771]]]

polygon2 = [[(37.803, -122.408), (37.736, -122.491), (37.738, -122.380), (37.787, -122.39)],
            [(37.760, -122.441), (37.772, -122.427), (37.773, -122.404), (37.758, -122.401), (37.745, -122.428)]]

for polys in [polygon, polygon1, polygon2]:
    fig, ax = plt.subplots()
    for poly in polys:
        p = Polygon(poly)
        print(abs(p.area) - abs(signed_area(poly, ax))<1e-14)
        ring = LinearRing(poly)
        assert(ring.is_ccw == is_ccw(poly,ax))
    plt.show()

we get as close as 10e-14 precision to the .area method from shapely.Polygon. I have tested more complex forms but I think this should do it.

pmav99 commented 5 months ago

Cool. This for sure works. The only question is whether the performance will be acceptable for a large number of boundaries.