shapely / shapely

Manipulation and analysis of geometric objects
https://shapely.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License
3.88k stars 566 forks source link

subclassing Polygon, Point etc in shapely>=1.8 #1233

Open martvanrijthoven opened 2 years ago

martvanrijthoven commented 2 years ago

In shapely 1.8 I am getting a warning when subclassing Point (or any other geom) when I add a property (e.g., label: str):

from shapely.geometry import Point

class LabeledPoint(Point):
    def __init__(self, label: str, *args) -> None:
        self._label = label
        super().__init__(args)

    @property
    def label(self) -> str:
        return self._label

LabeledPoint('mylabel', 1, 2)

The warning says: ShapelyDeprecationWarning: Setting custom attributes on geometry objects is deprecated, and will raise an AttributeError in Shapely 2.0

How should I subclass Point and add the label property in shapely2.0?

jorisvandenbossche commented 2 years ago

@martvanrijthoven thanks for bringing this up.

I am not sure if we already discussed "subclassing" before in context of the 1.8 / 2.0 migration. We included those warnings about setting attributes because that will certainly no longer work in Shapely 2.0.

It might be that custom subclasses would still work, but certainly not as written above. A first problem is that we now use __new__, so your custom class will also have to use __new__ instead of __init__ to be able to override that. But I am not fully sure that will actually work (you could maybe try that out with the main branch of shapely?)

A second problem is that all the vectorized functions that are going the be added in Shapely 2.0 might accept a subclass, but will return a default shapely geometry if the output is a new geometry (although that is probably the case right now as well for the mehods?)


The reason this changed is because the base Geometry class in shapely 2.0 will be an extension type defined in C, and we have C functions that return those.

jorisvandenbossche commented 2 years ago

Using the main branch (and pygeos installed), I did a quick try with

class LabeledPoint(pygeos.Geometry):
    def __new__(cls, label: str, *args) -> None:
        self = super(pygeos.Geometry, cls).__new__(cls, *args)
        self._label = label
        return self

    @property
    def label(self) -> str:
        return self._label

LabeledPoint("label", "POINT (1 1)")

but that gives TypeError: object.__new__(LabeledPoint) is not safe, use pygeos.lib.Geometry.__new__()

I am not familiar enough with the new/init subclassing details to directly understand this error though, or to know what would solve it / if this would ever be possible to do.

sgillies commented 2 years ago

@jorisvandenbossche @martvanrijthoven the LabeledPoint class could override 1.8's __setattr__ to call object.__setattr__ instead of super().__setattr__ as at https://github.com/Toblerity/Shapely/blob/462de3aa7a8bbd80408762a2d5aaf84b04476e4d/shapely/geometry/base.py#L151. I think that might work but will make your classes a bit fragile.

Shapely 2.0 rewards composition and penalizes inheritance, I think it is fair to say. I think that's okay.

martvanrijthoven commented 2 years ago

Hi @jorisvandenbossche and @sgillies

Thank you very much for the explanation and possible solutions.

That shapely is using __new__ might not be a problem as LabeledPoint will inherit that method from Point and I can still create a custom __init__. I see the issue with vectorized functions returning default shape geometries. However, this problem might be solved with a constructor class method that wraps the default geometry with the Labeled version (please see below in the code snippet, an example that wraps for centroid). Though, this approach will probably not work with something like STRtree.

Overriding __setattr__ will solve the problem of creating subclasses, though indeed might make it fragile. I will check if using composition works better for my use case.

Just for reference, I put here below a code snippet that seems to work for shapely2.0

from shapely.geometry.point import Point
from __future__ import annotations

class LabeledPoint(Point):

    @classmethod
    def construct(cls, self: LabeledPoint, point: Point):
        return cls(self.label, point.x, point.y, point.z)

    def __init__(self, label: str, *args) -> None:
        self._label = label
        super().__init__(args)

    def __setattr__(self, name, value) -> None:
        object.__setattr__(self, name, value)

    # wrapping centroid into a LabeledPoint
    @property
    def centroid(self) -> LabeledPoint:
        return LabeledPoint.construct(self, super().centroid)

    #TODO wrap other methods that return default shapely geometries

    @property
    def label(self) -> str:
        return self._label

I will close this issue as I think this issue is solved. Thanks for the help!

Kind regards, Mart

jorisvandenbossche commented 2 years ago

Just for reference, I put here below a code snippet that seems to work for shapely2.0

This gives me an error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-7b107b59c8f5> in <module>
----> 1 LabeledPoint("label", "POINT (1 1)")

~/scipy/repos/Shapely/shapely/geometry/point.py in __new__(self, *args)
     68         else:
     69             # 2 or 3 args
---> 70             geom = pygeos.points(*args)
     71 
     72         if not isinstance(geom, Point):

~/scipy/repos/pygeos/pygeos/decorators.py in wrapped(*args, **kwargs)
     78             for arr in array_args:
     79                 arr.flags.writeable = False
---> 80             return func(*args, **kwargs)
     81         finally:
     82             for arr, old_flag in zip(array_args, old_flags):

~/scipy/repos/pygeos/pygeos/creation.py in points(coords, y, z, indices, out, **kwargs)
     71     coords = _xyz_to_coords(coords, y, z)
     72     if indices is None:
---> 73         return lib.points(coords, out=out, **kwargs)
     74     else:
     75         return simple_geometries_1d(coords, indices, GeometryType.POINT, out=out)

TypeError: ufunc 'points' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

This is because the "label" is also passed to the parent Point constructor, and doesn't support such string type.

martvanrijthoven commented 2 years ago

I am sorry, i tested the code in the wrong environment. Now i see the problem with __new__.

Jeremiah-England commented 2 years ago

For what it's worth, I mess around with this some because I didn't want to refactor the dependents of something that subclassed. I came up with this. Note that I did not have time to dive into overriding __setattr__ so this is probably more work than necessary.

class PropertyPoint(Point):
    _id_to_attrs = {}
    __slots__ = Point.__slots__  # slots must be the same for assigning __class__ - https://stackoverflow.com/a/52140968

    def __init__(self, coord: Tuple[float, float], name: str):
        self._id_to_attrs[id(self)] = dict(name=name)

    def __new__(cls, coord: Tuple[float, float], name: str):
        point = super().__new__(cls, coord)
        point.__class__ = cls
        return point

    def __del__(self):
        del self._id_to_attrs[id(self)]

    @property
    def name(self):
        return self._id_to_attrs[id(self)]['name']

    @name.setter
    def name(self, name):
        self._id_to_attrs[id(self)]['name'] = name

    def __str__(self):
        return f"{self.name}, {self.wkt}"

if __name__ == '__main__':
    zero = PropertyPoint((0, 0), 'zero')
    one = PropertyPoint((1, 1), 'one')
    print(f"Distance from '{zero}' to '{one}': {zero.distance(one)}")
    assert isinstance(one, Point)
    assert isinstance(one, PropertyPoint)
    del zero
    assert len(PropertyPoint._id_to_attrs) == 1
felixdivo commented 2 years ago

I tried something similar in #1313 but failed. However, your example @Jeremiah-England works great, thank you!

I improved it to not require the @property declarations by using __getattr__() and included some type annotations. Since "Geometry objects will become immutable in version 2.0.0.", not implementing a setter and simply assigning everything in the __init__() could even be considered a feature to better align with shapely semantics.

from typing import Any, ClassVar, Dict, Tuple

from shapely.geometry import Point

class PropertyPoint(Point):

    _id_to_attrs: ClassVar[Dict[str, Any]] = {}

    __slots__ = Point.__slots__  # slots must be the same for assigning __class__ - https://stackoverflow.com/a/52140968

    name: str  # For documentation generation and static type checking

    def __init__(self, coord: Tuple[float, float], name: str) -> None:
        self._id_to_attrs[id(self)] = dict(name=name)

    def __new__(cls, coord: Tuple[float, float], *args, **kwargs) -> "PropertyPoint":
        point = super().__new__(cls, coord)
        point.__class__ = cls
        return point

    def __del__(self) -> None:
        del self._id_to_attrs[id(self)]

    def __getattr__(self, name: str) -> Any:
        try:
            return PropertyPoint._id_to_attrs[id(self)][name]
        except KeyError as e:
            raise AttributeError(str(e)) from None

    def __str__(self) -> str:
        return f"{self.name}, {self.wkt}"

if __name__ == '__main__':
    zero = PropertyPoint((0, 0), 'zero')
    one = PropertyPoint((1, 1), 'one')
    print(f"Distance from '{zero}' to '{one}': {zero.distance(one)}")
    assert isinstance(one, Point)
    assert isinstance(one, PropertyPoint)
    del zero
    assert len(PropertyPoint._id_to_attrs) == 1
jonasteuwen commented 2 years ago

With Shapely 1.8.2 and 1.8.0 I do get AttributeError: type object 'Point' has no attribute '__slots__'? Both on OS X and Linux. What's up there?

jorisvandenbossche commented 2 years ago

The __slots__ are only added in Shapely 2.0.0 (so in the current main branch), so I suppose the above snippet is targeting that.

ericchansen commented 1 year ago

I wrote out some examples on how to do this as simply as possible https://github.com/shapely/shapely/issues/1698.

felixdivo commented 1 year ago

What is the goal of the issue? To add documentation on how to subclass the geometries?

ericchansen commented 1 year ago

I'm not OP, but I found this thread while trying to subclass a geometry. The examples that I linked don't work once you start doing operations like copy.deepcopy.

@felixdivo does shapely have a recommendation on how to subclass a geometry?

felixdivo commented 1 year ago

does shapely have a recommendation on how to subclass a geometry?

No (not that I know of it), I but I think the goal of this issue should be to add documentation on how to do that kind of integration. It's currently documented in this issue as a kind of discussion, but something like your more comprehensive "document" in #1698 is much more helpful. Having it in the main documentation would also make it clear against which constraints library changes can be made and what users of the library can expect to work in the future and which is just a lucky quirk that makes sub-classing work "by accident".

jorisvandenbossche commented 1 year ago

Yes, for me those issues serve the purpose of discussing this topic, gathering options, and then as issue that we actually should document this.