geoalchemy / geoalchemy2

Geospatial extension to SQLAlchemy
http://geoalchemy-2.readthedocs.org
MIT License
640 stars 112 forks source link

[Bug report] SQLite rejects negative coordinates when column has srid=4326 #516

Closed DFEvans closed 4 months ago

DFEvans commented 4 months ago

Describe the bug

When using spatialite, inserting a geometry with negative coordinates to a column with srid=4326 is rejected as it "violates Geometry constraint [geom-type or SRID not allowed]", even though the coordinates supplied should be valid for WGS84.

Optional link from https://geoalchemy-2.readthedocs.io which documents the behavior that is expected

https://geoalchemy-2.readthedocs.io/en/stable/spatialite_tutorial.html

To Reproduce

import os
from typing import Any
from geoalchemy2 import Geometry as GeoAlchemyGeometry, load_spatialite
from sqlalchemy.orm import registry
from sqlalchemy.event import listen
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy import Column, Integer, String, create_engine

connection_string = "sqlite+pysqlite:///:memory:"
os.environ["SPATIALITE_LIBRARY_PATH"] = "mod_spatialite.so"

Base = declarative_base()

class Lake(Base):
    __tablename__ = "lake"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    geom = Column(GeoAlchemyGeometry("POLYGON", srid=4326))

def main():
    engine = create_engine(connection_string, echo=True)
    listen(engine, "connect", load_spatialite)

    Lake.__table__.create(engine)

    Session = sessionmaker(bind=engine)
    session = Session()

    lake = Lake(name="Good", geom="POLYGON((0 0,1 0,1 1,0 1,0 0))")
    session.add(lake)
    session.commit()

    lake = Lake(name="Bad", geom="POLYGON((0 -1,1 -1,1 0,0 0,0 -1))")
    session.add(lake)
    session.commit()

if __name__ == "__main__":
    main()

Error

2024-07-10 14:50:26,086 INFO sqlalchemy.engine.Engine BEGIN (implicit)
before_create
2024-07-10 14:50:26,086 INFO sqlalchemy.engine.Engine
CREATE TABLE lake (
        id INTEGER NOT NULL,
        name VARCHAR,
        geom GEOMETRY,
        PRIMARY KEY (id)
)

2024-07-10 14:50:26,087 INFO sqlalchemy.engine.Engine [no key 0.00012s] ()
2024-07-10 14:50:26,088 INFO sqlalchemy.engine.Engine SELECT RecoverGeometryColumn(?, ?, ?, ?, ?) AS "RecoverGeometryColumn_1"
2024-07-10 14:50:26,088 INFO sqlalchemy.engine.Engine [generated in 0.00013s] ('lake', 'geom', 4326, 'POLYGON', 'XY')
2024-07-10 14:50:26,089 INFO sqlalchemy.engine.Engine SELECT CreateSpatialIndex(?, ?) AS "CreateSpatialIndex_1"
2024-07-10 14:50:26,089 INFO sqlalchemy.engine.Engine [generated in 0.00011s] ('lake', 'geom')
2024-07-10 14:50:26,090 INFO sqlalchemy.engine.Engine COMMIT
2024-07-10 14:50:26,091 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-07-10 14:50:26,092 INFO sqlalchemy.engine.Engine INSERT INTO lake (name, geom) VALUES (?, GeomFromEWKT(?))
2024-07-10 14:50:26,092 INFO sqlalchemy.engine.Engine [generated in 0.00012s] ('Good', 'SRID=4326;POLYGON((0 0,1 0,1 1,0 1,0 0))')
2024-07-10 14:50:26,094 INFO sqlalchemy.engine.Engine COMMIT
2024-07-10 14:50:26,094 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-07-10 14:50:26,094 INFO sqlalchemy.engine.Engine INSERT INTO lake (name, geom) VALUES (?, GeomFromEWKT(?))
2024-07-10 14:50:26,094 INFO sqlalchemy.engine.Engine [cached since 0.001953s ago] ('Bad', 'POLYGON((0 -1,1 -1,1 0,0 0,0 -1))')
2024-07-10 14:50:26,094 INFO sqlalchemy.engine.Engine ROLLBACK
Traceback (most recent call last):
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
    self.dialect.do_execute(
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 924, in do_execute
    cursor.execute(statement, parameters)
sqlite3.IntegrityError: lake.geom violates Geometry constraint [geom-type or SRID not allowed]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/danielevans/repositories/image-processing/satellite_data_models/geom_test2.py", line 42, in <module>
    main()
  File "/home/danielevans/repositories/image-processing/satellite_data_models/geom_test2.py", line 37, in main
    session.commit()
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 2017, in commit
    trans.commit(_to_root=True)
  File "<string>", line 2, in commit
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1302, in commit
    self._prepare_impl()
  File "<string>", line 2, in _prepare_impl
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1277, in _prepare_impl
    self.session.flush()
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 4341, in flush
    self._flush(objects)
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 4476, in _flush
    with util.safe_reraise():
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 146, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 4437, in _flush
    flush_context.execute()
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py", line 466, in execute
    rec.execute(self)
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py", line 642, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/persistence.py", line 93, in save_obj
    _emit_insert_statements(
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/persistence.py", line 1233, in _emit_insert_statements
    result = connection.execute(
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1418, in execute
    return meth(
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 515, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1640, in _execute_clauseelement
    ret = self._execute_context(
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
    return self._exec_single_context(
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1986, in _exec_single_context
    self._handle_dbapi_exception(
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2353, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
    self.dialect.do_execute(
  File "/home/danielevans/.cache/pypoetry/virtualenvs/image-processing-LWpwzOMN-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 924, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) lake.geom violates Geometry constraint [geom-type or SRID not allowed]
[SQL: INSERT INTO lake (name, geom) VALUES (?, GeomFromEWKT(?))]
[parameters: ('Bad', 'POLYGON((0 -1,1 -1,1 0,0 0,0 -1))')]
(Background on this error at: https://sqlalche.me/e/20/gkpj)

Additional context

From reading elsewhere, rejecting negative coordinates is expected behaviour if a column has an undefined srid. However, I think I'm defining it correctly - if nothing else, the raw SQL emitted to create the table includes the expected 4326 in the call to RecoverGeometryColumn.

The linked tutorial for Spatialite use doesn't explicitly show the use of SRID or state that they're supported, but equally, it doesn't mention any restrictions on compatibility vs. a PostGIS-backed database.

GeoAlchemy 2 Version in Use

0.15.1

Python Version

3.10

Operating system

Linux

DFEvans commented 4 months ago

Aha, digging deeper, the cause of the upset seems to be that I'm specifying the SRID on the column, but it's unset on the geometry. Is this the expected behaviour when that occurs? The documentation doesn't go into much detail on how it interacts.

(I'm also now unsure if this is GeoAlchemy logic or Spatialite logic - apologies if it's the latter)

DFEvans commented 4 months ago

Trying to take that knowledge back to my actual code, I was surprised to find that explicitly setting the SRID on my geometry when calling from_shape wasn't enough.

The issue seems to be that the bind_processor_process for sqlite isn't behaving properly. I can see that it tries very hard to provide an SRID if it can, either using the SRID from the value or assuming that of the column, but the values that come out in several cases either don't add the SRID, or can actually strip it entirely when an EWKB is passed in (my from_shape case).

The underlying cause is that WKTElement.SPLIT_WKT_PATTERN, a regex used to split WKT strings, assumes that the coordinate section of a WKT string consists of digits, decimal points, spaces, commas, and brackets only - significantly, that list excludes minus signs (dashes). When the match fails, the WKT just gets passed through without an SRID being prepended (for EWKB, the conversion involves an intermediate WKT gets created).

DFEvans commented 4 months ago

Same root cause as #509, it seems.

Fairly simple to do an incremental fix to the WKT regex. However, with this being the second issue, the relative simplicity of the regex, and the potential complexity of WKT strings, I'm worried that there remain plenty of undiscovered issues and the regex probably needs a significant revision by someone who's far more familiar with WKT parsing than I am. For another example, a GeometryCollection like this example from the WKT spec doesn't parse correctly:

GeometryCollection(POINT (10 10),POINT (30 30),LINESTRING (15 15, 20 20))
adrien-berchet commented 4 months ago

Hi @DFEvans Thank you very much for this very detailed report! You are absolutely right, we missed the negative coordinate case. I'm going to push a quick fix for this. For your last comment, it's true that we only support simple WKT strings. I think it is able to cover most use cases but we may have to improve it at some point...

adrien-berchet commented 4 months ago

@DFEvans could you test the branch of #517 for your use case please?

DFEvans commented 4 months ago

I think it is able to cover most use cases but we may have to improve it at some point...

Is it worth emitting a warning/log message? Thinking of something in the if match is None: return wkt branch in format_geom_type.

DFEvans commented 4 months ago

@DFEvans could you test the branch of https://github.com/geoalchemy/geoalchemy2/pull/517 for your use case please?

Yep, that does it - thanks!

adrien-berchet commented 4 months ago

I think it is able to cover most use cases but we may have to improve it at some point...

Is it worth emitting a warning/log message? Thinking of something in the if match is None: return wkt branch in format_geom_type.

Indeed, let's add a warning here.

adrien-berchet commented 4 months ago

The new release 0.15.2 is available and should fix this issue.