opendatacube / datacube-core

Open Data Cube analyses continental scale Earth Observation data through time
http://www.opendatacube.org
Apache License 2.0

Psycopg2 serialization failure when indexing using datacube 1.8.11 #1412

Closed alexgleith closed 1 year ago

alexgleith commented 1 year ago

Expected behaviour

Actual behaviour

sqlalchemy.exc.OperationalError: (psycopg2.errors.SerializationFailure) could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during conflict in checking.
HINT:  The transaction might succeed if retried.

[SQL: INSERT INTO agdc.dataset_location (dataset_ref, uri_scheme, uri_body) VALUES (%(dataset_ref)s, %(uri_scheme)s, %(uri_body)s) ON CONFLICT (uri_scheme, uri_body, dataset_ref) DO NOTHING RETURNING agdc.dataset_location.id]
[parameters: {'dataset_ref': UUID('e09c981d-f44c-510a-855f-ef67e4b6f4e8'), 'uri_scheme': 'https', 'uri_body': '//planetarycomputer.microsoft.com/api/stac/v1/collections/io-lulc-9-class/items/29U-2019'}]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
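
The HINT in the error output says the transaction might succeed if retried. As a minimal sketch of that idea (not the fix that was applied in datacube-core), the helper below retries a whole transaction when the exception carries PostgreSQL SQLSTATE `40001` (serialization_failure). It relies on psycopg2 exposing the SQLSTATE as `pgcode` and on SQLAlchemy exposing the driver exception as `.orig`. The names `run_with_retry`, `is_serialization_failure`, `max_attempts` and `base_delay` are hypothetical and chosen for illustration only.

```python
import time

SERIALIZATION_FAILURE = "40001"  # PostgreSQL SQLSTATE for serialization_failure


def is_serialization_failure(exc):
    """Return True if exc (or the driver error it wraps) carries SQLSTATE 40001."""
    # SQLAlchemy's DBAPIError subclasses expose the driver exception as `.orig`;
    # psycopg2 exceptions expose the SQLSTATE as `.pgcode`.
    driver_exc = getattr(exc, "orig", None) or exc
    return getattr(driver_exc, "pgcode", None) == SERIALIZATION_FAILURE


def run_with_retry(txn, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call txn() until it succeeds, retrying only on serialization failures.

    txn is assumed to run one complete transaction (begin, work, then commit
    or rollback), since the whole transaction has to be retried, not only the
    statement that failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except Exception as exc:
            # Re-raise anything that is not a serialization failure, and give
            # up once the attempt budget is exhausted.
            if not is_serialization_failure(exc) or attempt == max_attempts:
                raise
            # Exponential backoff before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

A caller would wrap the per-dataset indexing step in a function and pass it to `run_with_retry`. This only masks intermittent conflicts; it does not address whatever changed between 1.8.9 and 1.8.11.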

Steps to reproduce the behaviour

Failures are intermittent. On the first run it indexed 5 datasets and failed on 45. On the second it indexed 2, skipped 5 and failed on 43!

Environment information

Full environment is this: https://gist.github.com/alexgleith/83813d30bc0798581e9c0b530f9f13bc

alexgleith commented 1 year ago

Note that with datacube==1.8.9 indexing works as expected.

pindge commented 1 year ago

related: https://github.com/opendatacube/odc-tools/issues/538

alexgleith commented 1 year ago

Don't think they are related, @pindge ...

omad commented 1 year ago

I started hitting this today too! I was converting a test in odc-tools (See https://github.com/opendatacube/odc-tools/pull/555) to access local data instead of over the network. It happens almost every time I try to run the test.

It looks a lot like a threading/connection issue in the new DB code: either an incompatibility with how ODC connections have been used previously, or just a bug.

emmaai commented 1 year ago

This has been bothering me too. It happens to me very randomly, even locally. But I don't think it's related to the datacube version, or at least not only to the datacube version: I have one env built with datacube=1.8.11 and it's just fine. I haven't figured out what exactly the cause is.

SpacemanPaul commented 1 year ago

Appears to be a transaction handling issue. Damien has done a great job tracking it down to a few lines of code in a single commit, and we can reproduce reasonably reliably now. I'm hoping to look at it this afternoon.