electric-sql / electric

Sync little subsets of your Postgres data into local apps and services.
https://electric-sql.com
Apache License 2.0
6.42k stars 154 forks source link

Shapes with where clauses are missing updates #1659

Closed kevin-dp closed 2 months ago

kevin-dp commented 2 months ago

When a transaction inserts several rows that are partitioned over several shapes, those shapes fail to pick up the newly inserted rows.

Reproduction

Start Electric:

cd packages/sync-service
mix stop_dev && mix compile && mix start_dev && iex -S mix

Connect to Postgres and create a table:

psql postgresql://postgres:password@localhost:54321/electric
CREATE TABLE foo (a INT PRIMARY KEY, b TEXT);
INSERT INTO foo VALUES (1, 'a'), (2, 'b'), (5, 'c');

Now, get a shape that fetches all rows where a > 2 (i'm using HTTPie here):

HTTP GET "localhost:3000/v1/shape/foo" offset==-1 where=="a > 2"

This works fine, the result is:

HTTP/1.1 200 OK
access-control-allow-methods: GET, POST, OPTIONS
access-control-allow-origin: *
access-control-expose-headers: *
cache-control: max-age=1, stale-while-revalidate=3
content-type: application/json; charset=utf-8
date: Tue, 10 Sep 2024 09:21:37 GMT
etag: 71923130-1725960097628:-1:0_0
server: ElectricSQL/0.3.7
transfer-encoding: chunked
x-electric-chunk-last-offset: 0_0
x-electric-schema: {"a":{"type":"int4","not_null":true,"pk_index":0},"b":{"type":"text"}}
x-electric-shape-id: 71923130-1725960097628
x-request-id: F_PXkHBBcfosEh0AAAAl

[
    {
        "headers": {
            "operation": "insert",
            "relation": [
                "public",
                "foo"
            ]
        },
        "key": "\"public\".\"foo\"/\"5\"",
        "offset": "0_0",
        "value": {
            "a": "5",
            "b": "c"
        }
    },
    {
        "headers": {
            "control": "up-to-date"
        }
    }
]

Now, let's insert some more rows into the table:

INSERT INTO foo VALUES (3, 'o'), (9, 'f');

Now, let's fetch the shape again from the last offset we know:

HTTP GET "localhost:3000/v1/shape/foo" offset==0_0 shape_id==71923130-1725960097628

Now Electric replies with:

HTTP/1.1 200 OK
access-control-allow-methods: GET, POST, OPTIONS
access-control-allow-origin: *
access-control-expose-headers: *
cache-control: max-age=1, stale-while-revalidate=3
content-type: application/json; charset=utf-8
date: Tue, 10 Sep 2024 09:24:39 GMT
etag: 3833821-1725960279599:0_0:0_0
server: ElectricSQL/0.3.7
transfer-encoding: chunked
x-electric-chunk-last-offset: 0_0
x-electric-schema: {"a":{"type":"int4","not_null":true,"pk_index":0},"b":{"type":"text"}}
x-electric-shape-id: 3833821-1725960279599
x-request-id: F_PXutH0jEG32QsAAAAj

[
    {
        "headers": {
            "control": "up-to-date"
        }
    }
]

**This is wrong. We were expecting to get the 2 rows we inserted.

Now insert some more rows:

INSERT INTO foo VALUES (11, 'g'), (20, 'l');

And fetch again:

HTTP GET "localhost:3000/v1/shape/foo" offset==0_0 shape_id==71923130-172596009762

We see the inserted rows:

HTTP/1.1 200 OK
access-control-allow-methods: GET, POST, OPTIONS
access-control-allow-origin: *
access-control-expose-headers: *
cache-control: max-age=1, stale-while-revalidate=3
content-type: application/json; charset=utf-8
date: Tue, 10 Sep 2024 09:27:17 GMT
etag: 3833821-1725960279599:0_0:26804896_2
server: ElectricSQL/0.3.7
transfer-encoding: chunked
x-electric-chunk-last-offset: 26804896_2
x-electric-schema: {"a":{"type":"int4","not_null":true,"pk_index":0},"b":{"type":"text"}}
x-electric-shape-id: 3833821-1725960279599
x-request-id: F_PX37SJoymBkLkAAAAk

[
    {
        "headers": {
            "operation": "insert",
            "relation": [
                "public",
                "foo"
            ],
            "txid": 750
        },
        "key": "\"public\".\"foo\"/\"11\"",
        "offset": "26804896_0",
        "value": {
            "a": "11",
            "b": "g"
        }
    },
    {
        "headers": {
            "operation": "insert",
            "relation": [
                "public",
                "foo"
            ],
            "txid": 750
        },
        "key": "\"public\".\"foo\"/\"20\"",
        "offset": "26804896_2",
        "value": {
            "a": "20",
            "b": "l"
        }
    },
    {
        "headers": {
            "control": "up-to-date"
        }
    }
]
icehaunter commented 2 months ago

Shapes with a where clause must include a where clause across all requests, otherwise they are treated as requests for a different shape definition (i.e. the one without a where clause). This is what's happening here, you can see that the response to the first request with offset==0_0 has a different shape id in the response than in the request. I think that's an API deficiency - we should catch that or rethink the url structure - but it behaves correctly given current spec

thruflo commented 2 months ago

I would suggest that if the shape_id is provided and exists then it should be validated as matching the shape definition and if not the request should return 400.

KyleAMathews commented 2 months ago

I agree we should validate that shape_id must match the rest of the URl structure.

KyleAMathews commented 2 months ago

I'll create a new issue for that.