Closed MrChadMWood closed 2 months ago
Actually, I think the a solution is to change the ordering to ORDER BY id, updated_at
(put id
first). I'm curious if any edge cases could still arise from this, but its working for me right now.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed soon. Please update it or it may be closed to keep our repository organized. The best way is to add some more information or make a pull request with a test case. Also you might get help in fixing it at the RxDB Community Chat If you know you will continue working on this, just write any message to the issue (like "ping") to remove the stale tag.
Actually, I think the a solution is to change the ordering to ORDER BY id, updated_at
This does not sound correct. The ordering should be first by updated_at and only if two documents have the same update_at, it should fall back to comparing the id.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed soon. Please update it or it may be closed to keep our repository organized. The best way is to add some more information or make a pull request with a test case. Also you might get help in fixing it at the RxDB Community Chat If you know you will continue working on this, just write any message to the issue (like "ping") to remove the stale tag.
@pubkey Maybe I'm misunderstanding, but isn't that approach what caused my implementation to have missing data after replication finished?
Edit: No, I see. Its as you said "first by updated_at and only if two documents have the same update_at, it should fall back to comparing the id." I was unconditionally ordering by updated_at, id
each time. Thanks!
I'm finding a unique issue, likely due to how I've implemented the HTTP replication protocol. I'm curious what the right approach would be here, given my approach is clearly producing the wrong result. Note: I am using Postgres, relational data on the backend, if that matters.
The issue seems to come from my attempt to make the results of queries deterministic. I
ORDER BY updated_at, id
, in combination with theLIMIT
clause, which results in missing data on the client side. I demonstrate a markdown example below:Breakdown of the initial replication process for the INITIAL client:
/pull?updated_at=0&id=&limit=3
Query:SELECT * FROM table WHERE updated_at >= 0 ORDER BY updated_at, id LIMIT 3
Result: id/pull?updated_at=0&id=3&limit=3
Query:SELECT * FROM table WHERE updated_at >= 0 AND id > 3 ORDER BY updated_at, id LIMIT 3
Result: id/pull?updated_at=0&id=6&limit=3
Query:SELECT * FROM table WHERE updated_at >= 0 AND id > 6 ORDER BY updated_at, id LIMIT 3
Result: id/pull?updated_at=0&id=9&limit=3
Query:SELECT * FROM table WHERE updated_at >= 0 AND id > 9 ORDER BY updated_at, id LIMIT 3
Result: idFirst update from client: Request:
/push { newDocumentState: {id: 6, data: "z", ...} }
/pull?updated_at=0&id=&limit=3
Query:SELECT * FROM table WHERE updated_at >= 0 ORDER BY updated_at, id LIMIT 3
Result: id/pull?updated_at=0&id=3&limit=3
Query:SELECT * FROM table WHERE updated_at >= 0 AND id > 3 ORDER BY updated_at, id LIMIT 3
Result: id/pull?updated_at=0&id=7&limit=3
Query:SELECT * FROM table WHERE updated_at >= 0 AND id > 7 ORDER BY updated_at, id LIMIT 3
Result: id/pull?updated_at=0&id=9&limit=3
Query:SELECT * FROM table WHERE updated_at >= 0 AND id > 9 ORDER BY updated_at, id LIMIT 3
Result: idThe issue:
Notice that, for the second client, they never receive the record where
id=6
. This is because theORDER BY
clause would force the 6th record to be the last record, while theLIMIT
clause prevents its delivery to the client. The subsequent requests move pastid=6
, meaning the record gets filtered out of all subsequent requests.I'm sure that I'm misinterpreting something about the HTTP protocol here, but what could it be? I see that the docs don't include an
ORDER BY
clause anywhere, but even still... wouldn't removing that just make the query non-deterministic by sort order?My guess is that I should simply
ORDER BY id
(remove the updated_at). Can this be confirmed as the ideal solution though? Also, would you be open to the addition of a warning for this scenario in the documentation?