Closed MFAshby closed 2 years ago
It appears the same problem happens for ordinary searches as well specified without an order, e.g. GET /fhir/Appointment?actor:Patient.identifier=https://fhir.nhs.uk/Id/nhs-number|9999999999
issues a query that uses LIMIT
and OFFSET
without an ORDER BY
@MFAshby you could use _sort=id for search api. also fix for $changes should be available on edge, would be great if you can check it and confirm that it works
@krevedkokun thanks for the suggestion. I tried this, but it does not change the query that is issued to postgres.
I also think I found a related bug. The page=
query parameter appears to be ignored, and actually should be _page=
, but the first, self, next, previous and last links specify page=
Aidbox version :2202-lts v:2202.5426d8df
Sample request:
GET /Patient/$changes?version=0&_sort=id&page=2
returns the following links:
total: 100000
link:
- relation: first
url: /Patient/$changes?version=0&_sort=id&page=1
- relation: self
url: /Patient/$changes?version=0&_sort=id&page=2
- relation: next
url: /Patient/$changes?version=0&_sort=id&page=3
- relation: previous
url: /Patient/$changes?version=0&_sort=id&page=1
- relation: last
url: /Patient/$changes?version=0&_sort=id&page=1000
BUT each of those links just returns exactly the same set of patients. The page=
parameter is being ignored.
If I use this instead:
GET /Patient/$changes?version=0&_sort=id&_page=2
then the links segment looks like this
total: 100000
link:
- relation: first
url: /Patient/$changes?version=0&_sort=id&_page=1
- relation: self
url: /Patient/$changes?version=0&_sort=id&_page=2
- relation: next
url: /Patient/$changes?version=0&_sort=id&_page=3
- relation: previous
url: /Patient/$changes?version=0&_sort=id&_page=1
- relation: last
url: /Patient/$changes?version=0&_sort=id&_page=1000
version: 7442
and the pages contain different results as expected.
I don't think this part is a major bug because it works correctly when using the right parameter, but I would have expected that if page=
isn't a valid query parameter that it is ignored and the appropriate _page=
parameters are added to the links instead. As it is, misleading links are returned.
@krevedkokun my apologies, I misread your message
you could use _sort=id for search api.
yes this works, thank you
also fix for $changes should be available on edge, would be great if you can check it and confirm that it works
I will try this asap
@MFAshby
I will try this asap
hi, any news on that?
Hi @krevedkokun , yes it looks like the problem is fixed in :edge, thank you
I turned on query logging in the database, and made a query to /Patient/$changes?version=0 and the following query was executed. It looks like this list should be deterministic now, thanks
2022-05-31 10:00:48.347 UTC [205]:[21] user=customer1,db=customer1,client=172.20.0.7 LOG: execute <unnamed>: SELECT * FROM ((SELECT * FROM "patient" WHERE ("txid" > $1)) UNION ALL (SELECT * FROM "patient_history" WHERE ("txid" > $2))) "_" ORDER BY "txid", "id" LIMIT $3 OFFSET $4
2022-05-31 10:00:48.347 UTC [205]:[22] user=customer1,db=customer1,client=172.20.0.7 DETAIL: parameters: $1 = '0', $2 = '0', $3 = '100', $4 = '0'
Will the same fix be applied to -lts version?
@MFAshby yes, we'll port it to LTS and we'll get back to you
Hi, @MFAshby. This fix is avaliable on 2202-lts channel
Describe the bug
Results might be skipped or duplicated when paging over results of $changes API,
Severity
Major
Steps to reproduce the behavior:
Create a new box. Use the bulk upload facility to upload more than 100 resources e.g. appointments. Query the $changes API, the page limit is 100 Page through the results. It might skip some results.
Expected behavior
All of the results are seen when paging through.
Screenshots
Versions:
Additional context
It's hard to reproduce this bug, in fact it was spotted by inspection of an SQL query rather than directly observed.
The $changes API appears to issue a query like this
The OFFSET will change with the page number. The problem with this query is that although it has an ORDER BY "txid", the "txid" is the same for every appointment, and there is no secondary sort order which might make this pagination unstable. Without a secondary sort order, postgres simply returns the data in the order in some default order, which is not guaranteed to be stable [1].
Suggested fix: change the ordering to ORDER BY "txid","id" in order to be able to page over a stable set of results.
[1] https://www.postgresql.org/docs/14/queries-limit.html