Open DMRobertson opened 11 months ago
The API should probably be redesigned so that you can paginate on (received_ts, id)
tuples (assuming there is an index on received_ts
); the API should handback some sort of encoded (received_ts, id)
token to the client and the client should pass that when trying to make a request next time.
As you say, LIMIT ? OFFSET ?
is likely to blame here since it has to scan the index anyway.
I find myself somewhat surprised that we have so many reports that it takes 10 seconds to scan the index. Apparently we have 91k of them.
But there is no index on received_ts
! A short-term fix would be to add an index on that.
We could also rewrite the query a bit so it doesn't perform a join before the LIMIT ? OFFSET ?
: I believe that currently it will join to the events table even for the rows that aren't selected, because a LEFT JOIN
might produce multiple rows (in theory, unless it's smart enough to know event_id
is unique) and so it needs to check that. (probably similar logic for the room_stats_state
...)
E.g. from Jaeger:
My money is on LIMIT ... OFFSET ... being slow. I wonder if we could change this to paginate without OFFSET using the
id
column? (I assume thatreceived_ts
increases asid
increases, and vice versa).