Closed: davoustp closed this issue 1 year ago
I've been wondering if anyone uses the PostgreSQL transport in Rebus, so now I know! When I originally ported the transport from MSSQL years ago, I'm not sure why I used clock_timestamp, so I've fixed it as you suggested. Thanks for the detailed analysis, it was very interesting.
@jmkelly Any known reason that this same cleanup query would run ~150x per second and be rolled back every time? It executes successfully, but I guess since this method takes in an existing transaction, something down the line fails and causes even this successful command to be rolled back.
@sjd2021 Short answer is I don't know of a reason. In the micro-benchmarks I've run, it's not showing up.
I do find it odd that within the cleanup method, the delete is called in a while loop that exits when no records are deleted. Given this cleanup is called on a 60-second timer, I think the loop could be unnecessary.
Do you have an example project I could run up locally to test?
@jmkelly I hope to have a reproducible example soon. I've only used it with Elsa thus far and have a hard time figuring out where the culprit lies.
Thanks for the quick response
Should be fixed by @jmkelly in #39
It's out as Rebus.Postgresql 9.0.0-alpha04 on NuGet.org now 🙂
> Do you have an example project I could run up locally to test?
@jmkelly I've submitted a new issue with a repro: https://github.com/rebus-org/Rebus.PostgreSql/issues/43
Symptom
The Postgres engine is hit heavily, showing high CPU usage (between 0.5 and 1 full CPU core).
Analysis
Connect to the Postgres engine and inspect the currently executing statements, which shows:
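The running statements can be observed with a query along these lines (a sketch; pg_stat_activity is a standard Postgres view, and one would expect the cleanup DELETE to appear here repeatedly):

```sql
-- List currently executing statements, longest-running first.
SELECT pid, now() - query_start AS runtime, state, query
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY runtime DESC;
```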
To understand why these requests are so expensive, look at the execution plan:
Result:
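A sketch of the kind of EXPLAIN one would run here (the table name rebus_messages is an assumption; the actual name depends on the configured Rebus transport table, and the real cost figures come from the actual run):

```sql
EXPLAIN
DELETE FROM rebus_messages
WHERE expiration < clock_timestamp();

-- Expected shape of the plan: a sequential scan over the whole table,
-- because the volatile clock_timestamp() prevents index usage, e.g.:
--   Delete on rebus_messages  (cost=0.00..15127.39 ...)
--     ->  Seq Scan on rebus_messages
--           Filter: (expiration < clock_timestamp())
```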
This is a full table scan (Seq Scan), which is clearly the root cause of the high resource consumption.
The main problem is that the query is not sargable (indexes cannot be used). Why? Because the WHERE clause uses a non-constant value: clock_timestamp()
This is described in https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT:

> clock_timestamp() returns the actual current time, and therefore its value changes even within a single SQL command.
Since its value changes during statement execution, Postgres MUST invoke it for each row to check the condition. This is very inefficient, and it likely performs a system call each time, which is even less efficient.
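This follows from Postgres function volatility categories: now() is marked STABLE (its result is fixed for the duration of the statement, and it returns the transaction start time), while clock_timestamp() is VOLATILE. This can be verified in the system catalog:

```sql
-- provolatile: 's' = stable, 'v' = volatile
SELECT proname, provolatile
FROM pg_proc
WHERE proname IN ('now', 'clock_timestamp');
```

Only stable (or immutable) comparison values let the planner treat the predicate as a constant range condition that an index can serve.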
Proposed change
The obvious fix is either to compute the current date client-side, or to use another function such as now() (equivalent to transaction_timestamp()), whose value is constant across the entire transaction.
The execution then becomes:
Output:
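As a sketch (table and column names assumed, as above), the revised statement and the expected plan shape:

```sql
EXPLAIN
DELETE FROM rebus_messages
WHERE expiration < now();

-- now() is evaluated once for the transaction, so the planner can compare
-- the expiration column against a constant and choose an Index Scan
-- instead of a Seq Scan, provided an index covers the column.
```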
Now it uses the index, which is better, but the cost is still high.
This can be further optimized by adding a dedicated index on the expiration column.
The execution plan is then:
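A sketch of the index creation and the resulting plan shape (the index and table names here are assumptions; the real names depend on the configured Rebus table):

```sql
-- Dedicated index so the expired-message cleanup becomes a cheap range scan.
CREATE INDEX IF NOT EXISTS idx_rebus_messages_expiration
    ON rebus_messages (expiration);

-- With this index, the DELETE's plan becomes an Index Scan on
-- idx_rebus_messages_expiration with the simple range condition
-- (expiration < now()), touching only the expired rows.
```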
The cost is now 1.55 compared to the initial 15127.39, about 4 orders of magnitude lower.
This is easily done in https://github.com/rebus-org/Rebus.PostgreSql/blob/master/Rebus.PostgreSql/PostgreSql/Transport/PostgresqlTransport.cs, by replacing clock_timestamp() with now().
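In SQL terms, the change to the cleanup statement amounts to the following (a sketch; the actual Rebus query text and its parameters are omitted here, and the table name is assumed):

```sql
-- Before: VOLATILE, re-evaluated per row, defeats index usage
DELETE FROM rebus_messages WHERE expiration < clock_timestamp();

-- After: STABLE within the transaction, index-friendly
DELETE FROM rebus_messages WHERE expiration < now();
```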