Closed orf closed 2 years ago
FYI, we deployed this change and saw an immediate drop in our write throughput, network traffic, and write IOPS.
Can you confirm this won't create any regression?
Is it ever possible to confirm that it won’t cause any regression?
All I can say is that the current code doesn’t need to update anything other than the count column, and that I’ve validated this on complex real world workflows and it works as expected.
One of our larger chords results in a ~7 MB JSON string being updated every time “save” is called, which may be multiple times per second. Our internal monitoring showed that Postgres triggered about 4,700 “internal” row updates and deletes to complete this, which puts significant stress on the various IO subsystems within PG.
Throughout the execution of all of these queries, the JSON string remained the same.
Yup, legit, makes sense. Thanks a lot.
With the current implementation the following SQL query will be continually executed on every task:
If you are using a large number of sub-tasks, ranging in the thousands or tens of thousands, then continually sending the `sub_tasks` in every update can be very expensive. If we use `update_fields` then we can skip this.
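To make the difference concrete, here is a minimal sketch using only the standard-library `sqlite3` module. It is not the ORM's implementation: the `chord_counter` table, its column names, and the `build_update` and `payload_size` helpers are all hypothetical stand-ins, chosen only to show how restricting the write to the count column shrinks what is sent on each update. In Django itself the equivalent narrowing would be done with `instance.save(update_fields=["count"])`.

```python
import json
import sqlite3

# Hypothetical stand-in for the chord counter table; column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE chord_counter ('
    'id INTEGER PRIMARY KEY, group_id TEXT, sub_tasks TEXT, "count" INTEGER)'
)


def build_update(row, update_fields=None):
    """Return (sql, params) for an UPDATE of `row`.

    With update_fields=None every non-key column is written, which is
    roughly what a plain save() does. Otherwise only the named columns
    appear in the SET clause.
    """
    columns = [c for c in row if c != "id"]
    if update_fields is not None:
        columns = [c for c in columns if c in update_fields]
    assignments = ", ".join(f'"{c}" = ?' for c in columns)
    sql = f"UPDATE chord_counter SET {assignments} WHERE id = ?"
    params = [row[c] for c in columns] + [row["id"]]
    return sql, params


def payload_size(params):
    """Rough size of the bound parameters, in characters."""
    return sum(len(str(p)) for p in params)


# A chord with 10,000 sub-tasks serialised as one JSON string.
row = {
    "id": 1,
    "group_id": "g1",
    "sub_tasks": json.dumps([f"task-{i}" for i in range(10_000)]),
    "count": 10_000,
}
conn.execute(
    "INSERT INTO chord_counter VALUES (:id, :group_id, :sub_tasks, :count)", row
)

# One sub-task finishes: only the counter changes, the JSON string does not.
row["count"] -= 1
full_sql, full_params = build_update(row)
narrow_sql, narrow_params = build_update(row, update_fields=["count"])

# Apply the narrow update; the unchanged JSON string is never re-sent.
conn.execute(narrow_sql, narrow_params)

print(full_sql)
print(narrow_sql)
print(payload_size(full_params), "vs", payload_size(narrow_params))
```

The full update re-sends the unchanged `sub_tasks` string on every call, while the narrow one binds only the new count and the primary key.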