imliuda opened this issue 2 months ago
> there are many slow MySQL queries when running thousands of workflows in parallel, most of which are inserts into the argo_workflows table
> - add appropriate indexes to the archive_workflows and argo_workflows tables
If it's slow on inserts, I don't think an index would improve that performance. Otherwise it would be good to know what slow queries you see on what version of Argo
> - compress node statuses or the workflow before saving it to MySQL
We might be able to do this for `status.nodes` specifically, but other parts of the `status` for archived workflows are used during queries (see also #13203).
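For illustration, here is a minimal sketch (not the actual implementation in the PR mentioned later in this thread) of what gzip-compressing the serialized `status.nodes` before offloading could look like; the function names are hypothetical:

```go
package offload

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"io"
)

// compressNodes marshals an arbitrary nodes structure to JSON and gzips it
// before it is written to the offload/archive table.
func compressNodes(nodes interface{}) ([]byte, error) {
	raw, err := json.Marshal(nodes)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompressNodes reverses compressNodes into the provided destination.
func decompressNodes(blob []byte, dest interface{}) error {
	zr, err := gzip.NewReader(bytes.NewReader(blob))
	if err != nil {
		return err
	}
	defer zr.Close()
	raw, err := io.ReadAll(zr)
	if err != nil {
		return err
	}
	return json.Unmarshal(raw, dest)
}
```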
> - consider using redis or s3 to save node status or archived workflows
There's an existing feature request for blob storage: #4162. It would be insufficient for archived workflows though, given that they won't be queryable as a blob, as one of the comments lays out (as well as some workarounds). Similarly, I'm not sure Redis/Valkey or similar would suffice for archived workflows queries (it potentially could, but may require some transactions, denormalization, and/or manual indexes).
But both might suffice for status offloading 🤔
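For status offloading specifically, a rough sketch of what a Redis-backed store could look like, assuming go-redis v9 and a hypothetical `offload:<uid>:<version>` key scheme (this is not an existing Argo API):

```go
package offload

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// saveOffloadedNodes stores the (optionally compressed) nodes blob keyed by
// workflow UID and version, with a TTL so stale entries eventually expire.
func saveOffloadedNodes(ctx context.Context, rdb *redis.Client, uid, version string, blob []byte, ttl time.Duration) error {
	key := fmt.Sprintf("offload:%s:%s", uid, version) // hypothetical key layout
	return rdb.Set(ctx, key, blob, ttl).Err()
}

// loadOffloadedNodes fetches the blob back for a specific workflow version.
func loadOffloadedNodes(ctx context.Context, rdb *redis.Client, uid, version string) ([]byte, error) {
	return rdb.Get(ctx, fmt.Sprintf("offload:%s:%s", uid, version)).Bytes()
}
```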
> and daily archived workflows clean-up may take a very long time
We might want to add configurable workers for this, i.e. `--archive-ttl-workers`.
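A rough sketch of what such a worker pool could look like (the flag and the `deleteArchived` callback are hypothetical, not existing Argo code):

```go
package gc

import "sync"

// deleteExpiredArchivedWorkflows fans the expired-workflow UIDs out to a
// configurable number of workers instead of deleting them serially.
func deleteExpiredArchivedWorkflows(uids []string, workers int, deleteArchived func(uid string) error) {
	jobs := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for uid := range jobs {
				_ = deleteArchived(uid) // errors would be logged in a real implementation
			}
		}()
	}
	for _, uid := range uids {
		jobs <- uid
	}
	close(jobs)
	wg.Wait()
}
```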
> If it's slow on inserts, I don't think an index would improve that performance. Otherwise it would be good to know what slow queries you see on what version of Argo
I added the following index, and the workflows that errored with `Deadlock found when trying to get lock; try restarting transaction` disappeared. I guess the error is caused by the delete query right next to the successful insert SQL in the `OffloadNodeStatusRepo.Save()` method.
```sql
alter table argo_workflows add index `argo_workflows_i2` (`clustername`, `uid`, `version`, `updatedat`);
```
Hmm, looks like the `argo_workflows` table may not have indexes? #8860 added them for archived workflows (which do have different query patterns). Would you like to submit a PR to add them for offloaded Workflows?
> I guess the error is caused by the delete query right next to the successful insert SQL in the `OffloadNodeStatusRepo.Save()` method.
Yea that might make more sense since a deletion has to do a lookup. That specific deletion may be removed in #13286
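To make the discussion concrete, here is a rough schematic of the insert-then-delete pattern being described (not the actual `OffloadNodeStatusRepo.Save()` code; column names are inferred from the index above). Without an index covering `clustername`/`uid`/`version`, the trailing `DELETE` has to scan for the rows to remove, holding locks long enough to deadlock with concurrent inserts:

```go
package offload

import "database/sql"

// save is a schematic of the pattern: insert the newest version of the
// offloaded nodes, then delete superseded versions of the same workflow.
func save(db *sql.DB, clusterName, uid, version string, nodes []byte) error {
	if _, err := db.Exec(
		`INSERT INTO argo_workflows (clustername, uid, version, nodes, updatedat)
		 VALUES (?, ?, ?, ?, NOW())`,
		clusterName, uid, version, nodes,
	); err != nil {
		return err
	}
	// This DELETE is the lookup in question: without a covering index it scans,
	// and it is the deletion that may be removed in #13286.
	_, err := db.Exec(
		`DELETE FROM argo_workflows WHERE clustername = ? AND uid = ? AND version != ?`,
		clusterName, uid, version,
	)
	return err
}
```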
> Otherwise it would be good to know what slow queries you see on what version of Argo
This would still be useful -- your Server and Controller logs should show any slow queries
Argo version is v3.4.8; most of the slow queries are inserts.
When there are lots of slow queries, that delete may cause deadlocks or lock wait timeouts. I think https://github.com/argoproj/argo-workflows/pull/13286 will help; if that PR is merged, I think the secondary index above is not necessary. But if we remove that delete code, could table records grow continuously because the periodic GC is slower than the inserts?
@agilgur5 Anton, I opened a pull request that adds some code to compress nodes for both offloading and archiving. Can you help review it when you have time?
> Argo version is v3.4.8; most of the slow queries are inserts.
@imliuda when providing logs, please use text instead of images, as text is much more accessible.
Otherwise that's a great data point; I'm surprised `INSERT` is so significantly slower on so many queries. `DELETE` looks like it could be worthwhile to add an index to as well.
> I think https://github.com/argoproj/argo-workflows/pull/13286 will help; if that PR is merged, I think the secondary index above is not necessary
Mmm the index would still make the periodic GC deletes faster
> But if we remove that delete code, could table records grow continuously because the periodic GC is slower than the inserts?
Technically, I think that's always been possible.
I mentioned above that we could add `--archive-ttl-workers` to configure this. Maybe `--archive-ttl-frequency` too.
Oh, this is for offloading though. We might be able to add something similar, but I'm actually not familiar with the code for offloading's periodic GC or whether it works the same way as all other Controller GC (which generally has a configurable number of workers).
For this proposal, I see it was mentioned "consider using redis or s3 to save node status or archived workflows". Have we considered alternatives to SQL for offloading?
My company is facing similar issues as mentioned here, where we run super large-scale batch workflows which consume more than 50% of etcd under load. Having a performant alternative to etcd as Argo's stateful backing store would greatly benefit us. We also have an organizational aversion to relational databases, so I'm wondering if we've considered things like S3 or DynamoDB for offloading? These are both extremely performant over a network and might be easier to manage than a relational DB?
Summary
We enabled node status offloading and workflow archiving, and we have observed some performance and stability issues.
Use Cases
When running a large cluster, kube-apiserver may be the performance bottleneck, so we need offloading and archiving to reduce the load on kube-apiserver. Currently, they need some optimisation and improvement to be stable in a production environment.
Some thoughts