purpleidea opened this issue 9 months ago
@frebib If you could push (even a non-compiling) POC of your autoedges branch, that would be awesome!
A bit more data, to be sure:
main: new graph took: 913.653µs
main: auto edges took: 9.273807153s
main: auto grouping took: 28.690819ms
main: send/recv building took: 566ns
main: new graph took: 779.255µs
main: auto edges took: 4.03670168s
main: auto grouping took: 37.682101ms
main: send/recv building took: 121.017µs
main: new graph took: 1.157479ms
main: auto edges took: 3.794132165s
main: auto grouping took: 49.732836ms
main: send/recv building took: 95.921µs
main: new graph took: 900.937µs
main: auto edges took: 7.206085s
main: auto grouping took: 25.508671ms
main: send/recv building took: 489ns
main: new graph took: 794.224µs
main: auto edges took: 4.313729756s
main: auto grouping took: 47.970533ms
main: send/recv building took: 207.62µs
main: new graph took: 884.49µs
main: auto edges took: 7.585529786s
main: auto grouping took: 24.327938ms
main: send/recv building took: 72.741µs
main: new graph took: 774.157µs
main: auto edges took: 2.827380129s
main: auto grouping took: 28.303023ms
main: send/recv building took: 85.246µs
main: new graph took: 746.841µs
main: auto edges took: 2.775868117s
main: auto grouping took: 33.11291ms
main: send/recv building took: 104.875µs
main: new graph took: 796.445µs
main: auto edges took: 2.71556122s
main: auto grouping took: 24.03827ms
main: send/recv building took: 106.414µs
main: new graph took: 1.217452ms
main: auto edges took: 2.908416104s
main: auto grouping took: 61.175916ms
main: send/recv building took: 92.328µs
main: new graph took: 807.894µs
main: auto edges took: 3.222089261s
main: auto grouping took: 40.032629ms
main: send/recv building took: 106.49µs
main: new graph took: 986.963µs
main: auto edges took: 3.538425263s
main: auto grouping took: 30.660849ms
main: send/recv building took: 99.74µs
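For reference, here is a minimal sketch of how timings like the ones above could be collected; the timed helper and its call sites are hypothetical, not the actual mgmt code.

```go
package main

import (
	"log"
	"time"
)

// timed runs one graph-building step and logs its duration in the same
// format as the output above. Purely illustrative.
func timed(name string, fn func() error) error {
	start := time.Now()
	err := fn()
	log.Printf("main: %s took: %v", name, time.Since(start))
	return err
}

func main() {
	// Placeholder standing in for e.g. the auto edges pass.
	_ = timed("auto edges", func() error {
		time.Sleep(10 * time.Millisecond)
		return nil
	})
}
```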
I haven't worked on this since the cfgmgmtcamp hacking day and I only just made it compile. No idea if it works, but have at it: https://github.com/frebib/mgmt/commits/frebib/autoedge/
Description
Autoedges is not as fast as it could be. It's not blocking us anywhere, but we should think about eventually optimizing this to lower graph sync latency.
Analysis
For each new resource graph (in the stream of resource graphs) that is generated, we must compute the automatic edges. I ran this on my provisioning code base. (It will be published shortly.) Currently this is a bit slow. I added some light profiling to the graph generation steps, and I got timings like the ones pasted in the comment above.
Note the units. What surprised me was that I expected autogrouping to be the slower operation, but in fact it is not. This is obviously highly dependent on the specific mcl code being run, but it gives us a good rough idea.
Improvements
There are both algorithmic improvements and plumbing improvements that we could do.
Algorithmic
@frebib has some initial POC work on making the algorithm O(n log n), IIRC. I think it's currently O(n^2).
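As a very rough illustration of the idea (not the actual mgmt types or algorithm), the pairwise scan could be replaced by an index built once over all resources, so each lookup is a constant-time map hit instead of a scan of every other resource:

```go
package sketch

// Res and uidKey are hypothetical stand-ins for mgmt's resource and
// UID types; the real matcher works on the UIDs each resource offers.
type Res struct {
	Kind string
	Name string
}

type uidKey struct {
	kind string
	name string
}

// buildIndex indexes every resource by the UIDs it offers, once, in O(n).
func buildIndex(resources []*Res) map[uidKey][]*Res {
	index := make(map[uidKey][]*Res, len(resources))
	for _, r := range resources {
		k := uidKey{kind: r.Kind, name: r.Name} // real code would use the resource's UIDs
		index[k] = append(index[k], r)
	}
	return index
}

// lookup returns candidate matches in O(1) per wanted UID, instead of
// comparing against every other resource in the graph.
func lookup(index map[uidKey][]*Res, want uidKey) []*Res {
	return index[want]
}
```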
Plumbing
There are likely some efficiency wins if we look at the code. This needs proper profiling (even flamegraphs?) to know where we're actually slow.
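As a starting point, something like the following (illustrative, not currently wired into mgmt) would capture a CPU profile of the graph-building section that can then be rendered as a flamegraph:

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

func main() {
	// Write a CPU profile covering the expensive section; view it as a
	// flamegraph with: go tool pprof -http=:0 cpu.prof
	f, err := os.Create("cpu.prof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	// ... run the auto edges pass here (placeholder) ...
}
```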
We could also consider adding a static API at resource creation that defines which resource kind pairs are even legally able to create auto edges. For example, if we know that svc can only ever autoedge with file and pkg, this reduces our search space drastically. Keep in mind that either resource side needs to be able to opt in.
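A hypothetical sketch of what such a static API could look like (the names are made up; this is not the existing resource interface):

```go
package sketch

// AutoEdgeKinds is a hypothetical opt-in interface: a resource declares
// up front which resource kinds it can ever form auto edges with, so
// the matcher can skip every other kind pair entirely.
type AutoEdgeKinds interface {
	AutoEdgeKinds() []string // e.g. a svc resource might return []string{"file", "pkg"}
}

// compatible reports whether a kind pair should be considered at all,
// given a precomputed table of legal pairs; either side opting in is
// enough, matching the "either resource side" note above.
func compatible(pairs map[string]map[string]bool, a, b string) bool {
	return pairs[a][b] || pairs[b][a]
}
```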
Concurrency
Graph generation happens in parallel with execution, so this isn't blocking the running graph. We could double-check that it's nominally concurrent, and we could make sure we only pause the running graph after the new one is built, but this is very low-priority. Secondly, I wouldn't be surprised if we could actually implement a concurrent version of the autoedges algorithm that uses more CPUs, but I don't think this is an important optimization to do at this time.
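If we did want a concurrent version, a bounded worker pattern along these lines could spread the per-resource matching across CPUs (illustrative only; the real matching step and its shared graph state would need care):

```go
package sketch

import "sync"

// Res and matchOne are placeholders for the real resource type and
// per-resource matching step.
type Res struct{ Name string }

// matchAll fans the per-resource matching work out over at most
// `workers` goroutines.
func matchAll(resources []*Res, workers int, matchOne func(*Res)) {
	sem := make(chan struct{}, workers) // bound the concurrency
	var wg sync.WaitGroup
	for _, r := range resources {
		wg.Add(1)
		sem <- struct{}{}
		go func(r *Res) {
			defer wg.Done()
			defer func() { <-sem }()
			matchOne(r) // must only read shared state, or lock around writes
		}(r)
	}
	wg.Wait()
}
```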
Generics
@frebib believes the algorithmic implementation would be a lot faster with generics. Avoiding generics should be a goal if we can manage it, but it's an important thing to discuss.
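For context on that discussion, the rough difference is something like this (illustrative only): a generic helper keeps everything concretely typed, whereas a pre-generics equivalent would box values in interface{} and type-assert on every use.

```go
package sketch

// index groups items by a computed key without any interface{} boxing
// or type assertions; a pre-generics version would pay for both on
// every insert and lookup.
func index[K comparable, V any](items []V, key func(V) K) map[K][]V {
	m := make(map[K][]V, len(items))
	for _, it := range items {
		k := key(it)
		m[k] = append(m[k], it)
	}
	return m
}
```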