Open wphicks opened 1 year ago
Note: this issue replaces the original #3501, since there has been a fair amount of movement on this problem since then.
Some of these files, especially those that require pairwise distances, will likely benefit from using the pre-compiled specializations that are already being built in RAFT. We've gone through most of them and updated them, but unfortunately a few are still lingering and might need to be updated (thinking single linkage, trustworthiness, silhouette_score, hdbscan).
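For anyone unfamiliar with the idea, here is a minimal sketch of how pre-compiled specializations cut build time in plain C++. It is not RAFT's actual API: `l2_distance` and the file names in the comments are hypothetical stand-ins. An `extern template` declaration in the header stops each including translation unit from instantiating the template itself, and a single source file provides the one compiled copy.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// ---- would live in a header (hypothetical distance.hpp) ----

// Hypothetical stand-in for a templated kernel that is expensive to compile.
template <typename T>
T l2_distance(const std::vector<T>& a, const std::vector<T>& b)
{
  assert(a.size() == b.size());
  T acc = T(0);
  for (std::size_t i = 0; i < a.size(); ++i) {
    const T d = a[i] - b[i];
    acc += d * d;
  }
  return std::sqrt(acc);
}

// Explicit instantiation declarations: translation units that include the
// header do not instantiate these specializations implicitly and instead
// link against the single pre-compiled copy.
extern template float l2_distance<float>(const std::vector<float>&,
                                         const std::vector<float>&);
extern template double l2_distance<double>(const std::vector<double>&,
                                           const std::vector<double>&);

// ---- would live in exactly one source file (hypothetical distance.cpp) ----

// Explicit instantiation definitions: the specializations are compiled once here.
template float l2_distance<float>(const std::vector<float>&,
                                  const std::vector<float>&);
template double l2_distance<double>(const std::vector<double>&,
                                    const std::vector<double>&);
```

Callers use `l2_distance(a, b)` exactly as before. The saving comes from each `float`/`double` specialization being compiled once rather than in every file that includes the header.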
Just submitted #5061 to get the low-hanging fruit, but silhouette score remains our most significant bottleneck.
Summary
It would be useful to improve build time as much as possible for faster developer iteration and reduced CI resource consumption.
Current bottleneck files
The following are roughly in order of priority. Note that this order does not strictly follow build time (although it mostly does). It also takes into account what can and can't be parallelized, so that we tackle each dependent branch of the build.