julia-vscode / SymbolServer.jl

Other
23 stars 31 forks source link

Add a couple of small performance tweaks #177

Closed timholy closed 4 years ago

timholy commented 4 years ago

This adds a few optimizations that drop the time spent on the same workload described in the OP of #160 from 25.2s to 21.0s.

However, I think the "unsorted_names" is kind of a red herring, and I won't be remotely offended if you decide to just close this and go a different way. From a broader perspective, my sense is that split_module_names is doing something quite inefficient. It accounts for more than one-third of the total run time, and for comparison:

julia> using Images, Profile, ProfileView  # same workload as in #160
Gtk-Message: 06:47:57.888: Failed to load module "canberra-gtk-module"
Gtk-Message: 06:47:57.889: Failed to load module "canberra-gtk-module"

julia> using MethodAnalysis

julia> const allnames = Symbol[]
Symbol[]

julia> @time visit() do item
           if item isa Module
               append!(allnames, names(item, all=true, imported=true))
               return true
           end
           false
       end
  0.196790 seconds (607.61 k allocations: 38.474 MiB, 4.46% gc time)

I know split_module_names is doing a little more work than this, but if you profile the work seems essentially negligible: it's mostly from calling unsorted_names (which is why the sorting in names showed up on my profiling). I'm guessing it's doing it redundantly somehow. If you call it less, then I think the sorting is unlikely to be something that matters, which is why I say that I doubt you need much or any of the work in this PR.

davidanthoff commented 4 years ago

This is awesome! At some point we should do whatever magic you are doing here for the symbol server process for the entire language server process: that is actually the place where we have the most painful startup delays right now that really hurt the user experience (because things like block execute only start to work once the language server is up and running).

timholy commented 4 years ago

I'd be happy to help, if you can provide a command-line runnable set of operations that should be optimized.

A lot of this is just profiling, although I should add that most of my analyses have required something extra, a sense that certain operations should be faster than they are (based on past experience). That is to say, the implementation of the code itself is top-notch with very little to gain from micro-optimization (and the profile result says as much), but there's something about the algorithm that is getting in the way---which is not remotely surprising, given the recursive nature of the task at hand. So the profiling is really only useful to direct attention to particular parts, from which the less-obvious stuff begins. Happy to contribute as I can to other areas, especially if you have a sense that it's where the true bottleneck lies.

For the record my results in #160 seemed to suggest that you could be able to get another doubling or so in speed, if that PR didn't skip work that it shouldn't have. That's roughly in line with the contribution of split_module_names, so my guess is that it's the last frontier for optimization before SymbolServer qualifies as a true speed demon :smiling_imp:.

ZacLN commented 4 years ago

This is a brute force approach every step of the way so there are definitely improvements to be made there. The caching of all names for each module for each category (exported, + internal, + imported) would get some speed ups but a particulary slow part is determining whether a module is used by another (for all modules). Is there a way of checking this other than by comparing names available with those exported?

Edit - this PR looks good regardless of algorithmic changes

timholy commented 4 years ago

I wasn't entirely sure I understood what split_module_names was doing, so am I right in saying that it's trying to find the "origin" of each name in a module?

Would this help?

julia> module A
       module B
           export f
           f(x) = 1
       end
       using .B
       end
Main.A

julia> parentmodule(A.f)
Main.A.B
ZacLN commented 4 years ago

https://github.com/julia-vscode/SymbolServer.jl/pull/182/commits/76c02f248ecf8b4e6e1f0196ef932472f0570773

It's an attempt to capture symbols that are brought into a module via using somemodule: somevar and therefore don't get listed by names. There could well be a better way..

timholy commented 4 years ago

Interesting. Seems like an omission for names, frankly. https://github.com/JuliaLang/julia/issues/36529

timholy commented 4 years ago

Would you like that list to include "implicitly-usinged" names (e.g., from Base), or only "explicit-usinged" names (e.g., using somemodule: somevar)?

pfitzseb commented 4 years ago

Only the latter, I think. https://github.com/julia-vscode/SymbolServer.jl/pull/185 is an attempt to fix this here, but of course a fix in Base would be preferable :)

timholy commented 4 years ago

I think I mostly know what needs to be done, short of me failing to understand how using SomeModule: someobject is actually implemented. I've poked around unsuccessfully, so I've got a query out on slack. :crossed_fingers: