unisonweb / unison

A friendly programming language from the future
https://unison-lang.org

slow to fetch term via API (only) in large codebase #1828

Open atacratic opened 3 years ago

atacratic commented 3 years ago

I click to fetch a term via the API (so via the codebase-ui in fact) and my browser takes ~13 seconds to show me the term, with ucm using max CPU during that time.

Fetching the same term (by name) using view from ucm itself takes only ~0.5 seconds.

Just terms, not types.

Presumably a function of my codebase, which on disk is using 2.9GB, almost all in the form of 170k files under v1/paths. FWIW when I do ls in ucm at the root there are 12k definitions.

versions: unison:trunk@2b83c9 codebaseui:main@e627cb

pchiusano commented 3 years ago

@atacratic thanks for the report. Just to clarify, you are saying the slowness only happens when fetching a term, but fetching a type is still relatively speedy? If so that's probably a good clue.

Also curious if the slowness is just some of the terms, or all of them.

atacratic commented 3 years ago

Correct!

And seems to be all of the terms.

aryairani commented 3 years ago

Not sure if I'm seeing the exact same problem as @atacratic, but it looks very similar: a short lag when loading types, and a long lag when loading terms; all of them afaict. https://user-images.githubusercontent.com/538571/109903334-a82f1400-7c69-11eb-8180-4b44168b65b9.mov

aryairani commented 3 years ago

However, I'm seeing 2-3% CPU usage, not 100%, with my lag. Also >9GB memory usage. :)

runarorama commented 3 years ago

I did some profiling of this, and I am seeing that the server version of this is much slower than the UCM version. However, all of the difference is taking place inside Servant. I suspect that turning the definitions into JSON is what's taking the bulk of the time.
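One way to test that suspicion is to time the load and the JSON encoding as separate phases. The following is only a rough sketch in Python, not the actual server code: `time_phases`, the `definition` value, and the name inside it are hypothetical stand-ins, and the real server does its encoding in Haskell behind Servant.

```python
import json
import time


def time_phases(load, encode):
    """Run load(), then encode(result); return the output and seconds per phase."""
    t0 = time.perf_counter()
    value = load()
    t1 = time.perf_counter()
    encoded = encode(value)
    t2 = time.perf_counter()
    return encoded, {"load": t1 - t0, "encode": t2 - t1}


# Hypothetical stand-in for a term definition returned by the API.
definition = {"name": "example.term", "segments": ["a"] * 1000}

encoded, timings = time_phases(lambda: definition, json.dumps)
for phase, seconds in timings.items():
    print(f"{phase}: {seconds:.6f}s")
```

If the "encode" phase dominates, the time is going into serialization; if "load" dominates, it is going into fetching the definition.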

pchiusano commented 3 years ago

@runarorama Interesting... some random ideas I thought of that might help -

That's all I got.

runarorama commented 3 years ago

Yeah, the server always calls branchFromFiles since it doesn't know up front which branch hash is going to get requested. More specifically, it's called by getBranchForHash, which we need to call.

pchiusano commented 3 years ago

Right, that makes sense. It's fine if that function gets called once for each subnamespace of the branch in question (each level of the tree involves a separate load from disk), but if it gets called again after you already have your Branch m, that implies it's looking through history.

runarorama commented 3 years ago

No, it's just called once. It's never called for the UCM version though, since that holds on to the root and subscribes to root changes.
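To illustrate the difference between loading per request and holding on to an already loaded branch, here is a minimal sketch in Python. It is not Unison's code: `BranchCache` and `slow_load` are hypothetical, and it assumes a branch hash always identifies the same immutable content, so a cached entry never goes stale.

```python
class BranchCache:
    """Keep loaded branches in memory, keyed by hash (hypothetical helper)."""

    def __init__(self, load_branch):
        self._load = load_branch  # the expensive loader, e.g. a read from disk
        self._by_hash = {}
        self.loads = 0            # how many times the loader actually ran

    def get(self, branch_hash):
        # Only fall through to the expensive loader on a cache miss.
        if branch_hash not in self._by_hash:
            self._by_hash[branch_hash] = self._load(branch_hash)
            self.loads += 1
        return self._by_hash[branch_hash]


def slow_load(branch_hash):
    # Stand-in for an expensive load from the codebase on disk.
    return {"hash": branch_hash, "definitions": []}


cache = BranchCache(slow_load)
first = cache.get("abc123")
second = cache.get("abc123")  # served from memory, no second load
```

With this shape, repeated requests for the same hash pay the load cost once, which is roughly the advantage UCM gets by keeping the root in memory.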