Closed nicolasmure closed 6 years ago
Merging #451 into master will decrease coverage by
0.02%
. The diff coverage is82.78%
.
@@ Coverage Diff @@
## master #451 +/- ##
============================================
- Coverage 79.38% 79.36% -0.03%
- Complexity 837 868 +31
============================================
Files 56 56
Lines 1979 2069 +90
============================================
+ Hits 1571 1642 +71
- Misses 408 427 +19
Impacted Files | Coverage Ξ | Complexity Ξ | |
---|---|---|---|
src/Server/TextDocument.php | 75.18% <100%> (+0.18%) |
56 <0> (+1) |
:arrow_up: |
src/CompletionProvider.php | 94.23% <100%> (-0.18%) |
97 <0> (-5) |
|
src/Index/AbstractAggregateIndex.php | 92.15% <100%> (ΓΈ) |
26 <6> (ΓΈ) |
:arrow_down: |
src/Index/Index.php | 76.12% <79.2%> (+5.63%) |
57 <33> (+35) |
:arrow_up: |
Nice!
I have a hard time believing the memory usage though - exactly the same amount of bytes used, even though there is a new array that duplicates the whole index? I wonder whether we could just always use the namespace tree.
I've added an other commit to ceate an index representation of the global definitions (reduces the number of checks, depends on your dependencies (10 times faster on my project)).
I think that PHP does not make copies but store references to the definition objects.
As you can see here, I checked the memory consumption usage with memory_get_usage(true)
.
However, I have some other memory consumption metrics : on the current master branch :
on this branch :
I saw that you had a discussion with the maintainer of php-integrator, it could be a good idea to store the indexes in a sqlite DB too instead of in-memory and serialized files (will reduce the RAM consumption IMHO).
I switched to using this in vscode and it is significantly faster using 466MiB compared to 454MiB before. Before it meant this was unusable because autocompletes were over a second.
So I'll keep this branch installed for now so I can actually use the add-on to see if I like it.
Could you try always using the tree? I could imagine it is not significantly slower than the map
Hello @felixfbecker , I'm not sure to get what you're saying :thinking: Do you mean to have an index like this :
$this->definitions['Symfony\Component\HttpFoundation\Request'] => [
'Symfony\Component\HttpFoundation\Request->getLocale()' => $definition
]
and to get rid of the namespaceDefinitions
index ? It would result in a BC break because the serialized index cache will no longer be represented as before.
I mean to get rid of $definitions
and always use namespaceDefinitions
.
Changing the cache format is not a BC break, you can simply increase the cache format version.
Oh yeah I see, I'll try it then ;)
OK @felixfbecker , I added 3 commits to save memory, here are the consumptions (on my testing project):
Master branch | Before these 3 commits | After |
---|---|---|
518Mb | 529Mb | 522Mb |
I also yield indexed definitions to save memory instead of copying them in an array when using AbstractAggregateIndex
.
Does using Generators have an impact on the time? I remember from looking at them earlier, that they can add a surprising overhead.
@roblourens , here are some time records grabbed for the accessor (->
) operator :
With generators :
0.00016880035400391s
0.00014901161193848s
0.00014591217041016s
0.00020003318786621s
0.00025105476379395s
0.00034809112548828s
0.00015783309936523s
0.00014805793762207s
0.00018405914306641s
0.00019478797912598s
0.00015902519226074s
without :
0.0003349781036377s
0.00020289421081543s
0.00030088424682617s
0.00017595291137695s
0.00033283233642578s
0.00017690658569336s
0.0005340576171875s
0.00024986267089844s
0.0003349781036377s
0.00018405914306641s
0.00019097328186035s
These benchmarks were executed in the same way than the ones I posted on the description of this PR.
Generators should actually not decrease memory usage because PHP variables are copy-on-write. But it will reduce time because you save an iteration.
Iterator/Generator also can safe memory when you no longer need to build whole arrays but work item per item (stream like)
That's true if the source is an IO stream, but here the source is an already in-memory array.
@staabm yes and it is definitively the case here : we're just looping over it, using each item one by one (no need to know how many items are in the iterator, nor to access to the ones next to the one currently used).
@felixfbecker as the index used by the LanguageServer
is an AbstractAggregateIndex
, it makes sense IMO to use generators to 'stream' definitions from the aggregated indexes.
Master: 454MiB Before the memory saving commits here: 466MiB Now: 460 MiB
@robclancy yes, make sure that the cache has finished to be populated too before making any mesurements. The last commits have changed the cache version so the server has to rebuild all the cache items, so memory consumption will start from 0 to x until it has completed the project parsing.
You can see when it has finished to built by watching at CPU consumption and also by listing the cache directory by last modification date :
$ watch -n 0.5 ls -c ~/.phpls/
I'm getting my numbers from the output in vscode after the indexing is done.
Master: [Info - 11:55:37 pm] All 8950 PHP files parsed in 25 seconds. 454 MiB allocated.
So my numbers above are correct for this project.
I now have...
This branch: [Info - 1:02:48 am] All 8950 PHP files parsed in 17 seconds. 460 MiB allocated.
This is more inline with your numbers, I'm not sure what happened before as this has come from the same cache.
I've added a commit to correct the terminology (as you pointed out @felixfbecker ).
The serialized cache key has changed (namespaceDefinitions
=> fqnDefinitions
) so be sure to remove the previous cache files generated on this branch before using this lastest commit.
Also, I rebased on the master branch.
Hello @felixfbecker , any updates on this ? :)
Omg, that would be so awesome! Please! π
So I'm now using the single (and multi level) definition index array, but even if phpunit tests are passing, there is an error with what is proposed as completion items. I have this problem on nvim and vscode, they're both sending the same request to the server so it does not appear to be a client error.
The problem is that the resolved expression (ie the left hand side of the cursor) is not good. For instance, if you want to see what are the completion possibilities for the $definition
var of this function, you can add a new line at the beginning of this function, such as
$definition->get|
but you'll see that the proposed completion items are not the good ones (they don't belong to the $definition
object). Actually, in this exemple they belong to the $this
object.
I think there might be an issue with the node resolution (see here, which use the microsoft/tolerant-php-parser). The strange point is that I don't have this problem on the master branch :/
Any hints on this ?
@nicolasmure if you say there is a newline, why can't we strip it? What is different on master? If you push your WIP I can also take a look
@nicolasmure I cannot reproduce your issue:
Other than https://github.com/felixfbecker/php-language-server/pull/451#discussion_r151031720 I think this would be fine to merge
Hello @felixfbecker , Thank you for trying this out. This is indeed the same use case scenario that I did, but I only have the completion items for the current class :
This PR is rebased on master, I use the tolerant-php-parser v0.0.7, I also did a fresh composer install (removed the previous vendors, composer.* files).
BTW It would be nice to squash the commits before merging, when everyone will agree :) .
Would have loved to merge this today but I sometimes don't get any IntelliSense at all:
on master:
There must still be a bug in the logic
Hello, I still have the issue too. I created a PR on my fork where I pushed the debug code. This code will log all the messages received and sent between the client and the server, and will also log the type of node detected on which the cursor is.
Here's the output for the wrong $definition->get|
suggestions :
{"id":31,"method":"textDocument\/completion","params":{"textDocument":{"uri":"file:\/\/\/home\/nicolas\/www\/php-language-server\/src\/Index\/Index.php"},"position":{"line":165,"character":24}},"jsonrpc":"2.0"}
Expr class = Microsoft\PhpParser\Node\Expression\Variable, fullText = ' $this', uri = 'file:///home/nicolas/www/php-language-server/src/Index/Index.php'
Expr name = this
Content-Type: application/vscode-jsonrpc; charset=utf8
Content-Length: 7222
{"result":{"isIncomplete":true,"items":[{"label":"definitions","kind":10,"detail":"array","documentation":"An associative array that maps splitted fully qualified symbol names\nto definitions, eg :\n[\n 'Psr' => [\n '\\Log' => [\n '\\LoggerInterface' => [\n '' => $def1, \/\/ definition for 'Psr\\Log\\LoggerInterface' which is non-member\n '->log()' => $def2, \/\/ definition for 'Psr\\Log\\LoggerInterface->log()' which is a member definition\n ],\n ],\n ],\n]","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"references","kind":10,"detail":"string[][]","documentation":"An associative array that maps fully qualified symbol names\nto arrays of document URIs that reference the symbol","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"complete","kind":10,"detail":"bool","documentation":null,"sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"staticComplete","kind":10,"detail":"bool","documentation":null,"sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"setComplete","kind":2,"detail":"void","documentation":"Marks this index as complete","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"setStaticComplete","kind":2,"detail":"void","documentation":"Marks this index as complete for static definitions and references","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"isComplete","kind":2,"detail":"bool","documentation":"Returns true if this index is complete","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"isStaticComplete","kind":2,"detail":"bool","documentation":"Returns true if this index is complete","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"getDefinitions","kind":2,"detail":"\\Generator","documentation":"Returns a Generator providing an associative array [string => Definition]\nthat maps fully qualified symbol names to Definitions (global or not)","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"getDescendantDefinitionsForFqn","kind":2,"detail":"\\Generator","documentation":"Returns a Generator that yields all the descendant Definitions of a given FQN","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"getDefinition","kind":2,"detail":"\\LanguageServer\\Definition|null","documentation":"Returns the Definition object by a specific FQN","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"setDefinition","kind":2,"detail":"void","documentation":"Registers a definition","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"removeDefinition","kind":2,"detail":"void","documentation":"Unsets the Definition for a specific symbol\nand removes all references pointing to that symbol","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"getReferenceUris","kind":2,"detail":"\\Generator","documentation":"Returns a Generator providing all URIs in this index that reference a symbol","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"getReferences","kind":2,"detail":"string[][]","documentation":"For test use.\n\nReturns all references, keyed by fqn.","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"addReferenceUri","kind":2,"detail":"void","documentation":"Adds a document URI as a referencee of a specific symbol","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"removeReferenceUri","kind":2,"detail":"void","documentation":"Removes a document URI as the container for a specific symbol","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"unserialize","kind":2,"detail":"void","documentation":null,"sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"serialize","kind":2,"detail":"string","documentation":null,"sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"yieldDefinitionsRecursively","kind":2,"detail":"\\Generator","documentation":"Returns a Generator that yields all the Definitions in the given $storage recursively.\n\nThe generator yields key => value pairs, e.g.\n`'Psr\\Log\\LoggerInterface->log()' => $definition`","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"splitFqn","kind":2,"detail":"string[]","documentation":"Splits the given FQN into an array, eg :\n- `'Psr\\Log\\LoggerInterface->log'` will be `['Psr', '\\Log', '\\LoggerInterface', '->log()']`\n- `'\\Exception->getMessage()'` will be `['\\Exception', '->getMessage()']`\n- `'PHP_VERSION'` will be `['PHP_VERSION']`","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"getIndexValue","kind":2,"detail":"array|\\LanguageServer\\Definition|null","documentation":"Return the values stored in this index under the given $parts array.\n\nIt can be an index node or a Definition if the $parts are precise\nenough. Returns null when nothing is found.","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"indexDefinition","kind":2,"detail":"mixed","documentation":"Recursive function that stores the given Definition in the given $storage array represented\nas a tree matching the given $parts.","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null},{"label":"removeIndexedDefinition","kind":2,"detail":"mixed","documentation":"Recursive function that removes the definition matching the given $parts from the given\n$storage array. The function also looks up recursively to remove the parents of the\ndefinition which no longer has children to avoid to let empty arrays in the index.","sortText":null,"filterText":null,"insertText":null,"textEdit":null,"additionalTextEdits":null,"command":null,"data":null}]},"id":31,"jsonrpc":"2.0"}
You can see that the $definition
var is detected as being an instance of LanguageServer\Index\Index
(located in src/Index/Index.php
) :
Expr class = Microsoft\PhpParser\Node\Expression\Variable, fullText = ' $this', uri = 'file:///home/nicolas/www/php-language-server/src/Index/Index.php'
Expr name = this
and the var is treated as if it is a $this
.
The big json response log is the completion suggestions.
I noticed that on the master branch, there is the following discussion between the client and the server when I'm typing $definition->get|
:
textDocument/didChange
=> servertextDocument/publishDiagnostics
=> clienttextDocument/completion
=> serverbut on this branch, the 2nd step is missing :/ I'm gonna investigate on this.
I fixed the bug in the fix definition removal
commit, the completion's now working as expected on this commit.
Then, I noticed that the completion for non member items was still slow (eg for array_ma|
) :
1.8270461559296s, 102184 checks, 1 items
so I tried to differentiate the index array to store the member and non member definitions separately to be able to find them quicker.
The result is (for array_ma|
) :
0.13462400436401s, 12392 checks, 2 items
as you can see it's way faster. However, there's a bug which duplicates the suggestions for these global defs :/ I still have some work to do on it.
See the debug PR for log indications.
Correct me if I'm wrong but isn't the only problem here the fact that the traversal method does a recursive iteration? When requesting completion, isn't it that we only always want the direct children of a given FQN? For example
$obj = new SomeNam| // I want only top-level symbols (classes/namespaces) to get suggested
$obj = new SomeNameSpace\SomeC // I want only the symbols in SomeNameSpace to get suggested
$obj->me| // I want only members of SomeNamespace\SomeClass to get suggested
Maybe I'm missing something, but why does the traversal function used for completion have to be recursive? For completion, don't we always just want to suggest the immediate children of a node in the symbol tree? I don't think we need two arrays.
Yes, I agree. The completion provides the closest items from the given fqn.
The index is interrogated by the completion provider with various fqns. As the definitions in the index are stored in a multi level array, this is why I use recursion.
With your exemple, when typing $obj->me|
, the index is requested to provide all the definitions of the SomeNamespace\SomeClass
. These definitions are stored like this way in the index :
[
'SomeNamespace' => [
'\SomeClass' => [
'' => $def // defs go here
],
],
]
this is why the recusive function is for. It uses recusion to get as deep as it can in the tree to find the definitions which are in the given fqn.
For the two arrays, it may look overkill, but it aims to speed the completion by requesting more specifically the definitions. If you look to the completionProvider on the master branch, you'll see that it requests all the definitions and then makes a check to test whether the def is member or not. This can be optimized because there are a huge amount of checks performed for only a small amount of definitions passing these checks. I store the defs in these 2 arrays to directly retrieve what I'm looking for instead of having to perform checks. Have a look to the benchs, you'll see that the check amount is reduced and the response time is also way reduced. It's even more blatant on large projects with many defs.
Yes, so the method can simply traverse into $definitions['SomeNamespace']['\SomeClass']
(which should have a negligible perf impact) and return those direct children (members) without having to check isMember
anymore (i.e. keeping the same perf improvements to CompletionProvider). That should not be slower than going into your second array, and it will work with Namespaces, not just members. I am reluctant to special-case members over any other level of the symbol tree.
In the fix definition removal
commit (ie before the two arrays), the index tree looks like this :
[
'SomeNamespace' => [
'\SomeClass' => [
'' => $def1, // the `SomeNamespace\SomeClass` class itself
'->method()' => $def2, // a method of this class
],
],
]
When calling getDescendantDefinitionsForFqn('SomeNamespace\SomeClass')
it will yield
'SomeNamespace\SomeClass' => $def1
'SomeNamespace\SomeClass->method()' => $def2
This is acceptable when we know what to look for (ie when we have some parts of a namespace), but in that case, there are too many places to look at (current namespace, global functions, global consts), and there's not always a way to get the defs in these places easily / efficiently. This is why I tried to reduce the number of places to look at by having the 2 arrays.
I agree that it's not pleasant to have these 2 arrays, maybe the completionProvider
requires some changes for this part, and I'm not really comfortable with it for the moment.
I might look insistent on this part but I'd like to keep the perfs improvements for the most cases :)
So if you have an idea on how we could get the definitions for that case efficiently with a simple index π
but in that case, there are too many places to look at (current namespace, global functions, global consts), and there's not always a way to get the defs in these places easily / efficiently.
So in that case, we have to look at the top namespace (first level of the array) and the current namespace (some level x in the array), so just 2 places to look at. That would be bad if it searched recursively, but if we make the method only search one level, it should be equal perf better as the second member array (actually better, because that doesn't help with the top namespace).
Yes, that's why the completionProvider
requires some changes IMO, it handles many places to look at in a single loop, whereas it should handle one place in one dedicated loop at a time by querying the index precisely.
To be continued so...
I might take a stab at this and if it turns out to be too complicated, we can use the two arrays for the meantime.
I changed it to only return direct children in my branch https://github.com/felixfbecker/php-language-server/tree/autocomplet-speedup
case | time for completion |
---|---|
master | 0.1210160255432s |
separate arrays (your branch) | 0.0550549030303s |
only direct children (my branch) | 0.0042061805725s |
looks good? :)
Yup, if tests pass then :+1: :) The next step would be to also work on the last part of the completion provider to query more specific definitions rather that all of them.
It would be good to have some methods in the index (as the getChildDefinitionsForFqn
), which would be more efficient for this case. As said earlier, it would be good to split this case in smaller parts :) .
I haven't tried your branch yet, just took a look at it for the moment. Thanks for the completion bench :+1:
@felixfbecker Did you already worked on your own branch?
I'd be interested in this.
Hello @felixfbecker :) Sorry for the late response, I've checked out your branch and it looks good to me :+1: Maybe we could try to merge it as it, as the PR is opened since a while. The completion improvement for global definitions could be resolved in a second time though. WDYT?
I'm on the fence whether to merge this in this state or not, and I am guilty of procrastinating the decision. You are right, there are some changes that need to happen at the bottom of CompletionProvider to also not O(n) iterate the whole index (note tests are failing on my branch atm). Which shouldn't be that hard, it might even be easy to just rewrite that part from scratch. If that is done, then this PR would be a very clean improvement for both UX and software architecture.
However: In it's current form I feel like this PR adds a lot of complexity, for improving the completion only for members. The CompletionProvider and DefinitionResolver are definitely the most complex parts of the whole server and low complexity is key for bugfreeness and a low barrier for contributions. We kind of went in circles here by starting with an extra array, then going to a tree, then going back to the separate array. My branch was an attempt to base on the state of the PR a few commits ago when it was still a tree, I just didn't have the time to finish it.
By no means I want to discredit your work, I highly respect the work you put into this and the improvement on the UX site is awesome.
Now if anyone wants to bring that pure-tree implementation over the finish line that would of course make the decision easier :)
Curious to hear your thoughts
Merry Christmas!
Thank you for your answer, I agree with you, there were a lot of decision changes in this PR. I tried to look at the bottom of the CompletionProvider yesterday but as you said it might require a rewrite from scratch, so I wondered if you agreed with the current state of the code. To be honest I'm using this branch for now on my machine, and it works good (completion is fast), so even if I'd like to help with the rewrite of the CompletionProvider, it's less crucial to me at the moment to spend time and energy on it. Maybe that now that there are more and more text editors that are adopting LSP, there will have some developers that will have time and motivation to solve this problem :) .
Anyway, thank you for your attention on this PR and for sharing your thoughts :+1: , and also for the maintenance work you're doing here ;)
To clarify, I only meant rewriting that very last section of CompletionProvider (that provides completion for global names).
Current stable is basically unusable because of how slow it is... I haven't been following the code here but shouldn't the end user experience be more important than how complicated the code is?
Since this isn't directly an add-on but installed by the add-on would it be possible to put this on a beta branch and then in the add-on have a "use beta language server" option?
I made the requested change in my branch.
Turns out I can't push commits to this PR, and I wouldn't really like to open a new PR because that would make it seem like I did all the hard work, not @nicolasmure.
Benchmarks before and after (benchmarks/completion.php
, as modified in my branch):
Before:
Getting completion
Time ($this->|): 0.0012650489807129s
90 completion items
Time (new|): 0.095192909240723s
1 completion items
After:
Getting completion
Time ($this->|): 0.0012638568878174s
90 completion items
Time (new|): 0.00032806396484375s
1 completion items
Hello,
This PR speeds up the autocomplete response time for the object access operator (
->
) and the static access operator (::
).On large projects (with many definitions indexed), these autocomplete features were long to respond because all the indexes were parsed. I added a new way to represent the indexed definition by their namespace (eg
Symfony\Component\HttpFoundation\Request
). So, when requesting for autocomplete definitions of a symbol, we can use this namespace index directly instead of parsing all the indexes.The new index key is stored in memory only and is not written in the cache file. It is rebuilt when reading the cache file, so it has no negative impacts on memory consumption.
Some metrics (using a symfony 3.1.10 project, see the first commits to figure out how they were recorded) :
Autocomplete requests on
$request->|
:before :
after :
Autocomplete requests on
Response::H|
:before :
after :
The time to find the items was reduced from ~2s to ~0.0002s, which is 10,000 times faster :rocket: . It also reduces a lot the CPU consumption because less checks are performed.
I know that there are some feature requests to avoid some directories indexation, but I think that this PR could be useful too to speed up response times ;)