Implement token-based language service

jeremy447 commented 4 years ago

Go to Definition and Find All References are not 100% reliable in VS Code. The problem occurs in VS Code itself for the languages it handles, and in extensions for the other languages. VS Code (or the extension) sometimes doesn't find any definition, or doesn't find all references. That happens because finding definitions and references depends on a certain level of "understanding" of the code. So when the code does something that makes definitions and references hard to find, you end up with something broken.

I don't have an example for a language that VS Code handles, but they're not hard to find. I have one from the PrestaShop 1.6.1 source code with the Intelephense extension. If you try Go to Definition on $product->addCombinationEntity in controllers/admin/AdminImportController.php, it doesn't find any definition, and Find All References on $product->addCombinationEntity doesn't find anything either. It fails because of the way PrestaShop names and loads classes (https://github.com/bmewburn/vscode-intelephense/issues/700).

On the other hand, Go to Definition and Find All References work 100% of the time in Sublime Text, and a lot of developers would love to have something similar in VS Code. It seems that Sublime just does a text search to show definitions and references. It's very basic, since it shows you everything it finds, even in other languages (e.g. JS when you are in a PHP file), but it finds everything and it's very reliable. If we could have the same thing in VS Code, maybe even better by limiting the search to relevant code (same language), it would be awesome. Maybe as a workspace option, to activate only on projects that need it.

Text search is far from perfect, but it's far, far better than having nothing or, worse, missing references, for example when you need to modify all of them. I think a lot of developers rely on Go to Definition and Find All References to quickly navigate through code. A lot of developers use Sublime and love it even though it has only a very basic, text-search-based Go to Definition and Find All References, because it works 100% of the time. Again, an optional text search as a complement would put VS Code on par with Sublime for these features (and ahead of it if you manage to limit the results to relevant code), and would meet the needs of a lot of developers.

We need these features to be 100% reliable to use them professionally so, in my opinion, it's almost mandatory to include an optional text search in Go to Definition and Find All References. Thanks!
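
For illustration only, the text-search fallback requested above could look roughly like the sketch below. It is not how Sublime Text or VS Code implement anything; `findTextReferences`, `TextMatch` and the in-memory `files` map are hypothetical names chosen for this example. It reports every whole-word occurrence of a symbol and restricts the search to files with a given extension, which is the "same language" filter suggested in the request.

```typescript
// Hypothetical helper types and functions; none of this is a VS Code API.
interface TextMatch {
  file: string;
  line: number;   // 0-based line index
  column: number; // 0-based column index
}

// True for characters that can be part of an identifier (simplified, ASCII only).
function isWordChar(ch: string): boolean {
  return /[A-Za-z0-9_]/.test(ch);
}

// Find every whole-word occurrence of `symbol`.
// `files` maps a path to its contents; `extensions` limits the search to
// one language, e.g. [".php"].
function findTextReferences(
  symbol: string,
  files: Map<string, string>,
  extensions: string[],
): TextMatch[] {
  const matches: TextMatch[] = [];
  if (symbol.length === 0) {
    return matches;
  }
  files.forEach((contents, file) => {
    if (!extensions.some((ext) => file.endsWith(ext))) {
      return; // skip files written in other languages
    }
    contents.split(/\r?\n/).forEach((text, line) => {
      let column = text.indexOf(symbol);
      while (column !== -1) {
        // charAt returns "" when the index is out of range, which is not a word char.
        const before = text.charAt(column - 1);
        const after = text.charAt(column + symbol.length);
        if (!isWordChar(before) && !isWordChar(after)) {
          matches.push({ file, line, column });
        }
        column = text.indexOf(symbol, column + symbol.length);
      }
    });
  });
  return matches;
}
```

A search like this cannot tell a definition from a call or from a mention in a comment, which is exactly the trade-off discussed in the replies below: it returns candidates, not verified results.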

bmewburn commented 4 years ago

I guess the question here is should go to definition go to the definition or what might be the definition? Same for find all references. IMO if something is uncertain a result shouldn't be returned because a user could act on incorrect information.

jeremy447 commented 4 years ago

Well, the way it works now is uncertain too, so by that reasoning it shouldn't return results either. I think it should be an option with a short explanation. That way only those who want this behavior will activate it, and they will be informed of how it works.

jrieken commented 4 years ago

Partly agreeing with @bmewburn but I can also understand the desire and usefulness. This is similar to word based suggestions which some people hate and some like. I believe Sublime uses token information for some level of language smartness. Technically, all that's required is an extension that implements this but I am leaving this open for now.

jeremy447 commented 4 years ago

It would be awesome if the extension was made by Microsoft. It's a critical functionality.

Tetralux commented 4 years ago

As linked in #88771, I'd suggest that entity.name.* scopes are made the default way to find symbols - and then if an extension for a particular language wants to do something smarter, they could have a way to disable or override it. Only it seems quite silly to not use the information you already have, and also require use of the API in order to provide symbols, which is what I assume you have to do at the moment.

ghost commented 2 years ago

Problem - not all entity.name.* scopes are definitions. For example meta.function-call entity.name.function. We would need further specificity to get past this problem.
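
A rough sketch of the extra specificity this would need, under stated assumptions: `ScopedToken` is a hypothetical shape (a token's text plus its scope stack, outermost first), and the exclusion list contains only the `meta.function-call` case named above. A real grammar would need its own, longer list.

```typescript
// Hypothetical token shape: text plus its TextMate scope stack, outermost first.
interface ScopedToken {
  text: string;
  scopes: string[];
}

// Scopes that mark a use rather than a declaration. This list is an
// assumption for the example and would need tuning per grammar.
const NON_DEFINITION_SCOPES: string[] = ["meta.function-call"];

// True if any scope equals `prefix` or continues it with a dot,
// e.g. "entity.name" matches "entity.name.function.matlab".
function hasScopePrefix(scopes: string[], prefix: string): boolean {
  return scopes.some((s) => s === prefix || s.startsWith(prefix + "."));
}

// A token counts as a definition if it carries an entity.name.* scope
// and is not nested inside one of the excluded scopes.
function isDefinitionToken(token: ScopedToken): boolean {
  if (!hasScopePrefix(token.scopes, "entity.name")) {
    return false;
  }
  return !NON_DEFINITION_SCOPES.some((p) => hasScopePrefix(token.scopes, p));
}

function collectDefinitions(tokens: ScopedToken[]): string[] {
  return tokens.filter(isDefinitionToken).map((t) => t.text);
}
```

Matching on dotted prefixes keeps `meta.function.matlab` (a declaration context) distinct from `meta.function-call.matlab` (a use).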

There's also the issue of property syntax and how to match properties, what scope to place them in etc.

I am using vscode-textmate to make language providers for MATLAB but this kind of feature needs language configuration and fine-tuning.

In theory my MATLAB tokeniser engine should be able to spit out all this data based off simple configuration. I'll probably try to put it in vscode-extension-samples if staff are okay with that.

ghost commented 2 years ago

At Gimly/vscode-matlab#133 I have developed a full set of language providers based purely on Textmate. I plan to extract this, make it configurable and publish it as an NPM module.

ghost commented 2 years ago

Right, https://github.com/SNDST00M/vscode-textmate-languageservice/tree/v0.1.0 is now published to NPM.

@jrieken @dbaeumer @bpasero @eamodio @egamma I'd like to transfer this package to Microsoft please 🙏🏾

jrieken commented 2 years ago

fyi - we are actively exploring this in https://github.com/microsoft/vscode-anycode. It is on the marketplace and also comes as built-in extension on github.dev

The approach taken is slightly different - instead of textmate grammars we are using the tree-sitter library. That is much, much faster and produces good syntax trees but the downside is that not all languages are supported. Things might still change but can be tried already (latest insiders required)

ghost commented 2 years ago

Nice, that's an upgrade of sorts

ghost commented 2 years ago

Folding is one feature I'd really like as well as highlighting of entities and/or parameters

phil294 commented 1 year ago

I would see a "token-based language service", such as suggested in this issue, as a natural enhancement to the existing word-based suggestions of VSCode. The equivalents are quite apparent, going through the most important LSP features:

autocomplete -> word-based suggestions (already exists, but only for current file)
go to definition/implementation, find references -> find in files
document highlight -> text-based highlight (already exists)
hover -> none
folding range -> indentation based folding or #StartRegion/#EndRegion (already exists)
document symbols -> there's nothing like it, this can indeed only nicely be achieved with something like tree-sitter
diagnostics -> if any, it's spell checker extensions
code action -> none
formatting -> none, apart from the "Convert indentation to Tabs"/"Spaces" commands
rename -> select all occurrences of find match

Given how fragmented and partially-already-solved these are, I don't think this should be handled by an extension. VSCode is already partially there, it's mostly the bold ones that are still missing for complete IntelliSense around non-supported languages, or, in other words, to get to the same level of capabilities that e.g. Sublime or WebStorm offer.

If you were to really realize these features in an LSP server, you'd probably still need to index all available files and imitate an alternative "Find in Files" functionality (haven't researched this), replicate an "Exclude files" setting and so on. The interface would also be somewhat questionable - like bmewburn said, these suggestions would show up as if they were accurate, when the results should instead be marked as inaccurate, something that cannot be realized by an extension if it is also supposed to work with any text and language.

vscode-anycode is a clever approach, but does it really solve this very issue, as it "only" supports 8 languages and is not usable as a generic language support extension, e.g. for more exotic languages? And it also does not seem to be targeted at running alongside other extensions for the same language, it's more of an all-or-nothing approach. Maybe I'm missing something.

Sorry, these were just some random thoughts of mine. I mainly think that GoTo responses like Cannot find definition for XY are plainly unsatisfying, bad DX and a missed opportunity. Optimally, they should almost never occur, regardless of the language.

zm-cttae-archive commented 1 year ago

RE speed: I rewrote the vscode-textmate-languageservice library. It now works in the browser and is nearing saturation for perf optimisations.

Because of the expense of tokenizing a full document, I built a cache of promises and used a hash-based validator to fetch from it. I also made it a base class for all my services.
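
A minimal sketch of what a promise cache with a hash-based validator might look like. This is not the library's actual code; `TokenCache`, `hashText` and the `tokenize` callback are names made up for this example, and the hash is a simple non-cryptographic one, so a collision could in principle return stale tokens.

```typescript
// Small non-cryptographic string hash (djb2 variant). Any stable hash would do;
// this choice is an assumption for the example.
function hashText(text: string): number {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return hash;
}

interface CacheEntry<T> {
  hash: number;       // hash of the text the result was computed from
  result: Promise<T>; // cached promise, shared by all callers
}

// Hypothetical cache: one entry per document URI, revalidated by content hash.
class TokenCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly tokenize: (text: string) => Promise<T>;

  constructor(tokenize: (text: string) => Promise<T>) {
    this.tokenize = tokenize;
  }

  // Return the cached promise if the document text is unchanged,
  // otherwise tokenize again and replace the entry.
  get(uri: string, text: string): Promise<T> {
    const hash = hashText(text);
    const cached = this.entries.get(uri);
    if (cached !== undefined && cached.hash === hash) {
      return cached.result;
    }
    const result = this.tokenize(text);
    this.entries.set(uri, { hash, result });
    return result;
  }
}
```

Caching the promise rather than the resolved value means that several providers asking for the same document at the same time share one tokenization pass instead of each starting their own.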

microsoft / vscode

Implement token-based language service #82024