IntelliSense: Order suggestions using good, customizable and well-defined criteria

geekley commented 5 years ago

My intention in opening this issue is to have a place for discussing how to improve the order of IntelliSense suggestions. TL;DR: see the settings and the conclusion at the end for the feature requests.

Introduction

I believe an important thing in editor suggestions is to optimize their order, using principles like locality and usefulness of the suggestion to define its relevance. There are many criteria that could be used for that. I think it's beneficial to have those criteria described in a way that's transparent and well-defined (to allow discussing the best way to optimize suggestion orders) and also customizable (so that users can optimize it to their way of thinking).

So, with this in mind, I'd like to suggest a good system for ordering editor suggestions. My intention is just to provide the idea of having a logic for ordering that is customizable in ORDER BY semantics (like in SQL). And to have the criteria described here (the ones not already implemented) being considered, so that the relevance of suggestions can be further improved.

By "order by" semantics, I mean the system works with multiple levels of sorting, where each criterion is used as a tiebreaker for the previous one.
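
To make the tiebreaker idea concrete, here is a minimal TypeScript sketch. The `Suggestion` shape, its rank fields, and the `orderBy` helper are all hypothetical names made up for illustration; nothing here is part of VS Code.

```typescript
// Hypothetical suggestion shape; the field names are illustrative only.
interface Suggestion {
  name: string;
  typeRank: number;      // lower = better type fit
  proximityRank: number; // lower = declared closer to the cursor
}

type Comparator<T> = (a: T, b: T) => number;

// Combine comparators so that each one only breaks ties left by the previous ones,
// like the column list of an SQL ORDER BY clause.
function orderBy<T>(...criteria: Comparator<T>[]): Comparator<T> {
  return (a, b) => {
    for (const compare of criteria) {
      const result = compare(a, b);
      if (result !== 0) return result;
    }
    return 0;
  };
}

const byTypeRank: Comparator<Suggestion> = (a, b) => a.typeRank - b.typeRank;
const byProximity: Comparator<Suggestion> = (a, b) => a.proximityRank - b.proximityRank;
const byName: Comparator<Suggestion> = (a, b) =>
  a.name < b.name ? -1 : a.name > b.name ? 1 : 0;

const suggestions: Suggestion[] = [
  { name: "total", typeRank: 1, proximityRank: 0 },
  { name: "count", typeRank: 0, proximityRank: 2 },
  { name: "amount", typeRank: 0, proximityRank: 2 },
  { name: "index", typeRank: 0, proximityRank: 1 },
];

// Type fit first, then proximity, then name.
const sortedNames = [...suggestions]
  .sort(orderBy(byTypeRank, byProximity, byName))
  .map((s) => s.name);
// sortedNames: index, amount, count, total
```

Here "total" is the closest declaration but goes last because type fit has higher precedence, and "amount" only beats "count" on the final name tiebreaker.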

I know that some of these criteria are already present on VSCode, but there are some things which haven't been considered yet, so I'd like to describe how I think the thing as a whole could work first, and then describe the actual feature requests at the end.

Considerations

Firstly, I'm not considering filter relevance, which is the ordering criterion that sorts matches when the name is partially typed (so that exact prefix matches come before other substring matches, for example). That logic would likely be applied separately from this system (and VSCode already does a good job in this regard).

Also, there are two ways to trigger suggestions. The first is member suggestions (triggered with . after an expression, for a member token) which is NOT what I intend to describe here, although some things here might be useful for that too. Related: #50755

What I'm describing refers to general suggestions (triggered by pressing Ctrl+Space for a "top-level" token). In this case the possibilities are too broad. Because of this, the suggestion mixes everything: members of the implicit this, local variables/parameters, classes, etc. It's difficult to know the user's intention, but there are some things we can consider.

For example, in a method body, it might be a good idea to show local variables/parameters before members, since you can already have members shown first when you type "this." explicitly. You may also want to sort elements by proximity in code (distance from declaration to usage; near first), if they are otherwise equally likely to be chosen. And, most importantly, if the context receiving the value implies a type (like in an assignment or a parameter in a function call) then it would be great to have values with a compatible type being presented before the others.

Criteria for Ordering Suggestions

So, for general suggestions, I would do an "order by" in this order by default (of course, considering only whatever is applicable depending on context). The default orders here are just an example, they are debatable, of course.

Firstly, the ordering by primary category

Existing identifiers: the most important category, so it goes first
Smart name suggestions for new identifiers: for example, after var or class, if there is context allowing smart suggestions
Word-based suggestions, when enabled
Native elements from the language (like keywords), if enabled: least important, since the user likely knows these already (so they go in the middle-to-end part)
Elements customizable by the local user (like templates/constructs): shown last so you can easily list them by pressing Up

Then, for identifiers, tiebreakers on this order:

By type compatibility with what is expected where the cursor is. That is, how well the type or return type fits the receiving type. Only applicable to identifiers that have or are types, and only in places like an assignment, an argument, after return or new, etc.
1. Exact type and subtype matches
2. Other implicitly castable matches
3. Explicitly castable matches (like supertypes, sorted by "distance" to expected type)
4. Elements that don't match, but still have or are types
5. Typeless elements, where this isn't applicable (they have nothing to do with types)
Then, by syntax explicitness
1. Elements that have to be referenced necessarily in the current syntax (like variables/parameters)
2. Elements implicitly included in the scope, which could have been accessed with a more explicit syntax instead (like members of the current class)
3. Elements implicitly included by extension syntax
4. Other "syntactic sugar"
Then, by declaration proximity (defined below)
Then, by semantic purpose. You might want to have these grouped or further separated for each kind of language element.
1. Identifiers that have type and value (like local variables/parameters, function calls, etc)
2. Identifiers for types (like classes/interfaces/enums, type parameters)
3. Identifiers for macros and the like
4. Identifiers for scopes (like namespaces/imports and elements returned from require() in JS)
5. Other identifiers (like labels, etc)
Lastly, by name, alphabetically

Declaration proximity is defined in this order:

By proximity of source
1. Elements from the same file / compilation unit
2. Elements from the same project
3. Elements from dependency libraries
4. Elements from standard/native libraries (user might know these already)
Then, if on same file/unit, by proximity of scope where element is declared
1. Same scope first, then outer scopes, in order
2. Inner scopes go last, if applicable (for languages like JS)
Then, if on same scope and file, by proximity of the line in code where the element is declared. Only applicable where semantics imply that order makes a difference in the usefulness of the suggestion (e.g. vars).
1. If before the cursor, sort by line in decreasing order, so that the line right above is shown first
2. If after the cursor, it is shown last, if applicable (for languages like JS)
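
The three proximity levels above could be turned into a sortable key roughly like this. It is only a sketch: the `Declaration` shape, `scopeDistance`, and `proximityKey` are hypothetical names, and a real language extension would have to derive this information from its own symbol tables.

```typescript
// Hypothetical description of where a candidate is declared relative to the cursor.
type Source = "sameFile" | "sameProject" | "dependency" | "standardLibrary";

interface Declaration {
  name: string;
  source: Source;
  // Scopes to walk outward from the cursor's scope; 0 = same scope.
  // Left undefined for inner scopes, which are ranked last.
  scopeDistance?: number;
  line?: number; // declaration line, only meaningful in the cursor's file
}

const SOURCE_RANK: Record<Source, number> = {
  sameFile: 0,
  sameProject: 1,
  dependency: 2,
  standardLibrary: 3,
};

const LAST = Number.MAX_SAFE_INTEGER;

// Build a [source, scope, line] key; smaller values mean "closer".
function proximityKey(decl: Declaration, cursorLine: number): number[] {
  const sourceRank = SOURCE_RANK[decl.source];
  // Scope proximity only applies within the same file.
  const scopeRank = decl.source === "sameFile" ? decl.scopeDistance ?? LAST : 0;
  // Line proximity only applies within the same scope and file.
  let lineRank = 0;
  if (decl.source === "sameFile" && decl.scopeDistance === 0 && decl.line !== undefined) {
    // Declared before the cursor: nearest line first. Declared after: last.
    lineRank = decl.line <= cursorLine ? cursorLine - decl.line : LAST;
  }
  return [sourceRank, scopeRank, lineRank];
}

// Compare two keys level by level.
function compareKeys(a: number[], b: number[]): number {
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const diff = (a[i] ?? 0) - (b[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

const cursorLine = 10;
const declarations: Declaration[] = [
  { name: "parseInt", source: "standardLibrary" },
  { name: "helper", source: "sameProject" },
  { name: "later", source: "sameFile", scopeDistance: 0, line: 20 },
  { name: "config", source: "sameFile", scopeDistance: 1, line: 1 },
  { name: "items", source: "sameFile", scopeDistance: 0, line: 3 },
  { name: "lodashMap", source: "dependency" },
  { name: "i", source: "sameFile", scopeDistance: 0, line: 9 },
];

const byProximityNames = [...declarations]
  .sort((a, b) => compareKeys(proximityKey(a, cursorLine), proximityKey(b, cursorLine)))
  .map((d) => d.name);
// byProximityNames: i, items, later, config, helper, lodashMap, parseInt
```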

So, with these criteria, you could have settings to control sorting with "order by" syntax:

// Primary order for IntelliSense suggestions
"editor.suggest.primaryOrder": [
  "identifiers", "smartNames", "words", "language", "user"
],
// Precedence of criteria used for ordering identifier suggestions (higher precedence first)
// It's possible to disable a criterion by removing it from the list
"editor.suggest.identifiersOrderBy": [
  "typeCompatibility", // types that directly fit first, non-matches and typeless last
  "syntaxExplicitness", // explicit syntax first, syntactic sugar last
  "declarationProximity", // same scope and near cursor first
  "semanticPurpose", // separates things like e.g. variables / classes / namespaces
  "name" // a-z A-Z _ (or whatever ordering is set by the language)
]
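
As a rough illustration of how a list like the proposed identifiersOrderBy could drive sorting, here is a TypeScript sketch. The setting is only a proposal in this issue, and `RankedIdentifier`, `compareBy`, and `sortIdentifiers` are hypothetical names; the sketch assumes each item already carries a numeric rank per criterion (lower is better).

```typescript
type Criterion =
  | "typeCompatibility"
  | "syntaxExplicitness"
  | "declarationProximity"
  | "semanticPurpose"
  | "name";

// Hypothetical item with one precomputed rank per criterion.
interface RankedIdentifier {
  name: string;
  typeCompatibility: number;
  syntaxExplicitness: number;
  declarationProximity: number;
  semanticPurpose: number;
}

type Compare = (a: RankedIdentifier, b: RankedIdentifier) => number;

const compareBy: Record<Criterion, Compare> = {
  typeCompatibility: (a, b) => a.typeCompatibility - b.typeCompatibility,
  syntaxExplicitness: (a, b) => a.syntaxExplicitness - b.syntaxExplicitness,
  declarationProximity: (a, b) => a.declarationProximity - b.declarationProximity,
  semanticPurpose: (a, b) => a.semanticPurpose - b.semanticPurpose,
  name: (a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0),
};

// Apply the criteria in the configured order; criteria left out of the list are ignored.
function sortIdentifiers(items: RankedIdentifier[], identifiersOrderBy: Criterion[]): string[] {
  const sorted = [...items].sort((a, b) => {
    for (const criterion of identifiersOrderBy) {
      const result = compareBy[criterion](a, b);
      if (result !== 0) return result;
    }
    return 0;
  });
  return sorted.map((item) => item.name);
}

const identifiers: RankedIdentifier[] = [
  { name: "localWidget", typeCompatibility: 0, syntaxExplicitness: 0, declarationProximity: 0, semanticPurpose: 0 },
  { name: "fieldCount", typeCompatibility: 3, syntaxExplicitness: 1, declarationProximity: 1, semanticPurpose: 0 },
  { name: "argName", typeCompatibility: 3, syntaxExplicitness: 0, declarationProximity: 0, semanticPurpose: 0 },
];

const withTypes = sortIdentifiers(identifiers, [
  "typeCompatibility", "syntaxExplicitness", "declarationProximity", "semanticPurpose", "name",
]);
// withTypes: localWidget, argName, fieldCount

const withoutTypes = sortIdentifiers(identifiers, [
  "syntaxExplicitness", "declarationProximity", "semanticPurpose", "name",
]);
// withoutTypes: argName, localWidget, fieldCount
```

Removing "typeCompatibility" from the list is enough to change the result, which is the kind of per-user customization the setting is meant to allow.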

Conclusion

So, if this system were to be fully implemented, it would need these feature requests (based on what I think wasn't implemented yet)

[ ] Having suggestions use ORDER BY semantics, allowing better transparency of ordering criteria
- [ ] Settings to customize the order criteria for identifier suggestions
[ ] Having type compatibility criterion for suggestions (aka Smart Completion)
[ ] (if that makes sense): Having syntax explicitness criterion for suggestions
[ ] (maybe, if it's worth implementing eventually): Having smart name suggestions?

jrieken commented 5 years ago

Thanks for getting on this and kudos for setting the correct context, e.g. focusing only on what you call "general suggestions".

I don't wanna engage (yet) in discussing the proposed solutions/ideas but just provide more context on what VS Code can correctly do and what it cannot correctly do (only guess).

When it comes to completions (like most other language APIs), VS Code doesn't define or use programming language semantics. Instead, it defines a completion domain model that describes the look and feel of the completions UI. For example, CompletionItem#label, #detail, and #kind are all used for rendering, and there is no contract that label must appear in source; an extension is free to pick java.lang.String or String. That makes it hard to define something like a (semantic) proximity sort on top of these UX recipes (by the way, we did make an attempt at proximity sort with editor.suggest.localityBonus).

What I am trying to say is this: Your suggestions make sense, a lot of sense, but given the current API some features must be implemented by extensions (using CompletionItem#sortText). That raises the question: how do we get the message to all extension authors so that more languages feel equally smart?
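
For extension authors, one way to express a multi-level order through a single sortText string is to zero-pad each rank and concatenate them. The sketch below is an assumption-laden illustration, not an official recipe: `CompletionItemLike` is a local stand-in for the label and sortText fields of a completion item (so the code runs without the vscode module), and `toSortText` is a hypothetical helper. It assumes ranks are non-negative integers that fit in the chosen width, and it imitates the editor with a plain string comparison; how the editor combines sortText with its own filter ranking is outside this sketch.

```typescript
// Local stand-in for the two completion item fields used here.
interface CompletionItemLike {
  label: string;
  sortText?: string;
}

// Zero-pad each rank so that string comparison agrees with numeric comparison,
// then append the label as the final tiebreaker.
function toSortText(ranks: number[], label: string, width: number = 4): string {
  const prefix = ranks.map((rank) => String(rank).padStart(width, "0")).join(".");
  return `${prefix}.${label}`;
}

const ranked: Array<{ label: string; ranks: number[] }> = [
  { label: "String", ranks: [3, 12] },
  { label: "field", ranks: [0, 1] },
  { label: "apple", ranks: [3, 2] },
  { label: "param", ranks: [0, 0] },
];

const completionItems: CompletionItemLike[] = ranked.map(({ label, ranks }) => ({
  label,
  sortText: toSortText(ranks, label),
}));

// Imitate a consumer that orders items by sortText, falling back to the label.
const sortKey = (item: CompletionItemLike): string => item.sortText || item.label;
const orderedLabels = [...completionItems]
  .sort((a, b) => (sortKey(a) < sortKey(b) ? -1 : sortKey(a) > sortKey(b) ? 1 : 0))
  .map((item) => item.label);
// orderedLabels: param, field, apple, String
```

Without the padding, the key for rank 12 would sort before the key for rank 2 as a string, so "String" would wrongly come before "apple".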

geekley commented 5 years ago

Hmm I understand, so most things would have to be done by each extension. Makes sense.

In any case, I'm glad to have a starting point for discussion on this topic. If/when you're thinking about eventually making improvements in this area, then it may be useful to have at least the general idea of a model like this in mind (preferably with some more research first, since there are probably lots of factors that I haven't considered).

That raises the question: how do we get the message to all extension authors so that more languages feel equally smart?

TL;DR: The editor simply doesn't care about most of the actual semantics, extensions still define the whole process - see the end.

Well, I don't know any details on how this works in VSCode, but the two things I can think of that could make it feasible would be:

to have interfaces that the extensions could optionally implement if they want to fit in this model (then the setting would only apply to those extensions that decide to support this "smart sorting"); and
having a model like this described as documentation for extension devs.

Actually, thinking a little more about it, the only thing that the editor would need for sorting would be for the extension to provide sortable values for each criterion used in completion items (and if any criterion is not applicable, it can be simply ignored in sorting). The values don't even necessarily have to fit in a specific API type, they just have to be able to be sorted, and more or less follow the semantics from the documentation, or whatever is more suitable if there is any special case.

For example, (completionItem).sortingValues.semanticPurpose doesn't have to be an enum value defined in the API. One extension could implement it as simply numbers, while another could do it with an internally defined enum or even strings or some other custom type. Or, they might just not provide a value if it's not applicable. The only requirement would be to provide a value whose sorting makes sense according to docs (in this case, smaller values for types of identifier that are more useful in completion). So, the documentation would provide some guidelines, but it's still up to the extension to implement it with a proper semantic meaning.

Also, each criterion works independently, so if you add other criteria as possibilities, they would simply be ignored in sorting until the extension provides an implementation for them. And, of course, extensions can still implement their own sorting in the current method if they don't want to use this system.

So in summary, this "smart ordering" model defines only:

What are the sorting criteria (in the API)
How these criteria are generally expected to be used for sorting (in documentation)

And extensions/languages define in implementation:

Whether they are using "smart ordering" or their own custom ordering in completion
If they are using it: when/how each criterion applies; and the actual types of values used to achieve expected sorting for each criterion

Then all that the editor does is ordering values - it doesn't care what they mean or how they are obtained.

n8crwlr commented 5 years ago

Unfortunately, I keep getting hung up on this.

It's very confusing to get everything only in alphabetical order. Since the suggestions can be made different for everyone, I would just like to have an option for type sorting (which is then sorted alphabetically), and for this type sorting we should be able to select / deselect all types from the list.

https://code.visualstudio.com/docs/editor/intellisense#_types-of-completions

Here a simple array would do the job: types: [method, function, property,...] together with a simple choice field of alphabetical or type sort.

geekley commented 5 years ago

I would just like to have an option for type sorting

That'd be nice, but there are some things to consider too, so I'll explain my thoughts a little more here...

Considerations: Customizing order in a criterion

Well, to fit this in the model that I'm suggesting, that would be possible with a setting like this:

"editor.suggest.semanticPurposeOrder": [
  "expressions", "types", "preprocessorExpressions", "scopes", // etc (others come last)
]
// or maybe with more specificity, to group those into categories (or not) as you prefer:
"editor.suggest.semanticPurposeOrder": [
  "variable|field|value|literal", "method|function", "class|interface|enum", "module", // etc
  // "A|B", "C" means A and B are in same order (they may be mixed together) and before C
  // others not mentioned here come last, as defined by the language
]
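
The "A|B", "C" grouping syntax could be parsed into ranks along these lines. This is a sketch of the proposed (not existing) setting; `makeKindRanker` is a hypothetical helper, and the kind names are just the ones from the example above.

```typescript
// Turn a setting like ["variable|field", "method|function"] into a ranking function.
// Kinds separated by "|" share a rank; kinds not mentioned come last.
function makeKindRanker(setting: string[]): (kind: string) => number {
  const ranks = new Map<string, number>();
  setting.forEach((group, index) => {
    for (const kind of group.split("|")) {
      const trimmed = kind.trim();
      // The first mention of a kind wins if it is listed more than once.
      if (trimmed !== "" && !ranks.has(trimmed)) {
        ranks.set(trimmed, index);
      }
    }
  });
  return (kind) => ranks.get(kind) ?? setting.length;
}

const rankOf = makeKindRanker([
  "variable|field|value|literal",
  "method|function",
  "class|interface|enum",
  "module",
]);

const exampleRanks = {
  field: rankOf("field"),       // 0, same group as "variable"
  variable: rankOf("variable"), // 0
  function: rankOf("function"), // 1
  enum: rankOf("enum"),         // 2
  module: rankOf("module"),     // 3
  keyword: rankOf("keyword"),   // 4, not mentioned, so it comes last
};
```

Because "variable" and "field" get the same rank, they stay mixed together until a later criterion (for example, proximity or name) separates them.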

However, specifying this gets us closer to having to make assumptions about the specific language elements available, so this setting might be something that you'd have to set up per language... or you'd have to define these categories in some standard way, trying to cover most of the broad types of language elements. Like Johannes said, it's best to avoid assuming language semantics (since languages can be really different, for example DSLs, which might not have any of those concepts; even JSON doesn't have concepts like "variable" and "class", but it does have a concept of types and values in a way).

I described the criterion as something generic like "semantic purpose" to try to minimize that specific-language-assumption and variance about what ordering and grouping are considered more useful, so I wanted to make it as broad as possible in that definition.

So, for me, vars|params|fields|expressions are all the same (things with a type and value) with regards to purpose, and classes|structs|enums are also the same (types). I expect that the things with a value would come first because most of the time that you want suggestions (when you're in a method body) they are the best suggestions when the editor has little context, and also because typeCompatibility ordering (smart completion) would only make sense for those two categories (which have/are types). Other categories would come last because they are less predictable to be useful as suggestion (you shouldn't "get suggested" a namespace, except when you've already started typing, and it's clear that you want it).

I avoided specifying a setting for that because the languages' semantics are unknown and, again, languages might not even have those as language elements (although most do). So, without this setting, the language (extension) would be responsible for providing that order and grouping as best fits that language.

But indeed, you might want to break it up in categories differently, and use different orders. And since most programming languages are alike, it's not impossible to have a generic order setting and still allow language overrides. So it makes sense to have a setting allowing custom order in this criterion, even with more specific (non-)grouping, but we'd have to take that into account.

In any case, at least the other criteria (compatibility, explicitness and proximity) are more straightforward (and more language-agnostic), so there would be no need to customize their order, which is already set by definition. It's enough to be able to just disable their ordering or change their precedence in the "order by" logic.

Anyways, these specifics are why I think this whole thing would need some more research.

geekley commented 5 years ago

Also, I forgot to mention: In the semantic purpose criterion, I didn't differentiate, for example, parameters, variables, fields/methods, etc when grouping them into the same "expressions" category, because "declaration proximity of scope" would most likely already take care of grouping those, in a language-agnostic way (because of where they are declared).

So, for example, in a method body, it would likely order them like this: local vars > params > members (fields/methods/inner classes)

However, n8crwlr made me think of something... what he said makes even more sense because "proximity of scope" won't separate types of members, and "proximity of line of code" ordering might make sense for vars (which have a sequence in them), but maybe it's not desired in members, because in their case "declared closer" might not necessarily mean "more likely useful". So there's that. But "proximity" is the part that VSCode already has implemented anyways.

So, I guess, in default "order by" order, "semantic purpose" would go after "proximity", and "type compatibility" would already in itself assume that identifiers that have/are types go before others.

Anyways, I'll update the model.

n8crwlr commented 5 years ago

// "A|B", "C"

Good

But, to get this in the near future, I would like to start with a simple experimental ordering. It can be optimized over time.

johndpope commented 4 years ago

Don't mean to hijack this thread but I have a suggestion to revisualise intellisense to use a force directed layout with nodes / animations / tweening /parallel nodes for completion. Please give this thought some space. https://github.com/MicrosoftDocs/intellicode/issues/123

I don't know where to start to prototype this - I cloned the repo - but it's not so simple to get to where this control is rendered / displayed or how I can swizzle in this d3 directed layout to replace stock standard control. Any pointers appreciated.

osddeitf commented 4 years ago

Your ideas make sense, I have another suggestion. TL;DR.

Personally, for a long time after I started writing code, I didn't care much about this, but as languages and libraries keep growing, it has become more complicated and frustrating for me.

As of today, I have spent a few short days learning Rust, reading the book over and over. I have grasped the ideas of the language well enough to feel confident and start writing an experiment, specifically a web server. Because Rust traits (a.k.a. interfaces in other languages) are so flexible, third-party packages take full advantage of them. After more than a day of struggling to write just a simple server on my own, I noticed that the implementations are scattered all over the place, e.g.

close
fmt
inner
poll_close
poll_close
poll_flush
poll_flush
poll_next
poll_next
size_hint
size_hint
start_send
start_send
try_poll_next

Do you see something weird? I can assure you there is NO typo. NO. The methods are duplicated. This results from traits being implemented for types that implement other traits, which I think is unique to Rust. And this is just a simplified version of what I faced in reality, as different libraries take different approaches and efforts to make their interfaces as flexible as possible. My suggestion is to show which trait each method belongs to:

self.close
Display.fmt
self.inner
Stream.poll_close
Sink.poll_close
Stream.poll_flush
Sink.poll_flush
Stream.poll_next
Sink.poll_next
...

Even better, sort them alphabetically as well:

self.close
self.inner
Display.fmt
Sink.poll_close
Sink.poll_flush
Sink.poll_next
Stream.poll_close
Stream.poll_flush
Stream.poll_next
...
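
The grouped order above can be reproduced with a small comparator. This TypeScript sketch uses a hypothetical `MemberEntry` shape where `owner` is either "self" or the trait name; a real Rust extension would get that information from its own analysis.

```typescript
// Hypothetical entry: a member name plus the trait (or "self") it comes from.
interface MemberEntry {
  owner: string;
  name: string;
}

const compareStrings = (a: string, b: string): number => (a < b ? -1 : a > b ? 1 : 0);

// Own members first, then by trait name, then by member name.
function compareMembers(a: MemberEntry, b: MemberEntry): number {
  const aSelf = a.owner === "self" ? 0 : 1;
  const bSelf = b.owner === "self" ? 0 : 1;
  if (aSelf !== bSelf) return aSelf - bSelf;
  return compareStrings(a.owner, b.owner) || compareStrings(a.name, b.name);
}

const members: MemberEntry[] = [
  { owner: "self", name: "close" },
  { owner: "Display", name: "fmt" },
  { owner: "self", name: "inner" },
  { owner: "Stream", name: "poll_close" },
  { owner: "Sink", name: "poll_close" },
  { owner: "Stream", name: "poll_flush" },
  { owner: "Sink", name: "poll_flush" },
  { owner: "Stream", name: "poll_next" },
  { owner: "Sink", name: "poll_next" },
];

const groupedLabels = [...members]
  .sort(compareMembers)
  .map((m) => `${m.owner}.${m.name}`);
// groupedLabels: self.close, self.inner, Display.fmt, Sink.poll_close, Sink.poll_flush,
// Sink.poll_next, Stream.poll_close, Stream.poll_flush, Stream.poll_next
```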

I think this improvement makes sense not only for Rust, but for other languages and libraries as well. For example, Android has a whole bunch of methods on each class, so sorting by method name alphabetically is inadequate (of course, you might not use Visual Studio Code for Android).

I have researched a wide range of IDEs, but there doesn't seem to be any with a solution good enough to satisfy me. It bugged me for a while before I decided to make a move and try to write it myself.

I don't have much free time to do this alone, and this topic is unlikely to get much attention as a feature request. So I'd like to collaborate with those of you who are eager to have this feature, to implement it ourselves.

Let's make a future with a less frustrating coding experience.

tjx666 commented 2 years ago

Is there any way to sort the suggest items by type? It would be better with a shortcut. I sometimes only want to see all the property or method items, but right now they are ordered alphabetically. @jrieken @geekley

danielleiszen commented 2 years ago

I am happy to see this discussion. I also struggle with finding the proper field/property/method. For example, when I need to navigate generated code (mainly gRPC services and messages) that has a lot of boilerplate (classes, methods, etc.), it's hard to select the ones I am interested in.

So for the above reason I would vote for a local toolbar at the top of the IntelliSense window where I could switch the ordering or include and exclude certain item types. I think the actual supported cases are less important than having a flexible mechanism that is easy to customize each time the hints are displayed. So, a settings only solution would not be efficient in this case.

geekley commented 2 years ago

Is there any way to sort the suggest items by type?

@tjx666 Not that I know of, or at least not as a generic setting or feature that works for all language extensions.

I think the only way to improve suggestions order currently is with something like Intellicode (which is AI-based), but that's out of scope here (it just improves it "smartly" with no customization or simple/clear sorting rules). I haven't used it for quite a while, as I'm not sure whether / how much data is sent to MS servers in order to provide it (the info I found seems to apply only to VS, not VSCode).

Honestly, if member completions would just give less priority to inherited entries and much less to deprecated entries (and show them in ~~strikethrough~~), that would already help a lot.

VSCodeTriageBot commented 1 year ago

We closed this issue because we don't plan to address it in the foreseeable future. If you disagree and feel that this issue is crucial: we are happy to listen and to reconsider.

If you wonder what we are up to, please see our roadmap and issue reporting guidelines.

Thanks for your understanding, and happy coding!

gaoqiangks commented 1 year ago

It would be nice to see this feature.

geekley commented 1 year ago

While this request is not going to be implemented, I hope my analysis serves as a sort of guideline for any programming language extension developers who are looking to have good ordering of completions. Even if sorting is not made customizable, just using the logic and default order proposed here should be enough to provide a better experience for your extension users.

So the info at the top is relevant for anyone wanting to implement code completion in extensions, i.e.: vscode.CompletionItemProvider.provideCompletionItems or LSP's "capabilities"."completionProvider". I hope search engines help them arrive here. Or, even better, I wish the docs there would point developers to some of these guidelines, so they can make better completions. Specifically, to answer:

How are completion items categorized, and where should these groups be placed?
What criteria should be used for sorting code completions of identifiers?
Which criteria are more important, and thus should have higher priority?
What should cause higher or lower ranks in a criterion? How to determine this?

For anyone graduating in Computer Science, and interested in this topic: this could honestly become a research paper.

ThaJay commented 1 year ago

I just want imports local to the project to be suggested before (above) imports from library code. What can I do to achieve this? It sounds simple enough.

rossirpaulo commented 8 months ago

I just want imports local to the project to be suggested before (above) imports from library code. What can I do to achieve this? It sounds simple enough.

Couldn't figure it out either. I noticed that your project structure plays a role, though. So, if you are in Next.js, working with a /src/components folder, a node_modules folder at the root will rank higher up. I did minimal testing and then couldn't find a solution.

Number-3434 commented 8 months ago

I would find this very useful, as currently it is very hard to find symbols declared by the actual type, and not inherited / extensions.

BloodyRain2k commented 5 months ago

I would find this very useful to put suggestions for constants at the bottom, as I very rarely need those. But when I do, it's nice to be able to just scroll through them because most of the time, seeing them makes me remember which one I'm actually looking for. But this is in both cases annoying when they're completely mixed with everything else...

alecthomas commented 4 months ago

I recently filed https://github.com/golang/go/issues/66523 against the Go LSP project, and the team there has responded showing that the LSP returns results in a sane order (ie. local symbols first, then external library code), but VSCode subsequently overrides that ordering completely. If, as @geekley suggests, there are ways for an extension to control ordering in the completion dialogue, what are those mechanisms? How can the Go extension improve ordering?
