tompng commented 1 year ago

Background

Current RegexpCompletor has many limitations. It can't complete chained methods. Many wrong completions. For better completion, we need to parse the incomplete input and perform type analysis. We need to use RBS and an error tolerant parser like Prism.

PullRequest

WIP pull request is opend in #708. Fallbacks to RegexpCompletor if Prism is not available.

Performance

RBS loading performance depends on rbs_collection.yml and project's rbs size under sig/ dir. It takes 0.168721s when there is no yml and sig. Completion performane depends on code size editing in irb. For example below, it only analyze "n.times do\n _1." with binding that knows the actual value of n.

irb(main):123> n.times do
irb(main):124>   _1.█

Analyzing 1161 lines takes 0.072726s. For comparision, coloring code takes 0.057202s

code = File.read 'lib/irb/type_completion/type_analyzer.rb'
# Force to analyze every single nodes
class IRB::TypeCompletion::TypeAnalyzer::DigTarget; def dig?(node)=true; end
measure
IRB::TypeCompletion::Completor.new.completion_candidates code, '.', '', bind: binding;
# => processing time: 0.072726s
IRB::Color.colorize_code(code);
# => processing time: 0.057202s

Accuracy

Comparision with current RegexpCompletor

As far as I know, the new completor can complete every case that RegexpCompletor can complete, even if RBS is not available.

# Completion without RBS that RegexpCompletor can't complete

def foo(n = 0)
  a = n; a = 1.0 if b; a.█ # Integer and Float methods
end

class User < ApplicationRecord
  has_█ # has_many, has_one
  def bar
    sa█ # save, save!
  end
end

module IRB
  A█ # IRB.constants and Object.constants
  @@a█ # IRB's class variable names
end

With RBS

When RBS is available, it can complete more.

# Completion with RBS

'a'.bytes.shuffle.join.encoding.█ # Encoding methods

10.times.map(&:to_s).map { _1.to_sym }.map(&:█) # Symbol methods

object.instance_eval do
  @a█ # object's instance method names
  self.█ # object's methods
end

%w[a b c].each_with_index do
  s = [:s] * _1
  a = [:a] * _2
  s[0].█ # String methods
  a[0].█ # Symbol methods
end

In Rails apps

Adding rbs_collection.yaml and sig/*.rbs to the Rails project will significantly improve accuracy. Article.first.author.email.und can complete String#underscore if types are defined in project's sig directory. Providing good completion experience in rails console will be a motivation for users to write .rbs in their project.

Other type analysis libraries

This completor is a REPL-first type analyzer. It uses few information about the project codebase and many dynamic information from binding. Other type analyzer uses lots of project's code and no binding information. I think it's hard to use another type analyzer and integrate with binding information. Maybe this could change in the future.

Where to have this code, IRB or bundled/default gem?

I want to have this code inside IRB's repository.

I want to enable it by default, so this code should be inside IRB or default/bundled gem, not normal gem that needs installation. Want to completely replace RegexpCompletor in the future, I think default completor can be controlled inside IRB. I think the possible user of REPL-first completor is only IRB. It's easy to decide and have it inside IRB. As a bundled or default gem, it should be discussed in bugs.ruby-lang.org. My opinion is, let's include it, get feedbacks, consider splitting it or not later.

st0012 commented 1 year ago

To help us better test this feature, what do you think about adding rbs files to IRB and add a CI step to typecheck it? Also, do you know if there are some open source projects that use RBS?

tompng commented 1 year ago

CI step to typecheck

Looks good. I don't know if its easy or hard, but we should try it.

Lrama https://github.com/ruby/lrama started to use RBS and Steep List of gems using RBS as a library https://rubygems.org/gems/rbs/reverse_dependencies

st0012 commented 1 year ago

Firstly, thank you for the considerable effort you've put into this feature ❤️
Utilizing the power of typing in IRB is indeed a significant leap forward not just for IRB, but for the language as well.

General Feedback

I propose we merge this feature before Ruby 3.3.
However, I am against enabling it by default in 3.3.

The reason for the latter is due to the critical nature of this feature in the users' workflow. If anything goes wrong, it could heavily impact our users and potentially erode their trust in IRB. The introduction of the current autocompletion with Ruby 3.1, which was enabled by default, led to numerous issues as seen in #445. Although I believe this feature will be better tested and reviewed with our team of IRB maintainers, I'd like to avoid a similar situation as much as possible.

Public Rollout (Enabling by Default)

Personally, I think we can aim for the end of next year. However, before that, we need to:

Ensure all maintainers understand how this new feature works from the inside out. Some technical documentation, such as diagrams showing how components interact, would be incredibly helpful.
Encourage users to manually enable it and understand their reasons if they choose not to.
Address any issues that arise as adoption grows.

Hopefully by then we can drop Ruby 2.7 and add Prism as a dependency too.

Questions about Maintenance and Future Development

With the completion results now depending on a variety of factors, such as users' type signatures, community-contributed signatures, and the type analyzer itself, we need to consider:

How can we assist users in reporting issues and effectively investigate them?
Given that the type-analyzing mechanism will differ from Steep’s, how do we determine if a result is legitimate or not?
How do we evaluate if the feature is worth the extra maintenance effort?
Is there a contingency plan to replace this feature if our answer to the above question is negative?
Since the debug gem also uses IRB’s completion, do you plan to collaborate with its maintainer Koichi to adopt this feature there?

Additional Thoughts/Concerns

If we incorporate this directly into IRB, we essentially become maintainers of the type analyzers too, which is a substantial task. Relying on a single maintainer for such a significant feature is not ideal, and I'd like to avoid it. However, I'm unsure how we can effectively increase our bandwidth.
Providing good completion experience in rails console will be a motivation for users to write .rbs in their project.

In my opinion, the current completion experience is already satisfactory, largely thanks to your improvements. I believe this feature will primarily enhance the experience for users who have already adopted RBS, rather than encouraging new adoption.

tompng commented 1 year ago

How can we assist users in reporting issues and effectively investigate them?

Add Completion section to irb_info command. This is implemented in #708

Completion: Tab Complete, RegexpCompletor
Completion: Autocomplete, RegexpCompletor
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: loading)
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: 3.3.0.pre.2)
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: #<LoadError: cannot load such file -- rbs>)
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: #<RBS::Collection::Config::CollectionNotAvailable:"rbs collection is not initialized.\nRun `rbs collection install` to install RBSs from collection.\n">)

Given that the type-analyzing mechanism will differ from Steep’s, how do we determine if a result is legitimate or not?

Type analyzing is simple (less functional) than Steep. We should determine fixing the difference is possible in the simplified specification or not. (aims to avoid extra maintenance cost)

How do we evaluate if the feature is worth the extra maintenance effort?

Measure how many user enabling and re-disabling it. Any idea of comparing with extra maintenace effort?

Is there a contingency plan to replace this feature if our answer to the above question is negative?

We can simplify some of the analysis that causing maintenance problem. example:

Remove type analyzing, just parse by Prism and autocomplete literals and variable. (slightly better than RegexpCompletor)
Remove tracking local variable change like a=1; a+=0.5; a="" if condition; a.string_or_float_or_int_method
Remove supporting generic types, Array[Integer] to just Array[untyped]

Since the debug gem also uses IRB’s completion, do you plan to collaborate with its maintainer Koichi to adopt this feature there?

Yes, we should. Not thinking how to yet. Maybe we can add an API to get appropriate completor

st0012 commented 1 year ago

Add Completion section to irb_info command.

That's helpful and I did notice that. But what I'm asking is in a broader sense, like:

What will a good repro for these issues look like? Is it possible to have a single file repro script template that has both rbs and Ruby code?
Should we maybe create a separate issue template so we can ask type-specific questions without bloating other IRB issues' reporting?

Type analyzing is simple (less functional) than Steep.

It will be great to have a few examples demonstrating this in the readme so we set the right expectation to other maintainers and our users.

I think there could also be cases where it's more powerful than Steep given we have runtime info? Examples of them should be mentioned too.

Measure how many user enabling and re-disabling it. Any idea of comparing with extra maintenace effort?

I think we have limited options to measure the benefit it brings to the community. Code searching may not be too helpful because if people enable this at project-level .irbrc, we generally won't see it. Perhaps the most effective way is to host polls on social media platforms, maybe around next RubyKaigi?

Wrt the extra effort, I think the we can get a sense of it by counting PRs associated with these components.

I haven't think of a way to compare the two though. It's likely that we won't be able to make any clear rule on this, but it's important that we keep this in mind.

We can simplify some of the analysis that causing maintenance problem. example:

Thank you for these examples 👍

Not thinking how to yet. Maybe we can add an API to get appropriate completor

That integration could require some extra work because currently its console can't talk to individual threads directly.

tompng commented 1 year ago

For reporting issues, single file repro is hard because of the setup to create binding(self object, module nesting, variables) and environment(sig, rbs_collection.yaml). I think issue template works. These information will help us.

irb_info
Version of RBS and Prism
Content of sig/*.rbs, rbs_collection.yaml and rbs_collection.lock.yaml
IRB::TypeCompletion::Types.rbs_load_error and it's backtrace
IRB::TypeCompletion::Completor.last_completion_error and it's backtrace (the method is not implemented yet)
What you typed to IRB

have a few examples demonstrating this in the readme

host polls on social media platforms

Looks good :+1:

st0012 commented 1 year ago

@tompng Can you start another PR, maybe on top of #708, to draft the documentation for this feature?

I think we need to cover these:

How to activate it and verify the activation
Where it collects signatures from
- This is important because users can define signatures outside of sig/ and configure Steep to locate them, which won't work for this feature atm
Potential impact on performance
Cases where it's more/less accurate than Steep

ima1zumi commented 1 year ago

I read part of IRB#708 and this discussion. Thanks for making the Issue.

My thoughts

I agree with the inclusion of Type Completor. Regexp Completor has a lot of wrong guesses, runs slow and is not user friendly, I think Type Completor will be a useful feature for users.
I have some concerns about whether I can maintain it myself. (I'm reading the PR, but it looks difficult.)
I agree with turning it off by default in 3.3. I would like feedback from users.

hasumikin commented 1 year ago

I vaguely think that it would be helpful to have a debugging feature that displays something like a backtrace of why the current list of autocomplete candidates was generated.

st0012 commented 1 year ago

In Sorbet, there's a T.reveal_type helper:

I think if we have something like irb_reveal_type(my_var) and simply prints the type of my_var, it will be very helpful too.

tompng commented 1 year ago

708 can do something like this:

irb(main):001>  a, b = 1, 'a'
=> [1, "a"]

irb(main):002> IRB::TypeCompletion::Completor.new.analyze('[a, b].map(&:a', binding)
=> [:call, Integer | String, "a", false]

irb(main):003> res = IRB::TypeCompletion::Completor.new.analyze('a.times {|aa| a', binding)
=> [:lvar_or_method, "a", #<IRB::TypeCompletion::Scope:0x0000000107740dd0>]

irb(main):004> res[2].local_variables
=> ["aa", "res", "a", "b", "_"]

irb(main):005> res[2].self_type
=> Object

I think adding a method or command that takes (code_to_complete, binding) and shows formatted output of these can help.

st0012 commented 1 year ago

Now asking to change the current implementation, but I think a benefit of extracting the completor implementation to a gem is that it will give us better dependency control and simpler implementation:

With the current approach, we need to passively check the version of prism installed by users and can't really bind upper bound restriction to rbs and prism to avoid breaking changes. This means that we always need to support the latest rbs and prism change and having ugly version branch in the implementation if we want to maintain backward compatibility.

If type_completion is a separate gem, we can add rbs and prism as dependencies like what we usually do and avoid the above problem. For example:

type_completion v1.1.0 - requires prism >= 0.17, <= 0.18
type_completion v1.2.0 - requires prism >= 0.18, <= 0.19

And in IRB, we simply ask users to have type_completion installed and don't care about prism or rbs changes.

tompng commented 12 months ago

I see. That's right, it makes sense. I think we need to decide name and the api.

Naming

As a separate gem, the name type_completion is too generic because it's a completion only for binding-provided environment, not for editors or other. What do you think of names like these? repl_completion binding_completion irb_completion repl_type_completion binding_type_completion irb_type_completion

Completion API

I think completion_candidates and doc_namespace api is enough for the first version. Although, it's worth providing categorized/detailed completion candidates in the future.

st0012 commented 11 months ago

I've been thinking the word synthesis, which means

the combination of components or elements to form a connected whole.

I think it matches the fact that we combine typing and binding information together to power the completion.

But I'm not sure if people would think the name synthesis_completor too abstract/difficult to understand 🤔

If the above idea doesn't sound good, I'd go with type_completor.

hasumikin commented 11 months ago

Naming is always the hardest thing :)

I don't have a strong opinion but I feel it's better to include irb. So how about irb_type_completor?

tompng commented 11 months ago

I slightly prefer using repl_ over irb_ because it can be also used in debugger's REPL-feature, but irb_ is ok for me. I like irb_type_completor because it looks straighforward and understandable that irb --type-completor needs it.

st0012 commented 11 months ago

Compared with irb_type_completor, I prefer repl_type_completor too so it's not too restricted to a single tool.

ima1zumi commented 11 months ago

I agree with Stan. I prefer repl_type_completor too.

hasumikin commented 11 months ago

repl_type_completor :+1:

tompng commented 11 months ago

Created https://github.com/ruby/repl_type_completor/

tompng commented 11 months ago

Finished by #772

st0012 commented 11 months ago

@tompng Thank you for the vision and the tremendous effort you put into it!

ruby / irb

Better completion using types #734

Background

PullRequest

Performance

Accuracy

Comparision with current RegexpCompletor

With RBS

In Rails apps

Other type analysis libraries

Where to have this code, IRB or bundled/default gem?

General Feedback

Public Rollout (Enabling by Default)

Questions about Maintenance and Future Development

Additional Thoughts/Concerns

708 can do something like this:

Naming

Completion API