ruby / irb

interactive Ruby
BSD 2-Clause "Simplified" License
390 stars 119 forks source link

Better completion using types #734

Closed tompng closed 11 months ago

tompng commented 1 year ago

Background

Current RegexpCompletor has many limitations. It can't complete chained methods. Many wrong completions. For better completion, we need to parse the incomplete input and perform type analysis. We need to use RBS and an error tolerant parser like Prism.

PullRequest

WIP pull request is opend in #708. Fallbacks to RegexpCompletor if Prism is not available.

Performance

RBS loading performance depends on rbs_collection.yml and project's rbs size under sig/ dir. It takes 0.168721s when there is no yml and sig. Completion performane depends on code size editing in irb. For example below, it only analyze "n.times do\n _1." with binding that knows the actual value of n.

irb(main):123> n.times do
irb(main):124>   _1.█

Analyzing 1161 lines takes 0.072726s. For comparision, coloring code takes 0.057202s

code = File.read 'lib/irb/type_completion/type_analyzer.rb'
# Force to analyze every single nodes
class IRB::TypeCompletion::TypeAnalyzer::DigTarget; def dig?(node)=true; end
measure
IRB::TypeCompletion::Completor.new.completion_candidates code, '.', '', bind: binding;
# => processing time: 0.072726s
IRB::Color.colorize_code(code);
# => processing time: 0.057202s

Accuracy

Comparision with current RegexpCompletor

As far as I know, the new completor can complete every case that RegexpCompletor can complete, even if RBS is not available.

# Completion without RBS that RegexpCompletor can't complete

def foo(n = 0)
  a = n; a = 1.0 if b; a.█ # Integer and Float methods
end

class User < ApplicationRecord
  has_█ # has_many, has_one
  def bar
    sa█ # save, save!
  end
end

module IRB
  A█ # IRB.constants and Object.constants
  @@a█ # IRB's class variable names
end

With RBS

When RBS is available, it can complete more.

# Completion with RBS

'a'.bytes.shuffle.join.encoding.█ # Encoding methods

10.times.map(&:to_s).map { _1.to_sym }.map(&:█) # Symbol methods

object.instance_eval do
  @a█ # object's instance method names
  self.█ # object's methods
end

%w[a b c].each_with_index do
  s = [:s] * _1
  a = [:a] * _2
  s[0].█ # String methods
  a[0].█ # Symbol methods
end

In Rails apps

Adding rbs_collection.yaml and sig/*.rbs to the Rails project will significantly improve accuracy. Article.first.author.email.und can complete String#underscore if types are defined in project's sig directory. Providing good completion experience in rails console will be a motivation for users to write .rbs in their project.

Other type analysis libraries

This completor is a REPL-first type analyzer. It uses few information about the project codebase and many dynamic information from binding. Other type analyzer uses lots of project's code and no binding information. I think it's hard to use another type analyzer and integrate with binding information. Maybe this could change in the future.

Where to have this code, IRB or bundled/default gem?

I want to have this code inside IRB's repository.

I want to enable it by default, so this code should be inside IRB or default/bundled gem, not normal gem that needs installation. Want to completely replace RegexpCompletor in the future, I think default completor can be controlled inside IRB. I think the possible user of REPL-first completor is only IRB. It's easy to decide and have it inside IRB. As a bundled or default gem, it should be discussed in bugs.ruby-lang.org. My opinion is, let's include it, get feedbacks, consider splitting it or not later.

st0012 commented 1 year ago

To help us better test this feature, what do you think about adding rbs files to IRB and add a CI step to typecheck it? Also, do you know if there are some open source projects that use RBS?

tompng commented 1 year ago

CI step to typecheck

Looks good. I don't know if its easy or hard, but we should try it.

Lrama https://github.com/ruby/lrama started to use RBS and Steep List of gems using RBS as a library https://rubygems.org/gems/rbs/reverse_dependencies

st0012 commented 1 year ago

Firstly, thank you for the considerable effort you've put into this feature ❤️
Utilizing the power of typing in IRB is indeed a significant leap forward not just for IRB, but for the language as well.

General Feedback

  1. I propose we merge this feature before Ruby 3.3.
  2. However, I am against enabling it by default in 3.3.

The reason for the latter is due to the critical nature of this feature in the users' workflow. If anything goes wrong, it could heavily impact our users and potentially erode their trust in IRB. The introduction of the current autocompletion with Ruby 3.1, which was enabled by default, led to numerous issues as seen in #445. Although I believe this feature will be better tested and reviewed with our team of IRB maintainers, I'd like to avoid a similar situation as much as possible.

Public Rollout (Enabling by Default)

Personally, I think we can aim for the end of next year. However, before that, we need to:

Hopefully by then we can drop Ruby 2.7 and add Prism as a dependency too.

Questions about Maintenance and Future Development

With the completion results now depending on a variety of factors, such as users' type signatures, community-contributed signatures, and the type analyzer itself, we need to consider:

Additional Thoughts/Concerns

tompng commented 1 year ago

How can we assist users in reporting issues and effectively investigate them?

Add Completion section to irb_info command. This is implemented in #708

Completion: Tab Complete, RegexpCompletor
Completion: Autocomplete, RegexpCompletor
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: loading)
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: 3.3.0.pre.2)
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: #<LoadError: cannot load such file -- rbs>)
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: #<RBS::Collection::Config::CollectionNotAvailable:"rbs collection is not initialized.\nRun `rbs collection install` to install RBSs from collection.\n">)

Given that the type-analyzing mechanism will differ from Steep’s, how do we determine if a result is legitimate or not?

Type analyzing is simple (less functional) than Steep. We should determine fixing the difference is possible in the simplified specification or not. (aims to avoid extra maintenance cost)

How do we evaluate if the feature is worth the extra maintenance effort?

Measure how many user enabling and re-disabling it. Any idea of comparing with extra maintenace effort?

Is there a contingency plan to replace this feature if our answer to the above question is negative?

We can simplify some of the analysis that causing maintenance problem. example:

Since the debug gem also uses IRB’s completion, do you plan to collaborate with its maintainer Koichi to adopt this feature there?

Yes, we should. Not thinking how to yet. Maybe we can add an API to get appropriate completor

st0012 commented 1 year ago

Add Completion section to irb_info command.

That's helpful and I did notice that. But what I'm asking is in a broader sense, like:

Type analyzing is simple (less functional) than Steep.

It will be great to have a few examples demonstrating this in the readme so we set the right expectation to other maintainers and our users.

I think there could also be cases where it's more powerful than Steep given we have runtime info? Examples of them should be mentioned too.

Measure how many user enabling and re-disabling it. Any idea of comparing with extra maintenace effort?

I think we have limited options to measure the benefit it brings to the community. Code searching may not be too helpful because if people enable this at project-level .irbrc, we generally won't see it. Perhaps the most effective way is to host polls on social media platforms, maybe around next RubyKaigi?

Wrt the extra effort, I think the we can get a sense of it by counting PRs associated with these components.

I haven't think of a way to compare the two though. It's likely that we won't be able to make any clear rule on this, but it's important that we keep this in mind.

We can simplify some of the analysis that causing maintenance problem. example:

Thank you for these examples 👍

Not thinking how to yet. Maybe we can add an API to get appropriate completor

That integration could require some extra work because currently its console can't talk to individual threads directly.

tompng commented 1 year ago

For reporting issues, single file repro is hard because of the setup to create binding(self object, module nesting, variables) and environment(sig, rbs_collection.yaml). I think issue template works. These information will help us.

have a few examples demonstrating this in the readme

host polls on social media platforms

Looks good :+1:

st0012 commented 1 year ago

@tompng Can you start another PR, maybe on top of #708, to draft the documentation for this feature?

I think we need to cover these:

ima1zumi commented 1 year ago

I read part of IRB#708 and this discussion. Thanks for making the Issue.

My thoughts

hasumikin commented 1 year ago

I vaguely think that it would be helpful to have a debugging feature that displays something like a backtrace of why the current list of autocomplete candidates was generated.

st0012 commented 1 year ago

In Sorbet, there's a T.reveal_type helper:

Screenshot 2023-11-06 at 14 16 47

I think if we have something like irb_reveal_type(my_var) and simply prints the type of my_var, it will be very helpful too.

tompng commented 1 year ago

708 can do something like this:

irb(main):001>  a, b = 1, 'a'
=> [1, "a"]

irb(main):002> IRB::TypeCompletion::Completor.new.analyze('[a, b].map(&:a', binding)
=> [:call, Integer | String, "a", false]

irb(main):003> res = IRB::TypeCompletion::Completor.new.analyze('a.times {|aa| a', binding)
=> [:lvar_or_method, "a", #<IRB::TypeCompletion::Scope:0x0000000107740dd0>]

irb(main):004> res[2].local_variables
=> ["aa", "res", "a", "b", "_"]

irb(main):005> res[2].self_type
=> Object

I think adding a method or command that takes (code_to_complete, binding) and shows formatted output of these can help.

st0012 commented 1 year ago

Now asking to change the current implementation, but I think a benefit of extracting the completor implementation to a gem is that it will give us better dependency control and simpler implementation:

With the current approach, we need to passively check the version of prism installed by users and can't really bind upper bound restriction to rbs and prism to avoid breaking changes. This means that we always need to support the latest rbs and prism change and having ugly version branch in the implementation if we want to maintain backward compatibility.

If type_completion is a separate gem, we can add rbs and prism as dependencies like what we usually do and avoid the above problem. For example:

And in IRB, we simply ask users to have type_completion installed and don't care about prism or rbs changes.

tompng commented 12 months ago

I see. That's right, it makes sense. I think we need to decide name and the api.

Naming

As a separate gem, the name type_completion is too generic because it's a completion only for binding-provided environment, not for editors or other. What do you think of names like these? repl_completion binding_completion irb_completion repl_type_completion binding_type_completion irb_type_completion

Completion API

I think completion_candidates and doc_namespace api is enough for the first version. Although, it's worth providing categorized/detailed completion candidates in the future.

st0012 commented 11 months ago

I've been thinking the word synthesis, which means

the combination of components or elements to form a connected whole.

I think it matches the fact that we combine typing and binding information together to power the completion.

But I'm not sure if people would think the name synthesis_completor too abstract/difficult to understand 🤔

If the above idea doesn't sound good, I'd go with type_completor.

hasumikin commented 11 months ago

Naming is always the hardest thing :)

I don't have a strong opinion but I feel it's better to include irb. So how about irb_type_completor?

tompng commented 11 months ago

I slightly prefer using repl_ over irb_ because it can be also used in debugger's REPL-feature, but irb_ is ok for me. I like irb_type_completor because it looks straighforward and understandable that irb --type-completor needs it.

st0012 commented 11 months ago

Compared with irb_type_completor, I prefer repl_type_completor too so it's not too restricted to a single tool.

ima1zumi commented 11 months ago

I agree with Stan. I prefer repl_type_completor too.

hasumikin commented 11 months ago

repl_type_completor :+1:

tompng commented 11 months ago

Created https://github.com/ruby/repl_type_completor/

tompng commented 11 months ago

Finished by #772

st0012 commented 11 months ago

@tompng Thank you for the vision and the tremendous effort you put into it!