whitequark / parser

A Ruby parser.
Other
1.58k stars 198 forks source link

ruby/ruby@4adb012 #826

Closed iliabylich closed 2 years ago

iliabylich commented 2 years ago

https://github.com/ruby/ruby/commit/4adb012926f8bd6011168327d8832cf19976de40

iliabylich commented 2 years ago

Looks like this commit requires changes that don't match our current AST format.

I don't want to introduce a new AST node, what's going to happen if we make name of the blockarg Node nullable? That'd break everything, right?

/cc @marcandre @koic @mbj @palkan @pocke any thoughts? Initially I had an idea that it's possible to make it nullable behind a compatibility flag and emit s(:blockarg, :anonymous) when Builder.emit_anonymous_blockarg_as_named is set to false and s(:blockarg) when it's true. Same on the call site. WDYT? We could set expr_l to name_l fo such case, so locations are easy to "fix".

How many libraries do blindly call Builder.modernize? Now I think this method is conceptually wrong.

mbj commented 2 years ago

I don't want to introduce a new AST node, what's going to happen if we make name of the blockarg Node nullable?

Yes, but while the semantics of the AST did not really change, this is a case our AST representation should change if we wanted to be able to replicate the lvar scope from the AST (unparsers domain).

That'd break everything, right?

So I'm fine with making a breaking change to the AST format, making the name nullable and teach unparser that. Would not be the first time.

mbj commented 2 years ago

If we did this:

s(:blockarg, :anonymous)

this breaks tools that analyize the lvar scope:

def foo(&)
  anonymous = some_expr # anonymous overwritten from tools analyzing the lvar scope on that compatibility AST proposal
end

Correctly describing the lvar scope at any time is one of the duties of the AST, hence: IMO we have to accept a breaking change to the AST if ruby did so.

marcandre commented 2 years ago

To detail what @mbj wrote, it is ok if previously unwritable Ruby code produces previously unobtainable AST, as long as all previously writable code keeps producing the same AST. Ideally, the number of children of existing AST types does not decrease, but even that shouldn't be considered a hard compatibility rule.

So no compatibility flag required here, as there was never a different AST produced for def foo(&).

In this case, how about we nilify the corresponding parameters?

$ ruby-parse -e "def foo(&)= bar(&)"
(def :foo
  (args
    (blockarg nil))
  (send nil :bar
    (block-pass nil)))

This is possible as neither nils require a location.

marcandre commented 2 years ago

Alternative would be to have the block-pass be something like (block-pass (unnamed-block)). This could make upgrading the dependents of parser a bit easier (since some may assume that block-pass always have a single Node child), but that new node would have to have an empty location which is something that parser has been reticent to do.

For Rubocop: either would require extremely minimal changes. Details: For rubocop-ast, there's a single trivial change to do if block-pass can have a nil child (as we optimize traversal to the point of assuming it is a Node). For rubocop itself, only 9 cops refer to block-pass and none appear to care what the child is, except for one that is very limited in scope, for Ruby < 3 only, and uses the child's location for dubious reasons (NonDeterministicRequireOrder).

palkan commented 2 years ago

I'm not using blockarg explicitly anywhere, so making it nullable shouldn't hurt. However, adding a compatibility flag (at least for the transition period) makes sense.

emit s(:blockarg, :anonymous) when Builder.emit_anonymous_blockarg_as_named is set to false

Is there a typo? Shouldn't it be true?

Also, what about using s(:blockarg, :_) for that?

iliabylich commented 2 years ago

Thanks everyone for your feedback.

I think what @marcandre says makes a lot of sense.

parser for Ruby < 3.1 will not be affected by making blockarg nullable. parser for Ruby >= 3.1 will be affected, but only if new syntax is used.

So if you don't use new syntax you don't need to adapt your code. If you use it - you need to adapt, but it's also required for new node types in absolutely the same way.