mbj / unparser

Turn Ruby AST into semantically equivalent Ruby source
MIT License
310 stars 52 forks source link

safe navigation operator discarded in arithmetic context #345

Closed csabahenk closed 1 year ago

csabahenk commented 1 year ago

Versions:

$ for s in 'a.foo(b)' 'a&.foo(b)' 'a.+(b)' 'a+b' 'a&.+(b)' ;  do echo "$s"; done | \
ruby -runparser -ne '$_.chomp.then { |s|
  puts [Unparser.unparse(*(Unparser.parse_with_comments s)), "# source", s].join(" ")
}'

Output:

a.foo(b) # source a.foo(b)
a&.foo(b) # source a&.foo(b)
a + b # source a.+(b)
a + b # source a+b
a + b # source a&.+(b)

Expected: unparser renders a.+(b) and a&.+(b) to distinct, pairwise equivalent experssions. Actual: unparser renders a.+(b) and a&.+(b) to the same expression (equivalent only to the former).

Note that unparser handles the safe navigation operator properly for "normal" method calls which are not in use as infix pseudo-operators.


I also checked if the bug propagates down to parser:

$ for s in 'a.foo(b)' 'a&.foo(b)' 'a.+(b)' 'a+b' 'a&.+(b)' ;  do echo "$s"; done | \
ruby -rparser/current -rdigest/md5 -ryaml -ne 'BEGIN { $acc=[] }
$_.chomp.then { |s|
  Parser::CurrentRuby.new.parse(Parser::Source::Buffer.new("(string)", source: s)).inspect.then { |ani|
    $acc << Hash["expr"=>s, "parsedmd5"=> Digest::MD5.new.tap{|m| m << ani }.to_s, "parsed"=>ani]
  }
}  
END { $acc.to_set.classify { |r| r["parsedmd5"] }.transform_values(&:to_a).tap { |t| YAML.dump t, $> }}'

Output:

---
b37ae580875577098e664154610a21fe:
  - expr: a.foo(b)
    parsedmd5: b37ae580875577098e664154610a21fe
    parsed: |-
      s(:send,
        s(:send, nil, :a), :foo,
        s(:send, nil, :b))
8bc7c3994c8548cead4da4b9f36a8568:
  - expr: a&.foo(b)
    parsedmd5: 8bc7c3994c8548cead4da4b9f36a8568
    parsed: |-
      s(:csend,
        s(:send, nil, :a), :foo,
        s(:send, nil, :b))
cfccbdaedfa73c9e97cc26f7ced79e8b:
  - expr: a.+(b)
    parsedmd5: cfccbdaedfa73c9e97cc26f7ced79e8b
    parsed: |-
      s(:send,
        s(:send, nil, :a), :+,
        s(:send, nil, :b))
  - expr: a+b
    parsedmd5: cfccbdaedfa73c9e97cc26f7ced79e8b
    parsed: |-
      s(:send,
        s(:send, nil, :a), :+,
        s(:send, nil, :b))
10f949be7faf629eb6ea7b892f2a75e2:
  - expr: a&.+(b)
    parsedmd5: 10f949be7faf629eb6ea7b892f2a75e2
    parsed: |-
      s(:csend,
        s(:send, nil, :a), :+,
        s(:send, nil, :b))

Expected: parser parses s a.+(b) and a&.+(b) to distinct ASTs. Actual: meets the expectation.

mbj commented 1 year ago

@csabahenk Reproduced via unparser native self check system:

bundle exec bin/unparser -e 'a.foo(b)' -e 'a&.foo(b)' -e 'a.+(b)' -e 'a+b' -e 'a&.+(b)'
Success: (string)
Success: (string)
Success: (string)
Success: (string)
(string)
Original-Source:
a&.+(b)
Generated-Source:
a + b
Original-Node:
(csend
  (send nil :a) :+
  (send nil :b))
Generated-Node:
(send
  (send nil :a) :+
  (send nil :b))
Node-Diff:
@@ -1 +1 @@
-(csend
+(send

Error: (string)
mbj commented 1 year ago

@csabahenk This is fixed in release v0.6.8.

OT: What are you using unparser for? I'm still on the search for the people using unparser so much: https://bestgems.org/gems/unparser its being used outside of mutant a lot, where the known OSS dependencies only attribute for a small fraction. Is that you?

csabahenk commented 1 year ago

Thanks for the quick fix!

@mbj I tend to accomplish various data mangling tasks with ruby, as shell one-liners or in Irb... These code sinppets exhibit a tendency to grow and at some point the one-liner format becomes unpractical.¹ Then I expand them to a well-indented code sinppet with unparser.

As I find Ruby to be a killer tool for scripting tasks, unparser became a cornerstone of my interactive usage... However, it would be way too sad for unparser if this would be a visible fraction of its usage :)


¹ I can even be mode detailed about where that point usually lies: I picked up the style of writing Ruby from letf to right... rarely editing what was written, just extending the data pipeline. This style can be smoothly practised without hitting a line break. However, if the code logic needs a substantial tweak, I do need to go back to the middle to modify stuff in there... at that point the existing code stuffed in one line looks to be an inpenetrable sigil soup :) So then, unparser to the rescue! (Also with unparser I can mangle the code back to a one-liner (using the property of uparser generated code being semantically resilient to a s/\n/;/g substitution), so after the tweak I can continue where I left off...)

mbj commented 1 year ago

@csabahenk Hah, thanks for that details. BTW you can run bundle exec unparser **/*.rb in your ruby files to mass check all files against the round trip property, feel free to bring up any error here. I usually fix all but the freaking dynamic string ones where the AST is not good enough to reproduce.

I doubt you do ~40k downloads/d on your described use case. And I think there is some industrial user out there whom I'd love to get in touch with. If that is real volume from CI etc, I assume they'd love to sponsor this work.

Cheers,

Markus