gazayas / masamune-ast

A convenience wrapper around Prism, a Ruby source code parser
MIT License

Use syntax_tree statements to display more precise node information #11

Closed gazayas closed 1 year ago

gazayas commented 1 year ago

Had a talk here at RubyKaigi with a Ruby committer, and they said Ripper is semi-deprecated and that syntax_tree is a better option, so I'd like to use it.

Since this gem is still in the early stages of development, it shouldn't be that hard to implement, so I plan to prioritize this over #10.

gazayas commented 1 year ago

Also considering the referral gem.

gazayas commented 1 year ago

Actually, now that I'm looking over syntax_tree, I feel like it would take more work than just using Ripper. It's a great interface, but I don't think it'll currently be able to handle what I'm looking for.

syntax_tree CLI

At first I thought it would be great because running stree expr file_name.rb returns an easy-to-read AST like this:

10.times do |n|
  puts n
end
> stree expr file_name.rb
SyntaxTree::MethodAddBlock[
  call: SyntaxTree::CallNode[
    receiver: SyntaxTree::Int[value: "10"],
    operator: SyntaxTree::Period[value: "."],
    message: SyntaxTree::Ident[value: "times"]
  ],
  block: SyntaxTree::BlockNode[
    block_var: SyntaxTree::BlockVar[
      params: SyntaxTree::Params[requireds: [SyntaxTree::Ident[value: "n"]]]
    ],
    bodystmt: SyntaxTree::BodyStmt[
      statements: SyntaxTree::Statements[
        body: [
          SyntaxTree::Command[
            message: SyntaxTree::Ident[value: "puts"],
            arguments: SyntaxTree::Args[
              parts: [SyntaxTree::VarRef[value: SyntaxTree::Ident[value: "n"]]]
            ]
          ]
        ]
      ]
    ]
  ]
]

This is much easier to decipher than Ripper.sexp's output, but the problem is that you can only pass one expression at a time. For example, if I edit the code like this...

10.times do |n|
  puts n
end
# Adding a comment here.

It will yield the following message:

> stree expr file_name.rb
The input to `stree expr` must be a single expression.

This won't work since we need to handle multiple expressions. I would also like to get this AST output from within a Ruby file rather than from the CLI, but the SyntaxTree library doesn't print the same output by default, and I haven't found out how to do that yet.

syntax_tree's SyntaxTree library

I wanted to use the SyntaxTree library instead, but it doesn't behave the same way as the CLI, and SyntaxTree.parse outputs information very similar to what Ripper.sexp does:

[1] pry(#<TestMasamune>)> code = <<~CODE
[1] pry(#<TestMasamune>)* 10.times do |n|
[1] pry(#<TestMasamune>)*   puts n
[1] pry(#<TestMasamune>)* end
[1] pry(#<TestMasamune>)* CODE
=> "10.times do |n|\n  puts n\nend\n"
[2] pry(#<TestMasamune>)> SyntaxTree.parse code
=> (program
  (statements
    ((method_add_block
        (call (int "10") (period ".") (ident "times"))
        (block (block_var (params ((ident "n")))) (bodystmt (statements ((command (ident "puts") (args ((var_ref (ident "n")))))))))))))

However, the line positions, which are necessary for the Masamune replacement logic to work, aren't here. If I have to write more code for syntax_tree to perform the commands I want it to do, I figure I might as well continue working on this library and keep working with Ripper until it's completely deprecated.
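For comparison, here is a rough sketch of how line positions can be pulled out of Ripper.sexp's output using only the standard library. The helper name positioned_tokens is hypothetical and this is not Masamune's actual implementation; it only assumes that Ripper's scanner-event nodes have the shape [:@type, "value", [line, column]].

```ruby
require "ripper"

# Hypothetical helper (not part of Masamune or Ripper): walks the nested
# arrays returned by Ripper.sexp and collects every scanner-event node,
# i.e. arrays shaped like [:@ident, "n", [line, column]].
def positioned_tokens(node, found = [])
  return found unless node.is_a?(Array)

  if node[0].is_a?(Symbol) && node[0].to_s.start_with?("@")
    type, value, (line, column) = node
    found << { type: type, value: value, line: line, column: column }
  else
    # Not a leaf token, so keep descending into the children.
    node.each { |child| positioned_tokens(child, found) }
  end

  found
end

code = <<~CODE
  10.times do |n|
    puts n
  end
CODE

positioned_tokens(Ripper.sexp(code)).each do |token|
  puts "#{token[:type]} #{token[:value].inspect} at line #{token[:line]}, column #{token[:column]}"
end
```

Lines are 1-indexed and columns are 0-indexed byte offsets, so multibyte source would need extra care before using the columns as character indexes.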

Side Note

syntax_tree also uses Ripper anyways: https://github.com/ruby-syntax-tree/syntax_tree/blob/2ae6d6cc8412b74f0dcca7747ed5cb1c22d7907b/lib/syntax_tree.rb#L142

gazayas commented 1 year ago

Okay, I think I found a meaningful way to put this together with syntax_tree:

code = <<~CODE
10.times do |n|
  puts n
end
# Add comment to show multiple expressions can be handled.
CODE

program = SyntaxTree.parse(code)
program.statements.body.each {|stmt| puts stmt.construct_keys.split("\n") }

This will print out the tree, and construct_keys gives a name to each portion:

SyntaxTree::MethodAddBlock[
  call: SyntaxTree::CallNode[
    receiver: SyntaxTree::Int[value: "10"],
    operator: SyntaxTree::Period[value: "."],
    message: SyntaxTree::Ident[value: "times"]
  ],
  block: SyntaxTree::BlockNode[
    block_var: SyntaxTree::BlockVar[
      params: SyntaxTree::Params[requireds: [SyntaxTree::Ident[value: "n"]]]
    ],
    bodystmt: SyntaxTree::BodyStmt[
      statements: SyntaxTree::Statements[
        body: [
          SyntaxTree::Command[
            message: SyntaxTree::Ident[value: "puts"],
            arguments: SyntaxTree::Args[
              parts: [SyntaxTree::VarRef[value: SyntaxTree::Ident[value: "n"]]]
            ]
          ]
        ]
      ]
    ]
  ]
]
SyntaxTree::Comment[
  value: "# Add comment to show multiple expressions can be handled."
]

Figured this out by looking at the Expr class.

This should work, but I just want to make sure that it's not missing any nodes and that it covers all of the code being passed to it.

gazayas commented 1 year ago

This still doesn't solve the line position issue, but it helps with readability.
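To illustrate why the positions matter, here is a minimal sketch of position-based replacement built on Ripper.lex from the standard library. The method name rename_identifier is hypothetical and this is an assumption about the general approach, not how Masamune actually does it. It also assumes ASCII source, since Ripper reports columns as byte offsets.

```ruby
require "ripper"

# Hypothetical helper: renames every identifier token equal to `from`
# by using the [line, column] positions that Ripper.lex reports.
def rename_identifier(code, from, to)
  lines = code.lines

  Ripper.lex(code)
        .select { |_pos, type, token, _state| type == :on_ident && token == from }
        .map(&:first) # keep only the [line, column] pairs
        .sort
        .reverse_each do |line, column|
    # Work from the last position backwards so that earlier columns on the
    # same line are not shifted by a replacement of a different length.
    lines[line - 1][column, from.length] = to
  end

  lines.join
end

code = <<~CODE
  10.times do |n|
    puts n
  end
CODE

puts rename_identifier(code, "n", "number")
```

Because the match is on identifier tokens rather than raw text, the n inside "end" is left alone, which is the kind of mistake a plain string substitution would make.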

gazayas commented 1 year ago

Now that we can get line positions and tokens from DataNodes, I want to consider how we can use syntax_tree's statements to provide useful information to developers.

Basically, the same as the comment above:

program = SyntaxTree.parse(code)
program.statements.body.each {|stmt| puts stmt.construct_keys.split("\n") }

I'm thinking we could combine this with msmn.pry (#23).

gazayas commented 1 year ago

Closing in favor of #52 (See this comment)