coatl / redparse

RedParse is a ruby parser written in pure ruby.
redparse.rubyforge.org
GNU Lesser General Public License v2.1

= RedParse

== DESCRIPTION:

RedParse is a ruby parser (and parser-compiler) written in pure ruby. Instead of YACC or ANTLR, its parse tool is a home-brewed language. (The tool is (at least) LALR(1)-equivalent and the 'parse language' is pretty nice, even in its current form.)

My intent is to have a completely correct parser for ruby, in 100% ruby. And I think I've more or less succeeded. Aside from some fairly minor quibbles (see below), RedParse can parse all known ruby 1.8 and 1.9 constructions correctly. Input text may be encoded in ascii, binary, utf-8, iso-8859-1, and the euc-* family of encodings. Sjis is not yet supported.

== INSTALL:

== LICENSE:

RedParse is available under the Library General Public License (LGPL). Please see COPYING.LGPL for details.

== Benefits:

== Drawbacks:

== SYNOPSIS:

simple example of usage:

  require 'redparse'

  parser=RedParse.new("some ruby code here")
  tree=parser.parse

  tree.walk{|parent,i,subi,node|
    case node
    when RedParse::CallNode
      #... do something with method calls
    when RedParse::AssignNode
      #... maybe alter assignments somehow
    #.... and so on
    end
  }

  #presumably tree was altered somehow in the walk-"loop" above
  #when done mucking with the tree, you can turn it into one
  #of two other formats: ParseTree s-exps or ruby source code.

  tree.to_parsetree  #=> turns a tree into a ParseTree-style s-exp.

  tree.unparse({})   #=> turns a tree back into ruby source code.

  #to understand the tree format, you must understand the node classes,
  #which are documented in the next section.
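The traversal pattern in the synopsis can be imitated on plain nested Arrays, without RedParse installed. What follows is only a sketch: sketch_walk and the classes WalkNode, WalkCall and WalkAssign are hypothetical names that are not part of RedParse, and the meaning given here to the subi argument (an index inside a plain Array slot, nil otherwise) is an assumption rather than documented behavior.

```ruby
# Hypothetical stand-ins for node classes; like RedParse nodes they descend from Array.
class WalkNode < Array; end
class WalkCall < WalkNode; end    # slots used here: receiver, name, params
class WalkAssign < WalkNode; end  # slots used here: left, op, right

# Visit every WalkNode below +parent+, yielding (parent, i, subi, node).
# Assumption: subi is nil for a node stored directly in slot i, and an
# index into the plain Array when slot i holds a list of nodes.
def sketch_walk(parent, &blk)
  parent.each_with_index do |slot, i|
    if slot.is_a?(WalkNode)
      blk.call(parent, i, nil, slot)
      sketch_walk(slot, &blk)
    elsif slot.is_a?(Array)
      slot.each_with_index do |sub, subi|
        next unless sub.is_a?(WalkNode)
        blk.call(parent, i, subi, sub)
        sketch_walk(sub, &blk)
      end
    end
  end
end

# Roughly the shape of: x = foo(bar())
inner = WalkCall[nil, "bar", nil]
outer = WalkCall[nil, "foo", [inner]]
tree  = WalkNode[WalkAssign["x", "=", outer]]

# Dispatch on node class, as the synopsis does with case/when.
seen = []
sketch_walk(tree) do |parent, i, subi, node|
  case node
  when WalkCall   then seen << [:call, node[1], subi]
  when WalkAssign then seen << [:assign, node[1], subi]
  end
end
```

The case/when dispatch works because Class#=== tests instance membership, which is the same mechanism the synopsis relies on.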

== NODE TYPES:

Syntax trees are represented by trees of nested Nodes. All Nodes descend from Array, and their subnodes can be addressed by numeric index, just like normal Arrays. However, many subnodes want to have names as well, thus most (but not all) array slots within the various Node classes have names. The general rule is that Node slots may contain a Node, a plain Array, a String, or nil. However, many cases are more specific than that. Specific Node classes are documented briefly below in this format:

  NodeName   #comments describing node
      (slot1: Type, slot2: Type)
        -OR-
      (Array[Type*])

Here's an example of how to use this imaginary Node:

  if NodeName===node
    do_something_with node.slot1
    do_something_else_with node.slot2

     -OR-

    do_something_with node[0]  #slot1
    do_something_else_with node[1]  #slot2
  end
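One way to picture a Node that is an Array whose slots also have names is sketched below. This is not RedParse's implementation; SlotNode, its slots helper, and PairNode are hypothetical names invented for illustration.

```ruby
# Hypothetical base class: an Array whose slots can also be read and written by name.
class SlotNode < Array
  # Define a reader and a writer for each name, mapped to consecutive indexes.
  def self.slots(*names)
    names.each_with_index do |name, idx|
      define_method(name) { self[idx] }
      define_method("#{name}=") { |val| self[idx] = val }
    end
  end
end

# A made-up node type with two named slots.
class PairNode < SlotNode
  slots :slot1, :slot2
end

node = PairNode["first", "second"]

if PairNode === node
  by_name  = [node.slot1, node.slot2]  # access through slot names
  by_index = [node[0], node[1]]        # access through numeric indexes
end

node.slot2 = "changed"  # writing through the name updates the array slot
```

Both access styles read the same storage, so code that only knows numeric indexes and code that uses names can be mixed freely.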

Types are specified in a pseudo-BNF syntax. | * + ? all have the same meaning as in Regexp. Array[Spec] indicates a plain Array (not a Node). The Spec describes the constraints on the Array's contents. In the cases where node slots don't have names, there will be no colon-terminated slot name(s) on the second line, just an Array[] specification.
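Since the repetition operators mean what they mean in Regexp, a spec such as Array[Expr+,UnaryStarNode?,UnAmpNode?] can be read as a regular expression over the classes of the Array's elements. The sketch below takes that reading literally. The stub classes and the matches_spec? helper are hypothetical, each element is simplified to exactly one letter, and RedParse is not claimed to validate nodes this way.

```ruby
# Hypothetical stand-in classes (not RedParse's own).
class ExprStub; end   # plays the role of Expr
class StarStub; end   # plays the role of UnaryStarNode
class AmpStub;  end   # plays the role of UnAmpNode

# One letter per element class.
LETTERS = { ExprStub => "e", StarStub => "s", AmpStub => "a" }.freeze

# Array[Expr+,UnaryStarNode?,UnAmpNode?] written as a Regexp over those letters.
PARAMS_SPEC = /\Ae+s?a?\z/

# Encode the array as a string of letters and match it against the spec.
def matches_spec?(array, spec)
  encoded = array.map { |elem| LETTERS.fetch(elem.class, "?") }.join
  spec.match?(encoded)
end

ok          = matches_spec?([ExprStub.new, ExprStub.new, StarStub.new], PARAMS_SPEC)
too_few     = matches_spec?([StarStub.new], PARAMS_SPEC)
wrong_order = matches_spec?([ExprStub.new, AmpStub.new, StarStub.new], PARAMS_SPEC)
```

Here ok is accepted, too_few is rejected because Expr+ needs at least one Expr, and wrong_order is rejected because the optional star must come before the optional ampersand argument.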

This is a final depiction of the syntax tree. There may be additions to the existing format in the future, but no incompatibility-creating changes.

Several abbreviations are used:
  Expr means ValueNode
  LValue means ConstantNode|VarNode|UnaryStarNode|CallNode|
               BracketsGetNode|AssigneeList[LValue*]
  UnAmpNode means UnOpNode with op == "&"

  Node<Array            #abstract ancestor of all nodes
  +-RescueNode          #a rescue clause in a def or begin statement
                         (exceptions: Array[Expr*], varname: VarNode|nil, action: Expr)
  +-WhenNode            #a when clause in a case statement
                         (when: Expr|Array[Expr+], then: Expr|nil)
  +-ElsifNode           #an elsif clause in an if statement
                         (elsif: Expr, then: Expr|nil)
  +-ValueNode           #abstract, a node which has a value (an expression)
  +-VarNode             #represents variables and constants
                         (ident: String)
  +-ListOpNode          #abstract, ancestor for nodes which are lists of
                        #things separated by some op
  +-SequenceNode        #a sequence of statements
                         (Array[Expr*])
  +-ConstantNode        #a constant expression of the form A::B::C or the like
                        #first expression can be anything
                         (Array[String|Expr|nil,String+])
  +-RawOpNode           #ancestor of all binary operators (except . :: ; , ?..:)
                         (left: Expr, op: String, right: Expr)
  +-RangeNode           #a range literal node
  +-KeywordOpNode       #abstract, ancestor of keyword operators
  +-LogicalNode         #and or && || expressions
  +-WhileOpNode         #while as an operator
  +-UntilOpNode         #until as an operator
  +-IfOpNode            #if as an operator
  +-UnlessOpNode        #unless as an operator
  +-RescueOpNode        #rescue as an operator
                         (body: Expr, rescues: Array[RescueNode*])
  +-OpNode              #ancestor of some binary operators (those with methods hidden in them)
  +-NotEqualNode        #!= expressions
  +-MatchNode           #=~ expressions
  +-NotMatchNode        #!~ expressions
  +-LiteralNode         #literal symbols, integers
                         (val: Numeric|Symbol|StringNode)
  +-StringNode          #literal strings
                         (Array[(String|Expr)+])
  +-HereDocNode         #here documents
  +-StringCatNode       #adjacent strings are catenated ("foo" "bar" == "foobar")
                         (Array[StringNode+])
  +-NopNode             #an expression with no tokens at all in it
                         (no attributes)
  +-VarLikeNode         #nil,false,true,__FILE__,__LINE__,self
                         (name: String)
  +-UnOpNode            #unary operators
                         (op: String, val: Expr)
  +-UnaryStarNode       #unary star (splat)
  +-DanglingStarNode    #unary star with no argument
                         (no attributes)
  +-DanglingCommaNode   #comma with no rhs
                         (no attributes)
  +-BeginNode           #begin..end block
                         (body: Expr|nil, rescues: Array[RescueNode*],
                          else: Expr|nil, ensure: Expr|nil)
  +-ParenedNode         #parenthesized expressions
                         (body: Expr)
  +-AssignNode          #assignment (including eg +=)
                         (left: AssigneeList|LValue, op: String, right: Array[Expr*]|Expr)
  +-AssigneeList        #abstract, comma-delimited list of assignables
                         (Array[LValue*])
  +-NestedAssign        #nested lhs, in parentheses
  +-MultiAssign         #regular top-level lhs
  +-BlockParams         #block formal parameter list
  +-CallSiteNode        #abstract, method calls
                         (receiver: Expr|nil, name: String,
                          params: nil|Array[Expr+,UnaryStarNode?,UnAmpNode?],
                          blockparams: BlockParams|nil, block: Expr|nil)
  +-CallNode            #normal method calls
  +-KWCallNode          #keywords that look (more or less) like methods (BEGIN END yield return break continue next)
  +-ArrayLiteralNode    #[..]
                         (Array[Expr*])
  +-IfNode              #if..end and unless..end
                         (if: Expr, then: Expr|nil, elsifs: Array[ElsifNode+]|nil, else: Expr|nil)
  +-LoopNode            #while..end and until..end
                         (while: Expr, do: Expr|nil)
  +-CaseNode            #case..end
                         (case: Expr|nil, whens: Array[WhenNode*], else: Expr|nil)
  +-ForNode             #for..end
                         (for: LValue, in: Expr, do: Expr|nil)
  +-HashLiteralNode     #{..}
                         (Array[Expr*]) (size must be even)
  +-TernaryNode         # ? .. :
                         (if: Expr, then: Expr, else: Expr)
  +-MethodNode          #def..end
                         (receiver: Expr|nil, name: String,
                          params: Array[VarNode,AssignNode,UnaryStarNode?,UnAmpNode?]|nil,
                          body: Expr|nil, rescues: Array[RescueNode+]|nil,
                          else: Expr|nil, ensure: Expr|nil)
  +-AliasNode           #alias foo bar
                         (to: String|VarNode|StringNode, from: String|VarNode|StringNode)
  +-UndefNode           #undef foo
                         (Array[String|StringNode+])
  +-NamespaceNode       #abstract
  +-ModuleNode          #module..end
                         (name: VarNode|ConstantNode, body: Expr|nil,
                          rescues: Array[RescueNode+]|nil, else: Expr|nil, ensure: Expr|nil)
  +-ClassNode           #class..end
                         (name: VarNode|ConstantNode, parent: Expr|nil, body: Expr|nil,
                          rescues: Array[RescueNode+]|nil, else: Expr|nil, ensure: Expr|nil)
  +-MetaClassNode       #class<<x..end
                         (val: Expr, body: Expr|nil, rescues: Array[RescueNode+]|nil,
                          else: Expr|nil, ensure: Expr|nil)
  +-BracketsGetNode     #a[b]
                         (receiver: Expr, params: Array[Expr+,UnaryStarNode?]|nil)

  ErrorNode             #mixed in to nodes with a syntax error
  +-MisparsedNode       #mismatched braces or begin..end or the like
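As one concrete reading of the listing above, the HashLiteralNode entry describes a flat Array whose size must be even. The sketch below assumes this is because keys and values alternate, which is an interpretation rather than something the listing states. Plain strings stand in for Expr nodes and no RedParse class is used.

```ruby
# Stand-in for the contents of a hash literal node, with strings in place of Expr nodes.
# Assumption: keys and values alternate, which is why the size must be even.
flat = ["k1", "v1", "k2", "v2"]

raise ArgumentError, "odd number of elements" unless flat.size.even?

# Regroup the flat list into [key, value] pairs.
pairs  = flat.each_slice(2).to_a
keys   = pairs.map(&:first)
values = pairs.map(&:last)
```

Grouping with each_slice(2) leaves the flat Array untouched, so index-based access to the original slots keeps working.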

== REQUIREMENTS:

== Known problems with the parser:

== Known problems with the unparser:

== Known problems with ParseTree creator

== Bugs in ruby

== Bugs in ParseTree

== Copyright

redparse - a ruby parser written in ruby
Copyright (C) 2008,2009, 2012, 2016  Caleb Clausen

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.