Closed dongli closed 11 years ago
It would be easier to answer if you included the code for your parser, with the rules that match the input text and turn it into that tree. That is, the code in your class FortranParser < Parslet::Parser.
The solution might be as simple as omitting .as(:id) from your closing match of foo_t.
Hi @roryokane! You can clone the gist (https://gist.github.com/dongli/5791976) and run:
rspec --fail-fast rspec_fortran_parser.rb
See the last test. The rule is derived_type_definition.
First of all, kudos for writing this parser :)
As @roryokane says, it might be as simple as omitting .as(:id); however, in your parser that is buried deeply on line 203:
spaced(( match('[a-zA-Z_]') >> match('[a-zA-Z0-9]').repeat >> match('[a-zA-Z0-9_]').repeat ).as(:id))
Usually it is best not to embed as statements this deeply, and instead to have them describe more complex semantic structures. However, I'm not sure whether this is possible in your case.
Maybe you can push the as(:id) away from the id node/rule?
Just to get you onto the idea of transforms, it is completely ok to transform the parse tree into another parse tree using transformers. For example, to get what you want, you could match only the case you describe and eliminate the id:
class Trans < Parslet::Transform
  rule(:type_name => { :id => simple(:a) }, :new_line => simple(:b), :declarations => sequence(:c), :id => simple(:x)) do
    { :type_name => { :id => a }, :new_line => b, :declarations => c }
  end
end
Then run it on your tree:
tree = Trans.new.apply tree
However, I'm not saying you should use it in this case; it is just a pointer that transforms can be used on a parse tree to get a slightly pruned parse tree.
But usually (always?), solutions can be found in restructuring the parser.
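The pruning idea itself can be sketched without Parslet. The following is a minimal illustration in plain Ruby, assuming the tree is a Hash shaped like the one in this issue. Plain strings stand in for Parslet slices, and drop_trailing_id is a hypothetical helper name, not part of Parslet or of the parser in the gist.

```ruby
# Hypothetical helper (not part of Parslet): returns a copy of a
# derived-type node without its top-level :id entry, which only repeats
# the name already stored under :type_name.
def drop_trailing_id(node)
  return node unless node.is_a?(Hash) && node.key?(:type_name)
  node.reject { |key, _value| key == :id }
end

# Plain strings stand in for Parslet slices such as "foo_t"@5.
tree = {
  :type_name    => { :id => "foo_t" },
  :new_line     => "\n",
  :declarations => [],
  :id           => "foo_t"
}

pruned = drop_trailing_id(tree)
p pruned
```

Only the top-level :id is removed; the :id nested under :type_name is untouched, and the original hash is not modified.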
Hi @floere ,
I will definitely use Transform, but I think it would be convenient to have a hide method or something like that for this purpose (it could save a lot of typing).
I put as(:id) deep in there in order to distinguish it from as(:template_instance) from the very beginning, since I would like to add a handy template mechanism to Fortran.
BTW, I have almost achieved my goal with Treetop, but would like to give Parslet a try. :)
You can create your own hide method. First, write a Parslet atom that “forgets” the name assigned by as:
class Anonymized < Parslet::Atoms::Base
  attr_reader :parslet

  def initialize(parslet)
    super()
    @parslet = parslet
  end

  def apply(source, context, consume_all)
    success, value = result = parslet.apply(source, context, consume_all)
    return result unless success
    succ(produce_return_value(value))
  end

  def to_s_inner(prec)
    "hidden(" + parslet.to_s(prec) + ")"
  end

  private

  def produce_return_value(val)
    flatten(val, true).first[1]
  end
end
Anonymized was modeled after Parslet::Atoms::Named, the Parslet atom behind as().
To get a hide helper method, you can either define a plain Ruby method:
def hide(parslet)
  Anonymized.new(parslet)
end
or a Parslet DSL method (like the ones in dsl.rb):
module Parslet::Atoms::DSL
  def hide
    Anonymized.new(self)
  end
end
And here’s how you can use your new hide method in IRB:
>> require 'parslet'
=> true
>> # copy and paste `Anonymized` and `hide` here
>> include Parslet
=> Object
>> a = str('a')
=> 'a'
>> named_a = a.as(:a)
=> a:'a'
>> hidden_a = named_a.hide # using the DSL version of hide
=> hidden(a:'a')
>> a.parse('a')
=> "a"@0
>> named_a.parse('a')
=> {:a=>"a"@0}
>> hidden_a.parse('a')
=> "a"@0
Here’s how you can use Anonymized (through hide) to fix derived_type_declaration in your FortranParser so it works with your example:
rule(:derived_type_declaration) {
  ( keyword('type') >> derived_type_attributes.maybe >> template_or_id.as(:type_name) >> new_line >>
    declarations >>
    tbp_declarations.maybe >>
    keyword('end').as(:end) >> keyword('type') >> template_or_id.hide ).as(:derived_type_declaration)
}
That changes your parse tree for the last RSpec test from
{:derived_type_declaration=>{:type_name=>{:id=>"foo_t"@5}, :new_line=>"\n"@10, :declarations=>[], :end=>"end"@11, :id=>"foo_t"@20}}
to
{:derived_type_declaration=>{:type_name=>{:id=>"foo_t"@5}, :new_line=>"\n"@10, :declarations=>[], :end=>"end"@11}}
hide will hide any name, not just :id; it will also hide :template, which may not be what you want. If you want to hide only :id, you can try extending Anonymized so that it takes a name as a parameter, just like Parslet::Atoms::Named does, and only hides that name. Or perhaps at that point, it would be better to use a transform than a custom atom.
There might be better names than Anonymized and hide; I chose the first names that came to mind.
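A rough sketch of the name-selective variant, written in plain Ruby so that it runs without Parslet: the function below is the kind of logic that could take the place of produce_return_value if Anonymized stored a name. The name unwrap_named and its exact behaviour (unwrap only a single-entry hash keyed by the given name) are assumptions made for illustration, not Parslet API.

```ruby
# Hypothetical helper, not part of Parslet: unwraps `value` only when it
# is a Hash whose single key is `name`; anything else passes through.
def unwrap_named(value, name)
  if value.is_a?(Hash) && value.keys == [name]
    value[name]
  else
    value
  end
end

p unwrap_named({ :id => "foo_t" }, :id)        # the :id wrapper is dropped
p unwrap_named({ :template => "foo_t" }, :id)  # other names are left alone
```

Whether a hash with several keys should lose just the one entry instead is a design choice this sketch does not settle.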
The duplicate key warning is easy to remove - often you need just one more .as() - introducing structure that disambiguates two subhashes.
rule(:id) { ... } # .as(:id) in here
rule(:foo) { id.as(:id1) >> id.as(:id2) }
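To see why the extra .as() helps, it may be useful to look at what happens to two hashes that share a key. Assuming Parslet combines the hashes produced by the parts of a sequence roughly like Hash#merge does, the effect can be shown with ordinary Ruby hashes. The values below are made up for illustration and are plain strings, not real parser output.

```ruby
# Results of two `id` matches, each tagged :id (plain strings stand in
# for Parslet slices; the values are made up for illustration).
first  = { :id => "foo_t" }
second = { :id => "bar_t" }

# Same key on both sides: the later value replaces the earlier one.
clash = first.merge(second)

# Wrapping each result under its own name keeps both.
kept = { :id1 => first }.merge({ :id2 => second })

p clash
p kept
```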
But of course hiding works just as well. If you want to contribute this to parslet (on the odd chance), please rename it to 'ignore' and Ignore.
Closing this issue, since this is also something that is better discussed on the mailing list.
Hi all,
I would like to make some matches not appear in the tree, since they are trivial and may cause duplicate keys. For example, in Fortran we have a derived type:
The second foo_t should be dropped. The present parsed tree is:
The last :id=>"foo_t"@20 is redundant. How could I make it disappear without a duplicate key warning?