Closed chrismwendt closed 4 years ago
I would like to keep parslet minimal - why don't you create a parslet-extensions gem for this?
The 'one way (tm)' for this that parslet already has is by transforming the output tree, not mapping during the parse phase. This way, you'll map only what is relevant, instead of every succeeded partial tree.
I gave this some thought, and in short, I think `map` should replace `Transform` because `map` is composable and `Transform` is not.
Imagine you have 2 parsers, `date` and `time`, along with their corresponding `date_transform` and `time_transform` `Transform`s for turning them into `Date` and `Time` objects, respectively:
date = digit.repeat(4).as(:year) >> str('-') >> ...
time = digit.repeat(2).as(:hour) >> str(':') >> ...
date_transform = Parslet::Transform.new do
  rule({ :year => simple(:year), :month => ... }) { { :date => Date.new(year, month, day) } }
end
time_transform = Parslet::Transform.new do
  rule({ :hour => simple(:hour), :minute => ... }) { { :time => Time.new(hour, minute, second) } }
end
Consider how you would construct a new `date_time` parser and the corresponding `Transform` for turning them into `DateTime` objects. I can think of 2 options:
Option 1: you could write a new `Transform` with 3 rules and, in doing so, duplicate the implementations of `date_transform` and `time_transform`:
date_time = (date >> str(' ') >> time).as(:date_time)
date_time_transform = Parslet::Transform.new do
  rule({ :year => ... }) { ... } # same as before
  rule({ :hour => ... }) { ... } # same as before
  rule({ :date_time => { :date => subtree(:date), :time => subtree(:time) } }) do
    DateTime.new(date.year, date.month, date.day, time.hour, time.min, time.sec)
  end
end
Option 2: or you could take a hybrid approach and apply `date_transform` and `time_transform` in sequence, followed by a third transform to construct the `DateTime` objects:
date_time = date >> str(' ') >> time
date_time_transform = lambda do |ast|
  # Apply the two existing transforms first, then build the DateTime.
  temp = date_transform.apply(time_transform.apply(ast))
  Parslet::Transform.new do
    rule({ :date_time => { :date => subtree(:date), :time => subtree(:time) } }) do
      DateTime.new(date.year, date.month, date.day, time.hour, time.min, time.sec)
    end
  end.apply(temp)
end
Both of these approaches are quite ugly and hacky. Now consider how you would do this with `map`:
date = (digit.repeat(4).as(:year) >> str('-') >> ...).map(lambda { |ast| Date.new(ast[:year], ...) })
time = (digit.repeat(2).as(:hour) >> str(':') >> ...).map(lambda { |ast| Time.new(ast[:hour], ...) })
date_time = (date.as(:date) >> str(' ') >> time.as(:time)).map(lambda { |ast| DateTime.new(ast[:date].year, ...) })
Done! :tada:
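To make the composition argument concrete, here is a minimal standalone sketch in plain Ruby. It does not use Parslet at all: the `P` class, its `match`, `seq`, and `map` methods, and the `TimeOfDay` struct are all hypothetical names invented for this illustration, and the real PR's `map` may behave differently. The point is only that each parser carries its own mapping, so `date_time` reuses `date` and `time` without repeating their conversion logic.

```ruby
require 'date'

# Hypothetical mini-combinator (NOT Parslet's API). A parser wraps a block
# taking (input, pos) and returning [value, new_pos], or nil on failure.
class P
  def initialize(&fn)
    @fn = fn
  end

  def call(input, pos = 0)
    @fn.call(input, pos)
  end

  # Match a regular expression anchored at the current position.
  def self.match(regex)
    anchored = /\G(?:#{regex.source})/
    new do |input, pos|
      m = anchored.match(input, pos)
      m && [m[0], pos + m[0].length]
    end
  end

  # Run the parsers one after another and collect their values in an array.
  def self.seq(*parsers)
    new do |input, pos|
      values = []
      ok = parsers.all? do |parser|
        result = parser.call(input, pos)
        next false if result.nil?
        value, pos = result
        values << value
        true
      end
      ok ? [values, pos] : nil
    end
  end

  # Return a new parser whose value is the block applied to this parser's value.
  def map(&f)
    P.new do |input, pos|
      result = call(input, pos)
      result && [f.call(result[0]), result[1]]
    end
  end
end

# Illustrative stand-in for a time-of-day value.
TimeOfDay = Struct.new(:hour, :min, :sec)

int = ->(n) { P.match(/\d{#{n}}/).map { |s| Integer(s, 10) } }
lit = ->(s) { P.match(Regexp.new(Regexp.escape(s))) }

date = P.seq(int[4], lit['-'], int[2], lit['-'], int[2])
        .map { |y, _, m, _, d| Date.new(y, m, d) }
time = P.seq(int[2], lit[':'], int[2], lit[':'], int[2])
        .map { |h, _, mi, _, s| TimeOfDay.new(h, mi, s) }

# date_time composes the two mapped parsers without duplicating their logic.
date_time = P.seq(date, lit[' '], time)
             .map { |d, _, t| DateTime.new(d.year, d.month, d.day, t.hour, t.min, t.sec) }

value, _rest = date_time.call('2015-03-14 09:26:53')
```

Note that in this sketch every successful sub-parse is mapped immediately, including ones a surrounding alternative might later discard.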
Admittedly, switching from `Transform` to `map` would be a foundational change to Parslet. I'm actually not in dire need of this - I just wanted to show you what it might look like in case it sparks some interest :smile:
Point taken - transforms do not currently compose well. They could (based on internal constructions) though, but that's another PR ;)
Here's what's bothering me with mapping over internal tree values directly:
So I think this approach is doomed for 'core' ;)
How about attaching map blocks to result values that can then be executed once we know what values end up in the 'real' result? A kind of delayed map? I would be favorable to merging an alternative to transformers into parslet, just to give people options - provided safety doesn't suffer.
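One way to picture the delayed-map idea, as a rough sketch only: the `Deferred` class and its `force` and `finalize` methods below are hypothetical names made up for this illustration and are not part of Parslet. Each value carries its block along, and the blocks run only for values that are still present in the final tree.

```ruby
# Hypothetical wrapper (not part of Parslet): pairs a raw parse value with
# a block that is run later, once the final result tree is known.
class Deferred
  def initialize(raw, &block)
    @raw = raw
    @block = block
  end

  # Finalize any nested deferred values first, then apply this block.
  def force
    @block.call(Deferred.finalize(@raw))
  end

  # Walk a result tree and run the blocks of every Deferred still in it.
  def self.finalize(tree)
    case tree
    when Deferred then tree.force
    when Hash     then tree.transform_values { |v| finalize(v) }
    when Array    then tree.map { |v| finalize(v) }
    else tree
    end
  end
end

calls = []

# Simulates a partial result that the parser later discards (e.g. after
# backtracking); its block should never run.
discarded = Deferred.new('12') { |s| calls << :discarded; Integer(s, 10) }

# Simulates the tree that actually ends up in the final result.
kept = { year: Deferred.new('2015') { |s| calls << :year; Integer(s, 10) } }

result = Deferred.finalize(kept)
```

Here only the block attached to `:year` is executed; the block on the discarded value is never called, which is the property the delayed approach is after.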
Oh and: Thanks for taking an interest and sticking with the discussion. I value your contribution a lot.
Good points about speed and safety, and the "delayed map" idea sounds promising. It reminds me of lazy evaluation in Haskell (from which I'm drawing the inspiration for this mapping ability 😺 ).
It would be slightly inconvenient to make users call some kind of `.finalize()` method in addition to `.parse()` in order to get the final results, but I can't think of how else to do it.
I guess you would need to experiment with the idea to advance this.
Closing this; original author did not pursue further.
This adds the ability to map over parse results. This is especially useful for parsing integers into Ruby `Fixnum`s, or composing integers together to form `DateTime`s.
Unfortunately, the mapping function currently needs to be aware of the internal structure of Parslet's parse tree. That could be fixed, but I imagine it's not easy to do. I'm mostly leaving this here as a proof of concept - I don't expect this to be merged as-is.