Closed michaelmior closed 10 years ago
One way to solve this would be to use a transformer to transform single arguments into an argument list, like so:
require 'parslet'
class Mini < Parslet::Parser
  rule(:argument) { match('[a-z]').repeat.as(:argument) }
  rule(:arglist) { argument >> (str(',') >> argument).repeat }
  rule(:funcall) { arglist.as(:arglist) }
  root(:funcall)
end
multiple = Mini.new.parse("abc,def")
single = Mini.new.parse("abc")
transformer = Parslet::Transform.new do
  rule(:arglist => { :argument => simple(:arg) }) { { :arglist => [{ :argument => arg }] } }
end
p transformer.apply multiple # => {:arglist=>[{:argument=>"abc"@0}, {:argument=>"def"@4}]}
p transformer.apply single # => {:arglist=>[{:argument=>"abc"@0}]}
That's true. Although it also means that any code using that parser needs to apply a particular transformer. For me it feels like a common enough use case to push down to the parser. The exact problem has resulted in very verbose code when I've used other parsing frameworks for Ruby. It would be nice to have a simple solution.
I whipped up something quickly at michaelmior/parslet@fb595e611e824a6db4fc4f54d4f4a62ddd85b29c. It will break if each atom in the value is also an array. I'm sure there's a way to work around this with a better understanding of the unflattened atom structure.
I know this is possible in code, but parslet already offers transformers. I try to keep things down to a bare minimum (or my definition of it), so thanks but no. For everyone, approaches I recommend here are:
a) A transformer, because it ships with parslet
b) Including code like pasted above into your codebase and just enhancing parslet with it
And - the array/singleton hiccup was a design choice. Main reason why it stays in.
@kschiess Fair enough. Although I wasn't suggesting changing existing behaviour, just a simple addition of a few lines to handle this use case which seems fairly common to me. But I respect the minimalism and transformers are also a decent way to handle this :)
Well... I tried @floere's suggestion above but have not been able to get it to work.
I have a parsed element:
{:mc=>{:spid=>"17560"@0}, :bc=>{:spid=>"16699"@7}}
and want to transform each of these single elements into an array.
I have the transform:
rule(:mc => { :spid => simple(:arg) } ) { { :mc => [ { :spid => arg } ] } }
(mc only for testing right now), but all I get back is:
{:mc=>{:spid=>"17560"@0}, :bc=>{:spid=>"16699"@7}}
@JESii Transforms have to match the entire subtree. Your rule only matches the key :mc, but the parse tree also has the key :bc. You would need to have your rule match a parse tree with both of those keys. You could then turn both into an array at the same time.
Thanks, but I'm still having problems:
The parse tree looks like this (same as before):
{:mc=>{:spid=>"17560"@0}, :bc=>{:spid=>"16699"@7}}
So I copied that parse tree, transmogrified it into a rule and just tried to get the transform rule to match:
rule({:mc=>{:spid=>simple(:argm)}, :bc=>{:spid=>simple(:argb)}}) { 'abc' }
Unfortunately, that still doesn't match and just passes through un-transformed.
And even more unfortunately, I will now have to have more transform rules, as I can get parse trees where :mc is already an array or :bc is already an array; like this...
{:mc=>{:spid=>"12345"@0}, :bc=>[{:pid=>"17560"@8}, {:pid=>"17899"@15}]}
It's the nature of the input I'm dealing with -- I want to get an array back out for both :mc and :bc, even if they have only one element.
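One way to get that result outside of parslet is to post-process the tree in plain Ruby. The following is only a minimal sketch under stated assumptions: `listify` is a hypothetical helper name, not a parslet API, and plain strings stand in for the `Parslet::Slice` values shown above.

```ruby
# Hypothetical helper (not part of parslet): returns a copy of the hash in
# which the values stored under the given keys are always arrays.
def listify(tree, *keys)
  tree.each_with_object({}) do |(key, value), out|
    out[key] =
      if keys.include?(key) && !value.is_a?(Array)
        [value]   # a single element: wrap it
      else
        value     # already an array, or a key we were not asked to touch
      end
  end
end

# Plain strings stand in for the Parslet::Slice values of a real parse tree.
tree = { mc: { spid: "12345" }, bc: [{ pid: "17560" }, { pid: "17899" }] }

normalized = listify(tree, :mc, :bc)
puts normalized[:mc].length   # => 1
puts normalized[:bc].length   # => 2
```

The helper only looks at the top-level keys it is given, so it does not need to know whether the parser produced one element or many for them.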
OK; figured it out...
I had already identified 'single' entries versus 'multiple' entries with the ":spid" versus ":pid" name. So, all that was required were the two rules:
rule(:pid => simple(:x)) { Integer(x) }
rule(:spid => simple(:x)) { [ Integer(x) ] }
It would be nice if it were possible to tell parslet that I always want something to be an array, even if there's only one element. The two rules below are extracted from the getting started example.
If arglist contains multiple expressions, then arglist parsed in the context of funcall will be an array. However, if it only contains one expression, then it will be a hash. This means additional logic to check whether one or many things were parsed.
It would be nice if this could be handled a bit more automatically. One possibility is something like
This would check if parsing arglist results in only a single element and then wrap it in an array. It also keeps backwards compatibility. I think I'll probably start working on a patch, but let me know if this is something you would be interested in accepting or if there's a better way to do the same thing.
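For comparison, the caller-side check described above can be kept to a one-line helper. This is a sketch in plain Ruby, not parslet code: `wrap_list` is a hypothetical name, and plain hashes stand in for what a parser would return. It also shows why `Kernel#Array` is not a drop-in substitute here, since it converts a hash into key/value pairs.

```ruby
# Hypothetical helper (not part of parslet): wrap a lone element in an array.
def wrap_list(value)
  value.is_a?(Array) ? value : [value]
end

# Plain hashes stand in for the result of parsing one or many arguments.
single   = { arglist: { argument: "abc" } }
multiple = { arglist: [{ argument: "abc" }, { argument: "def" }] }

puts wrap_list(single[:arglist]).length     # => 1
puts wrap_list(multiple[:arglist]).length   # => 2

# Kernel#Array calls to_a on a hash, which yields key/value pairs instead.
puts Array(single[:arglist]) == [[:argument, "abc"]]   # => true
```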