ManageIQ / floe

Floe is a runner for Amazon States Language workflows
Apache License 2.0

IntrinsicFunction foundation + States.Array, States.UUID #194

Closed Fryguy closed 2 months ago

Fryguy commented 3 months ago

This is the start of a PR for implementing Intrinsic Functions (re: #64) parsing with a parser generator.

Intrinsic functions are effectively raw Strings, and thus are their own "language" within that string. As such, we need a proper parser. This PR uses the parslet gem, which is a simple pure-Ruby parser generator. Other choices we could use are rexical+racc, lrama, ragel, and Treetop. However, I chose parslet because it is simple to use, very fast, has no dependencies, and is pure Ruby. I figure we can always change later once the parse rules are written, because conceptually they shouldn't change between gems (aside from syntax, of course), and the test suite is really the important part and should stay the same through any rewrites.

So far I've only implemented States.UUID for something super-simple, and States.Array for something relatively complex. I will add the rest in a follow-up.


A quick primer on parslet. (docs)

Parslet breaks the parsing routine into two parts. The first part is the Parslet::Parser, where you define rules in Parsing Expression Grammar style. The output of the parser is a tree. However, what's cool with parslet is that within the parser you annotate what you want to appear in the tree, so you can pull out the important semantic stuff and ignore the syntax stuff. This is done using the .as(symbol) function, where the symbol becomes the key in the tree. Here is an example tree output from the parser:

payload = "States.Array('string', 1, 1.5, true, false, null, $.input)"
tree = Floe::Workflow::IntrinsicFunction::Parser.new.parse(payload)
pp tree
# {:states_array=>
#   [{:string=>"string"@14},
#    {:number=>"1"@23},
#    {:number=>"1.5"@26},
#    {:true_literal=>"true"@31},
#    {:false_literal=>"false"@37},
#    {:null_literal=>"null"@44},
#    {:jsonpath=>"$.input"@50}]}
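
To make the "keep the semantics, drop the syntax" idea concrete without depending on the gem, here is a hand-rolled stand-in built on Ruby's stdlib StringScanner. It is not parslet and not the grammar in this PR: `LABELLED_PATTERNS` and `parse_states_array` are hypothetical names, the patterns are simplified (no escaped quotes, no nested functions), and the result holds plain strings rather than parslet slices with offsets. It only shows how labelling each match, in the spirit of .as(symbol), yields a tree that contains values but no commas, quotes, or parentheses.

```ruby
require "strscan"

# Hand-rolled stand-in (NOT parslet): each pattern is labelled with the key
# that should appear in the tree, much like `.as(:key)` in a parslet rule.
LABELLED_PATTERNS = {
  :string        => /'([^']*)'/,
  :number        => /-?\d+(?:\.\d+)?/,
  :true_literal  => /true/,
  :false_literal => /false/,
  :null_literal  => /null/,
  :jsonpath      => /\$[\w.]*/
}.freeze

# Hypothetical helper: parses only States.Array(...) with flat arguments.
def parse_states_array(payload)
  scanner = StringScanner.new(payload)
  raise ArgumentError, "expected States.Array(" unless scanner.scan(/States\.Array\(\s*/)

  args = []
  until scanner.scan(/\s*\)/)
    # Commas are syntax: they are consumed here and never stored in the tree.
    if !args.empty? && !scanner.scan(/\s*,\s*/)
      raise ArgumentError, "expected ',' at offset #{scanner.pos}"
    end

    # The first pattern that matches at the current position decides the label.
    key, = LABELLED_PATTERNS.find { |_key, pattern| scanner.scan(pattern) }
    raise ArgumentError, "unexpected input at offset #{scanner.pos}" unless key

    # Keep the capture group when there is one (drops the quotes),
    # otherwise keep the whole match.
    args << {key => scanner[1] || scanner.matched}
  end
  raise ArgumentError, "trailing input at offset #{scanner.pos}" unless scanner.eos?

  {:states_array => args}
end

pp parse_states_array("States.Array('string', 1, 1.5, true, false, null, $.input)")
```

The tree uses the same keys as the parslet output above, so the same kind of transform step could consume it.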

The second part is the Parslet::Transform, where you define rules on how to convert that tree into something usable. It does so in a depth-first pattern-matching style, so you define transform rules based on the key names above, and then tell it how to process the values. Here is an example of feeding the tree into the transformer:

Floe::Workflow::IntrinsicFunction::Transformer.new.apply(tree, :input => {"input" => {"foo" => "bar"}})
# => ["string", 1, 1.5, true, false, nil, {"foo"=>"bar"}]
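
As a rough picture of what the depth-first transform does, the sketch below walks a tree of the same shape by hand. It is not the Transformer from this PR and does not use Parslet::Transform: `transform_node` is a hypothetical name, the tree holds plain strings instead of parslet slices, and the jsonpath handling is an assumption that only supports simple `$.a.b` lookups.

```ruby
# Hypothetical stand-in for a Parslet::Transform: visit children first,
# then convert each labelled leaf into a plain Ruby value.
def transform_node(node, input)
  case node
  when Array
    node.map { |child| transform_node(child, input) }
  when Hash
    key, value = node.first
    case key
    when :states_array  then transform_node(value, input)
    when :string        then value.to_s
    when :number        then value.to_s.include?(".") ? Float(value.to_s) : Integer(value.to_s)
    when :true_literal  then true
    when :false_literal then false
    when :null_literal  then nil
    when :jsonpath
      # Assumption: only dotted keys are supported, e.g. "$.input" or "$.a.b".
      path = value.to_s.delete_prefix("$").split(".").reject(&:empty?)
      path.empty? ? input : input.dig(*path)
    else
      raise ArgumentError, "unknown node: #{key.inspect}"
    end
  else
    node
  end
end

tree = {:states_array => [
  {:string => "string"}, {:number => "1"}, {:number => "1.5"},
  {:true_literal => "true"}, {:false_literal => "false"},
  {:null_literal => "null"}, {:jsonpath => "$.input"}
]}

pp transform_node(tree, {"input" => {"foo" => "bar"}})
```

Keying the conversion on the labels is what lets the grammar and the value conversion evolve separately.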

In this PR, I just put those together into a simple Floe::Workflow::IntrinsicFunction.evaluate method.

agrare commented 2 months ago

Okay, I think all that is left is whether you want to use Floe.logger for debugging the tree, or drop it for now.

Fryguy commented 2 months ago

@agrare Updated to use Floe.logger. Example:

$ bin/console
irb(main):001:0> Floe.logger.level = 0
=> 0
irb(main):002:0> Floe::Workflow::IntrinsicFunction.evaluate("States.Array('foo', 1, 1.0, true, false, null, $.input)", {}, {"input" => {"foo" => "bar"}})
D, [2024-07-02T10:59:07.361965 #55278] DEBUG -- : Parsed intrinsic function: "States.Array('foo', 1, 1.0, true, false, null, $.input)" => {:states_array=>[{:string=>"foo"@14}, {:number=>"1"@20}, {:number=>"1.0"@23}, {:true_literal=>"true"@28}, {:false_literal=>"false"@34}, {:null_literal=>"null"@41}, {:jsonpath=>"$.input"@47}]}
=> ["foo", 1, 1.0, true, false, nil, {"foo"=>"bar"}]

miq-bot commented 2 months ago

Checked commits https://github.com/Fryguy/floe/compare/7b34b5a5dc3e11ecda0b63218231c63c3fb074ee~...3434c604ef66bdc86147591d31ab237b91267163 with ruby 3.1.5, rubocop 1.56.3, haml-lint 0.51.0, and yamllint

6 files checked, 0 offenses detected
Everything looks fine. :trophy:

Fryguy commented 2 months ago

@agrare One other thing I noticed is that in a failed parse, it will raise a Parslet::ParseFailed error. Do you want me to trap and log that and/or wrap in a Floe-specific exception? Or can I wait for a separate PR for that?

agrare commented 2 months ago

Oh yeah, if it is an invalid payload definition we can raise a Floe::InvalidWorkflowError. If it is an invalid arg or something like that, we can return a States.IntrinsicFailure error payload.

kbrock commented 2 months ago

Is there a way to ask if this is a valid string at initial parsing time?

agrare commented 2 months ago

Yeah we were just discussing how ideally we'd separate "is the payload definition valid" vs "was it a runtime / argument error"

Fryguy commented 2 months ago

> Is there a way to ask if this is a valid string at initial parsing time?

It depends how complicated we want to make this, but the parser and the transformer are separated, so that could be where the logical syntax/semantics boundary lives. In my mind that's where the separation of Floe::InvalidWorkflowError and States.IntrinsicFailure would be.
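
A minimal sketch of that boundary, assuming the parse step and the transform step are called separately. Everything here is a stand-in so it runs without floe or parslet loaded: `InvalidWorkflowError` plays the role of Floe::InvalidWorkflowError, `SyntaxFailure` plays the role of Parslet::ParseFailed, `ArgumentFailure` and `evaluate_intrinsic` are hypothetical, and the "Error"/"Cause" hash is an assumed shape for the States.IntrinsicFailure payload.

```ruby
# Stand-in error classes (assumptions, see the lead-in above).
class InvalidWorkflowError < StandardError; end
class SyntaxFailure < StandardError; end
class ArgumentFailure < StandardError; end

# Hypothetical evaluate: parser and transformer are any callables.
def evaluate_intrinsic(payload, parser:, transformer:)
  # Phase 1: syntax. A failure here means the workflow definition is invalid.
  tree =
    begin
      parser.call(payload)
    rescue SyntaxFailure => err
      raise InvalidWorkflowError, "invalid intrinsic function #{payload.inspect}: #{err.message}"
    end

  # Phase 2: semantics. A failure here is a runtime error payload, not an exception.
  begin
    {:output => transformer.call(tree)}
  rescue ArgumentFailure => err
    {:error => {"Error" => "States.IntrinsicFailure", "Cause" => err.message}}
  end
end
```

Because phase 1 needs no input data, it could in principle run once when the workflow is loaded, which is one way to answer the "valid at initial parsing time" question.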

However, a good example of where the lines get blurred is that States.UUID is defined to not take any parameters, so if someone passes States.UUID(1), is that a syntax error or a semantic error? Right now it's a syntax error, but if we want it to be a semantic error, we'd have to change the syntax definition so that it can take parameters, and then fail later with a semantic error if any parameters are given. 🤷
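
One way to picture the semantic-error option: the grammar accepts any argument list, and a later check rejects the wrong count. This is only a sketch of that alternative, not what the PR does. `IntrinsicArgumentError`, `ALLOWED_ARITY`, and `check_arity!` are hypothetical names, and the States.Array range is an assumption added for contrast.

```ruby
# Hypothetical error type standing in for a States.IntrinsicFailure result.
class IntrinsicArgumentError < StandardError; end

# Allowed argument counts per function. Only the States.UUID entry comes from
# the discussion; the States.Array range is an assumption for illustration.
ALLOWED_ARITY = {
  "States.UUID"  => 0..0,
  "States.Array" => 0..Float::INFINITY
}.freeze

# Runs after a permissive parse: the syntax allowed arguments, the semantics may not.
def check_arity!(function, args)
  allowed = ALLOWED_ARITY.fetch(function) do
    raise IntrinsicArgumentError, "unknown intrinsic function #{function}"
  end
  return args if allowed.cover?(args.size)

  raise IntrinsicArgumentError, "#{function} does not accept #{args.size} argument(s)"
end

check_arity!("States.Array", ["foo", 1]) # passes and returns the args
begin
  check_arity!("States.UUID", [1])
rescue IntrinsicArgumentError => err
  puts err.message
end
```

The trade-off is that States.UUID(1) would then pass an up-front syntax validation and only fail when the state actually runs.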