jcoglan / canopy

A parser compiler for Java, JavaScript, Python, Ruby
http://canopy.jcoglan.com
Mozilla Public License 2.0

Allow Java actions to throw ParseError? #59

Open kristapsdz-saic opened 2 years ago

kristapsdz-saic commented 2 years ago

There's currently no way for Java actions to throw an error to signal failure, which has been problematic in a larger project I'm implementing with this great piece of software.

I've worked around this by patching the action invocations to throw. (Edited for the newest version.)

```diff
diff --git a/node_modules/canopy/lib/builders/java.js b/node_modules/canopy/lib/builders/java.js
index 9a446c4..bc32f0c 100644
--- a/node_modules/canopy/lib/builders/java.js
+++ b/node_modules/canopy/lib/builders/java.js
@@ -132,7 +132,7 @@ class Builder extends Base {

   method_ (name, args, block) {
     this._newline()
-    this._line('TreeNode ' + name + '() {', false)
+    this._line('TreeNode ' + name + '() throws ParseError {', false)
     this._scope(block)
     this._line('}', false)
   }
diff --git a/node_modules/canopy/templates/java/Actions.java b/node_modules/canopy/templates/java/Actions.java
index 74fd824..1f9aa07 100644
--- a/node_modules/canopy/templates/java/Actions.java
+++ b/node_modules/canopy/templates/java/Actions.java
@@ -2,6 +2,6 @@ import java.util.List;

 public interface Actions {
 {{#each actions}}
-    public TreeNode {{this}}(String input, int start, int end, List<TreeNode> elements);
+    public TreeNode {{this}}(String input, int start, int end, List<TreeNode> elements) throws ParseError;
 {{/each}}
 }
```

This has worked very well: the actions throw nested errors, and the top level wraps them in a ParseError to pass to the caller.

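Since existing implementations without the `throws` clause are more specific than the interface, this will not affect existing code: Java allows an implementing method to declare fewer checked exceptions than the interface method it overrides.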
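To illustrate the compatibility claim, here is a minimal sketch. The `TreeNode`, `ParseError`, and `Actions` types below are simplified stand-ins I wrote for the generated ones, and `makeNumber`, `LegacyActions`, and `StrictActions` are hypothetical names, not anything Canopy produces. It assumes `ParseError` is a checked exception.

```java
import java.util.Collections;
import java.util.List;

// Simplified stand-ins for the generated types (assumptions, not Canopy's real classes).
public class ThrowsCompat {
    public static class TreeNode {
        public final String text;
        public TreeNode(String text) { this.text = text; }
    }

    // Assumed to be a checked exception, as in the patch above.
    public static class ParseError extends Exception {
        public ParseError(String message, Throwable cause) { super(message, cause); }
    }

    // Interface shape after the patch: every action may throw ParseError.
    public interface Actions {
        TreeNode makeNumber(String input, int start, int end, List<TreeNode> elements) throws ParseError;
    }

    // Existing-style implementation with no throws clause. It still compiles,
    // because an override may declare fewer checked exceptions.
    public static class LegacyActions implements Actions {
        @Override
        public TreeNode makeNumber(String input, int start, int end, List<TreeNode> elements) {
            return new TreeNode(input.substring(start, end));
        }
    }

    // New-style implementation that wraps a nested error in ParseError.
    public static class StrictActions implements Actions {
        @Override
        public TreeNode makeNumber(String input, int start, int end, List<TreeNode> elements) throws ParseError {
            String text = input.substring(start, end);
            try {
                Integer.parseInt(text);
            } catch (NumberFormatException e) {
                throw new ParseError("number out of range: " + text, e);
            }
            return new TreeNode(text);
        }
    }

    // Calls the action through the interface, as generated parser code would.
    public static String run(Actions actions, String input) {
        try {
            return "ok: " + actions.makeNumber(input, 0, input.length(), Collections.emptyList()).text;
        } catch (ParseError e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(new LegacyActions(), "42"));
        System.out.println(run(new StrictActions(), "99999999999"));
    }
}
```

Only callers that invoke actions through the `Actions` interface type need to handle `ParseError`; in this setup that is the generated parser itself.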

If you'd like any other changes to merge a throwable into the mainline, please let me know what I can do!

jcoglan commented 2 years ago

This is a deliberate design decision. The definition of what is valid syntax is assumed to lie entirely in the grammar definition, and action functions are only responsible for building nodes based on that assumption.

If an action function throws an exception, that will make the entire parse fail, even if there are other choice branches the parser could have tried. Preventing this would require action functions to return a special value to indicate failure, whereas we currently assume all values, including null, are valid action values.
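A rough sketch of that failure mode, using a hand-written two-branch ordered choice rather than Canopy's actual generated code. The `FAILURE` sentinel, the `branch` helper, and the regex-based matching are all assumptions made up for illustration.

```java
public class ChoiceAbort {
    public static class ParseError extends Exception {
        public ParseError(String message) { super(message); }
    }

    // Hypothetical action shape: builds a value from matched text, may throw.
    public interface Action {
        Object apply(String text) throws ParseError;
    }

    // Internal marker meaning "the syntax did not match". Action results are
    // never compared against anything else, so null stays a valid action value.
    private static final Object FAILURE = new Object();

    // Runs the action only if the input matches the branch's syntax.
    private static Object branch(String regex, String input, Action action) throws ParseError {
        if (!input.matches(regex)) {
            return FAILURE;
        }
        return action.apply(input);
    }

    // Ordered choice: try digits first, then fall back to an alphanumeric word.
    public static Object parse(String input, Action number, Action word) throws ParseError {
        Object result = branch("[0-9]+", input, number);
        if (result != FAILURE) {
            return result;
        }
        return branch("[0-9a-z]+", input, word);
    }

    public static String describe(String input) {
        // This action rejects inputs that the grammar itself accepts.
        Action picky = text -> {
            if (text.length() > 2) {
                throw new ParseError("too long: " + text);
            }
            return "number:" + text;
        };
        Action word = text -> "word:" + text;
        try {
            return String.valueOf(parse(input, picky, word));
        } catch (ParseError e) {
            return "parse failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("12"));
        System.out.println(describe("abc"));
        System.out.println(describe("123"));
    }
}
```

The input `123` also matches the second branch, but the exception thrown from the first branch's action propagates out of `parse` before that branch is tried. Falling back instead would need the action to return something the parser could recognise as a rejection, which conflicts with treating every return value as valid.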

We _are_ currently considering having a distinct mechanism for predicates, where users can inject code to decide whether the input is valid or not, but that would be a separate interface from action functions.