elixir-lang / tree-sitter-elixir

Elixir grammar for tree-sitter
https://elixir-lang.org/tree-sitter-elixir
Apache License 2.0
249 stars 25 forks source link

parse nullary range operator #25

Closed the-mikedavis closed 2 years ago

the-mikedavis commented 2 years ago

connects https://github.com/elixir-lang/elixir/pull/11623

I think it makes sense for this to be a new node. Maybe it should be range_operator instead if we expect no other new nullary operators in the future?

josevalim commented 2 years ago

For now we have only a single nullary opearator and it is an existing operator, so you can take special provisions for it. Especially it doesn't need to be handled on .. + / because the range operator already is!

the-mikedavis commented 2 years ago

Sounds good :+1:

So then I think it makes sense to re-use operator_identifier and treat it as an expression. I added a note suggesting to refactor to 'nullary_operator' if there are more added in the future.

That conflict from above becomes... ``` Unresolved conflict for symbol sequence: '..' • '/' … Possible interpretations: 1: (_expression '..') • '/' … 2: (operator_identifier '..') • '/' … Possible resolutions: 1: Specify a higher precedence in `_expression` than in the other rules. 2: Specify a higher precedence in `operator_identifier` than in the other rules. 3: Add a conflict for these rules: `_expression`, `operator_identifier` ``` if we remove the added `prec/2` within the operator_identifier rule. Keeping that `prec/2`, we give preference to `../` being a capture of `operator_identifier` rather than division on nullary range.
jonatanklosko commented 2 years ago

@the-mikedavis thanks for looking into this so quickly! We need the nullary operator to have the proper precedence. Also, since we already parse it as operator_identifier, this itself should cover the ../2 case as division on expressions, where left expression is operator_identifier. So I think we can omit .. in the operator_identifier rule altogether. Here's my take:

diff --git a/grammar.js b/grammar.js
index 6bd3f61..f91bdbf 100644
--- a/grammar.js
+++ b/grammar.js
@@ -16,6 +16,7 @@ const PREC = {
   XOR_OP: 140,
   TERNARY_OP: 150,
   CONCAT_OPS: 160,
+  RANGE_OP: 160,
   ADD_OPS: 170,
   MULT_OPS: 180,
   POWER_OP: 190,
@@ -33,13 +34,13 @@ const COMP_OPS = ["==", "!=", "=~", "===", "!=="];
 const REL_OPS = ["<", ">", "<=", ">="];
 const ARROW_OPS = ["|>", "<<<", ">>>", "<<~", "~>>", "<~", "~>", "<~>", "<|>"];
 const IN_OPS = ["in", "not in"];
-const CONCAT_OPS = ["++", "--", "+++", "---", "..", "<>"];
+const CONCAT_OPS = ["++", "--", "+++", "---", "<>"];
 const ADD_OPS = ["+", "-"];
 const MULT_OPS = ["*", "/"];
 const UNARY_OPS = ["+", "-", "!", "^", "~~~", "not"];

 const ALL_OPS = [
-  ["->", "when", "::", "|", "=>", "&", "=", "^^^", "//", "**", ".", "@"],
+  ["->", "when", "::", "|", "=>", "&", "=", "^^^", "//", "..", "**", ".", "@"],
   IN_MATCH_OPS,
   OR_OPS,
   AND_OPS,
@@ -213,6 +214,7 @@ module.exports = grammar({
         $.tuple,
         $.bitstring,
         $.map,
+        $._nullary_operator,
         $.unary_operator,
         $.binary_operator,
         $.dot,
@@ -426,6 +428,11 @@ module.exports = grammar({
         )
       ),

+    _nullary_operator: ($) =>
+      // Nullary operators don't have any child nodes, so we reuse the
+      // operator_identifier node
+      alias(prec(PREC.RANGE_OP, ".."), $.operator_identifier),
+
     unary_operator: ($) =>
       choice(
         unaryOp($, prec, PREC.CAPTURE_OP, "&", $._capture_expression),
@@ -480,6 +487,7 @@ module.exports = grammar({
         binaryOp($, prec.left, PREC.XOR_OP, "^^^"),
         binaryOp($, prec.right, PREC.TERNARY_OP, "//"),
         binaryOp($, prec.right, PREC.CONCAT_OPS, choice(...CONCAT_OPS)),
+        binaryOp($, prec.right, PREC.RANGE_OP, ".."),
         binaryOp($, prec.left, PREC.ADD_OPS, choice(...ADD_OPS)),
         binaryOp($, prec.left, PREC.MULT_OPS, choice(...MULT_OPS)),
         binaryOp($, prec.left, PREC.POWER_OP, "**"),
@@ -524,6 +532,10 @@ module.exports = grammar({
         alias($._not_in, "not in"),
         "^^",
         ...CONCAT_OPS,
+        // The range operator has both a binary and a nullary version.
+        // The nullary version is already parsed as operator_identifier,
+        // so it covers this case
+        // ".."
         ...MULT_OPS,
         "**",
         "->",

wdyt?

the-mikedavis commented 2 years ago

Ah I really like that! Let's do that instead