aheber / tree-sitter-sfapex

Tree-sitter implementation for Salesforce's Apex, SOQL, and SOSL
MIT License
70 stars 14 forks source link

locating children nodes in switch_label #51

Closed xixiaofinland closed 1 day ago

xixiaofinland commented 3 days ago

This is not a must-to-improve point, as it's better than "dml_expression" node which I submitted in another thread, the cause is again due to "unannotated_type" and "expressions" nodes mix.

my code is here I'm able to locate the variants, but need to loop through children. It works but diverges from other node handling patterns.

    switch_label: ($) =>
      seq(
        ci("when"),
        choice(
          // SObject type var syntax
          prec.right(
            commaJoined1(seq(optional($._unannotated_type), $.identifier))
          ),
          commaJoined1($.expression),
          ci("else")
        )
      ),
aheber commented 2 days ago

This actually brings up a couple of errors in my parser. I allow for comma-separated SObject pairs (when Account a, Contact c ...) but this isn't allowed. It must be one and only one SObject pair. I also have the SObject type as optional when I believe it should actually be required. (I couldn't get anything to execute in Salesforce that didn't include it) The definition should probably be seq($._unannotated_type, $.identifier)

A switch statement could be either SObject type and variable pair, a sequence of literals, a sequence of enum values, or an else keyword.

You already have handling for the else in the zero-length named children. My parser doesn't make a distinction for the literals and enums and it is actually very liberal in what it allows right now (any expression) which Apex definitely doesn't allow but I'd need to do some testing to trim it down better. (I wasn't worried I couldn't know how an Enum could be referenced so I left it wide open)

The sequence of expression statements is easy to comma join. It sounds like the SObject pair is the trouble. What if we moved that down a level so there was an when_sobject_pair which was defined as seq($._unannotated_type, $.identifier) so instead of two child nodes to inspect you would just have one and you'd need a formatter for the when_sobject_pair node type. If there is a child node and it is that type then format it directly, everything else goes through the general expression formatter until I can take the time to trim that down appropriately.

Thoughts?

xixiaofinland commented 1 day ago

Since the change could be bigger than expected I suggest we leave it as is. This isn't really a hassle as the "identifier" child node can easily differentiate the variants. We can come back in the future in case. What do you say?

aheber commented 1 day ago

I'm not particularly worried about the scope of this one. I built it up yesterday and it wasn't bad. I think this is a good improvement for the parser, not just to help printing.

If you can think of a reason it would harm things then I'll hold off.

This is what it would look like if I committed the change:

    switch_label: ($) =>
      seq(
        ci("when"),
        choice(
          // SObject type var syntax
          $.when_sobject_type,
          commaJoined1($.expression),
          ci("else")
        )
      ),

    when_sobject_type: ($) => seq($._unannotated_type, $.identifier),

This is what it is currently:

    switch_label: ($) =>
      seq(
        ci("when"),
        choice(
          // SObject type var syntax
          prec.right(
            commaJoined1(seq(optional($._unannotated_type), $.identifier))
          ),
          commaJoined1($.expression),
          ci("else")
        )
      ),
xixiaofinland commented 1 day ago

this would be a good improvement for me as well since it has the explicit node $.when_sobject_type, please go ahead :) @aheber

aheber commented 1 day ago

This is published now.

xixiaofinland commented 1 day ago

Is it so $.when_sobject_type is no longer in commaJoined1()? it seems to have changed. @aheber

aheber commented 1 day ago

Correct, that was an error in my configuration. I've tried to have multiple SObjects in a switch statement and it is rejected by Salesforce. I've also remove the optional SObject type.

From my comment above:

This actually brings up a couple of errors in my parser. I allow for comma-separated SObject pairs (when Account a, Contact c ...) but this isn't allowed. It must be one and only one SObject pair. I also have the SObject type as optional when I believe it should actually be required. (I couldn't get anything to execute in Salesforce that didn't include it) The definition should probably be seq($._unannotated_type, $.identifier)

If you've seen functional code that disproves this please let me know. I ran some experiments yesterday and wasn't able find anything valid that include two comma-joined SObject statements or where the SObject type was optional.

xixiaofinland commented 1 day ago

At least all current tests pass. These are rare conditions I don't have good test code coverage either. Time will tell :)

aheber commented 1 day ago

Thanks!