stadelmanma / tree-sitter-fortran

Fortran grammar for tree-sitter
MIT License

Parse variable declarations with dimension size differently #86

Open ZedThree opened 1 year ago

ZedThree commented 1 year ago

Currently variable declarations like the following:

    INTEGER :: l(*)

get parsed as:

    (variable_declaration (intrinsic_type)
      (call_expression (identifier) (argument_list (assumed_size))))

I'm not keen on the call_expression there as I think we should be able to always distinguish between declarations and function calls in this context.

A small tweak to the grammar can give:

    (variable_declaration (intrinsic_type)
      (identifier) (size (argument_list (assumed_size))))

I'm not completely convinced by this -- maybe it would be better to have another node? And/or drop the argument_list node?

    (variable_declaration (intrinsic_type)
      (variable
        (identifier)
        (size (assumed_size))))

stadelmanma commented 1 year ago

I think dropping the call expression makes sense based on your reasoning. I’m less concerned about argument_list, since it’s sort of an accurate description, but your final example output seems the cleanest. I don’t think we need another node?

ZedThree commented 1 year ago

I think the extra node here would actually be useful, so that the (size) node is a child of the variable it belongs to.

For example:

    integer :: a(n), b

gets parsed as:

    (variable_declaration
        (intrinsic_type)
        (identifier)
        (size (identifier))
        (identifier))

Here the variable names and size nodes are at the same level, which I think will make parsing the tree awkward.
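
To illustrate the awkwardness, here is a minimal sketch of what a consumer of the flat tree would have to do. It does not use the tree-sitter API: the `(node_type, text)` pairs and the `pair_names_with_sizes` helper are hypothetical stand-ins, and the rule that a size belongs to the identifier immediately before it is an assumption based on the output above.

```python
# Hypothetical stand-in for the flat parse of `integer :: a(n), b`.
# Each child is a (node_type, text) pair; this is not the tree-sitter API.
flat_children = [
    ("intrinsic_type", "integer"),
    ("identifier", "a"),
    ("size", "(n)"),
    ("identifier", "b"),
]


def pair_names_with_sizes(children):
    """Pair each identifier with the size node that directly follows it, if any."""
    variables = []
    for node_type, text in children:
        if node_type == "identifier":
            # Start a new variable; its size stays None unless a size sibling follows.
            variables.append([text, None])
        elif node_type == "size" and variables:
            # Assumption: a size node belongs to the most recently seen identifier.
            variables[-1][1] = text
    return [tuple(v) for v in variables]


flat_result = pair_names_with_sizes(flat_children)
print(flat_result)
```

The association between a name and its size lives only in sibling order, so every consumer has to re-implement this bookkeeping.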

Instead, I think this would be better:

    (variable_declaration
        (intrinsic_type)
        (variable
            name: (identifier)
            (size (identifier)))
        (variable
            name: (identifier)))

Then it's clearer which size goes with which identifier.
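
As a rough sketch of why the nested shape is easier to consume, the snippet below models the proposed tree with plain Python data. Nothing here is the tree-sitter API: the dict layout, the `size` key, and the `variables_with_sizes` helper are hypothetical, chosen only to mirror the `variable` / `name:` / `size` structure proposed above.

```python
# Hypothetical stand-in for the nested parse of `integer :: a(n), b`.
# Each `variable` node carries its own name and, optionally, its own size.
nested_children = [
    ("intrinsic_type", "integer"),
    ("variable", {"name": "a", "size": "(n)"}),
    ("variable", {"name": "b"}),
]


def variables_with_sizes(children):
    """Read each variable's name and size straight from its own node."""
    return [
        (fields["name"], fields.get("size"))  # size is None when absent
        for node_type, fields in children
        if node_type == "variable"
    ]


nested_result = variables_with_sizes(nested_children)
print(nested_result)
```

Each name and size pair comes from a single node, so no sibling-order bookkeeping is needed.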