Open ekilmer opened 1 year ago
@maxbrunsfeld - if you have a bit of time to spare to look at this, I'd really appreciate it. This is completely baffling to me:
map<string, vector<vector<string>>> v;
produces:
(translation_unit [0, 0] - [1, 0]
(declaration [0, 0] - [0, 38]
type: (template_type [0, 0] - [0, 35]
name: (type_identifier [0, 0] - [0, 3])
arguments: (template_argument_list [0, 3] - [0, 35]
(identifier [0, 4] - [0, 10]) <-- this should be (type_descriptor (type_identifier))
(type_descriptor [0, 12] - [0, 34]
type: (template_type [0, 12] - [0, 34]
name: (type_identifier [0, 12] - [0, 18])
arguments: (template_argument_list [0, 18] - [0, 34]
(type_descriptor [0, 19] - [0, 33]
type: (template_type [0, 19] - [0, 33]
name: (type_identifier [0, 19] - [0, 25])
arguments: (template_argument_list [0, 25] - [0, 33]
(type_descriptor [0, 26] - [0, 32]
type: (type_identifier [0, 26] - [0, 32]))))))))))
declarator: (identifier [0, 36] - [0, 37])))
But, putting the same exact code as a field declaration makes it work:
struct Foo {
map<string, vector<vector<string>>> v;
};
(translation_unit [0, 0] - [3, 0]
(struct_specifier [0, 0] - [2, 1]
name: (type_identifier [0, 7] - [0, 10])
body: (field_declaration_list [0, 11] - [2, 1]
(field_declaration [1, 4] - [1, 42]
type: (template_type [1, 4] - [1, 39]
name: (type_identifier [1, 4] - [1, 7])
arguments: (template_argument_list [1, 7] - [1, 39]
(type_descriptor [1, 8] - [1, 14] <-- now it's correct
type: (type_identifier [1, 8] - [1, 14]))
(type_descriptor [1, 16] - [1, 38]
type: (template_type [1, 16] - [1, 38]
name: (type_identifier [1, 16] - [1, 22])
arguments: (template_argument_list [1, 22] - [1, 38]
(type_descriptor [1, 23] - [1, 37]
type: (template_type [1, 23] - [1, 37]
name: (type_identifier [1, 23] - [1, 29])
arguments: (template_argument_list [1, 29] - [1, 37]
(type_descriptor [1, 30] - [1, 36]
type: (type_identifier [1, 30] - [1, 36]))))))))))
declarator: (field_identifier [1, 40] - [1, 41])))))
The only difference is the grandparent node of the template_argument_list
(field_declaration
vs declaration
). Putting the code inside a function scope where it's still a normal declaration still produces the incorrect identifier
.
Here's the template_type
and template_argument_list
rules:
template_type: $ => seq(
field('name', $._type_identifier),
field('arguments', $.template_argument_list)
),
template_argument_list: $ => seq(
'<',
commaSep(choice(
prec.dynamic(3, $.type_descriptor),
prec.dynamic(2, alias($.type_parameter_pack_expansion, $.parameter_pack_expansion)),
prec.dynamic(1, $._expression)
)),
alias(token(prec(1, '>')), '>')
),
When the template_type
is in a declaration
, it's choosing $._expression
(which ends up as identifier
) instead of type_descriptor
, which also matches and should have higher precedence for the conflict. I produced the graphviz dot debug output for both, but was having a hard time figuring out why the two cases are reducing differently so this is a bit above my ability.
setting _expression
(well now _expression_not_binary
as that's the one w/ conflicts) to have a dynamic precedence of -1 fixes this, but causes issues elsewhere. maybe removing identifiers from the expressions in template argument lists is an option
I have a similar example, although it produces a hard error, not just a misclassified node:
optional<tuple<A<W>,B<X>,C<Y>,D<Z>>> f;
This produces:
(translation_unit [0, 0] - [2, 0]
(expression_statement [0, 0] - [0, 39]
(parameter_pack_expansion [0, 0] - [0, 38]
pattern: (binary_expression [0, 0] - [0, 38]
left: (template_function [0, 0] - [0, 35]
name: (identifier [0, 0] - [0, 8])
arguments: (template_argument_list [0, 8] - [0, 35]
(template_function [0, 9] - [0, 24]
name: (identifier [0, 9] - [0, 14])
arguments: (template_argument_list [0, 14] - [0, 24]
(type_descriptor [0, 15] - [0, 19]
type: (template_type [0, 15] - [0, 19]
name: (type_identifier [0, 15] - [0, 16])
arguments: (template_argument_list [0, 16] - [0, 19]
(type_descriptor [0, 17] - [0, 18]
type: (type_identifier [0, 17] - [0, 18])))))
(binary_expression [0, 20] - [0, 23]
left: (identifier [0, 20] - [0, 21])
right: (identifier [0, 22] - [0, 23]))))
(type_descriptor [0, 25] - [0, 29]
type: (template_type [0, 25] - [0, 29]
name: (type_identifier [0, 25] - [0, 26])
arguments: (template_argument_list [0, 26] - [0, 29]
(type_descriptor [0, 27] - [0, 28]
type: (type_identifier [0, 27] - [0, 28])))))
(template_function [0, 30] - [0, 34]
name: (identifier [0, 30] - [0, 31])
arguments: (template_argument_list [0, 31] - [0, 34]
(type_descriptor [0, 32] - [0, 33]
type: (type_identifier [0, 32] - [0, 33]))))))
right: (identifier [0, 37] - [0, 38])))))
test.hpp 0 ms (MISSING "..." [0, 38] - [0, 38])
It appears to give precedence to a parameter_pack_expansion
, then cannot find the ...
at the end. The level of nesting (optional<tuple<...>>
not just tuple<...>
) and number of template arguments (optional<tuple<A<W>,B<X>,C<Y>,D<Z>>>
not just optional<tuple<A<W>,B<X>,C<Y>>>
is necessary to trigger the error.
The change that @amaanq mentions above (to set the dynamic precedence of _expression
to -1 in the part of the code that @jdrouhard posted) removes the hard error, although now we're back to a misclassified node:
(translation_unit [0, 0] - [2, 0]
(declaration [0, 0] - [0, 39]
type: (template_type [0, 0] - [0, 36]
name: (type_identifier [0, 0] - [0, 8])
arguments: (template_argument_list [0, 8] - [0, 36]
(type_descriptor [0, 9] - [0, 35]
type: (template_type [0, 9] - [0, 35]
name: (type_identifier [0, 9] - [0, 14])
arguments: (template_argument_list [0, 14] - [0, 35]
(template_function [0, 15] - [0, 19] <<<<<<<<********* template_function, should be type_descriptor?
name: (identifier [0, 15] - [0, 16])
arguments: (template_argument_list [0, 16] - [0, 19]
(type_descriptor [0, 17] - [0, 18]
type: (type_identifier [0, 17] - [0, 18]))))
(type_descriptor [0, 20] - [0, 24]
type: (template_type [0, 20] - [0, 24]
name: (type_identifier [0, 20] - [0, 21])
arguments: (template_argument_list [0, 21] - [0, 24]
(type_descriptor [0, 22] - [0, 23]
type: (type_identifier [0, 22] - [0, 23])))))
(type_descriptor [0, 25] - [0, 29]
type: (template_type [0, 25] - [0, 29]
name: (type_identifier [0, 25] - [0, 26])
arguments: (template_argument_list [0, 26] - [0, 29]
(type_descriptor [0, 27] - [0, 28]
type: (type_identifier [0, 27] - [0, 28])))))
(type_descriptor [0, 30] - [0, 34]
type: (template_type [0, 30] - [0, 34]
name: (type_identifier [0, 30] - [0, 31])
arguments: (template_argument_list [0, 31] - [0, 34]
(type_descriptor [0, 32] - [0, 33]
type: (type_identifier [0, 32] - [0, 33]))))))))))
declarator: (identifier [0, 37] - [0, 38])))
Well, this is not an incorrect parse, that could indeed be a template_function
, but it seems inconsistent to prefer template_function
for the first template argument, but type_descriptor
for the others.
To have a shot in the dark, I wonder whether line 155 has something to do with the precedences resolving this way, but have not found an alternative that keeps other tests working:
type_descriptor: (_, original) => prec.right(original),
A further update: setting the precedence of _expression
to 0 rather than -1 produces the same output above (i.e. one of the arguments is template_function
), but now tree-sitter test
passes at least.
This example gives weird/inconsistent results leading me to believe there's an issue with parsing nested templates:
I expect to see all types highlighted, but the last line fails to highlight
string
in the first index tomap
; instead parsing it as anidentifier
when I expecttype_identifier
: