Closed stadelmanma closed 3 years ago
It sounds like the keyword extraction logic might fix this problem, but it doesn't look like it will work here: we don't have any string literals, only regexps, because each keyword has to match upper and lower case for every letter.
UPDATE: It didn't appear to work. Perhaps regexp keyword extraction could be supported, but it isn't yet.
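For context, the rules below call a `caseInsensitive` helper whose definition is not shown in this thread. A minimal sketch of what such a helper might look like follows; the body is an assumption for illustration only, not the grammar's actual implementation. It shows why every keyword ends up as a regexp rather than a string literal.

```javascript
// Hypothetical helper (assumed, not copied from the grammar): expands each
// letter of the keyword into a two-character class so that the resulting
// regexp matches any mix of upper and lower case.
function caseInsensitive(keyword) {
  const source = keyword
    .split('')
    .map((ch) =>
      ch.toLowerCase() !== ch.toUpperCase()
        ? `[${ch.toLowerCase()}${ch.toUpperCase()}]`
        : ch // non-letters such as "\" and "." are kept as-is
    )
    .join('');
  return new RegExp(source);
}

console.log(caseInsensitive('\\.or\\.').source);
```

Because the result is a `RegExp` object rather than a plain string, a feature that only looks at string literals has nothing to work with.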
Relevant rules:
number_literal: $ => token(
choice(
// integer, real with and without exponential notation
/(((\d*\.)?\d+)|(\d+(\.\d*)?))([eEdD][-+]?\d+)?(_[a-zA-Z_]+)?/,
// binary literal
/[bB]'[01]+'/,
/'[01]+'[bB]/,
/[bB]"[01]+"/,
/"[01]+"[bB]/,
// octal literal
/[oO]'[0-7]+'/,
/'[0-7]+'[oO]/,
/[oO]"[0-7]+"/,
/"[0-7]+"[oO]/,
// hexadecimal literal
/[zZ]'[0-9a-fA-F]+'/,
/'[0-9a-fA-F]+'[zZ]/,
/[zZ]"[0-9a-fA-F]+"/,
/"[0-9a-fA-F]+"[zZ]/
)),
logical_expression: $ => {
const table = [
[caseInsensitive('\\.or\\.'), PREC.LOGICAL_OR],
[caseInsensitive('\\.and\\.'), PREC.LOGICAL_AND],
[caseInsensitive('\\.eqv\\.'), PREC.LOGICAL_EQUIV],
[caseInsensitive('\\.neqv\\.'), PREC.LOGICAL_EQUIV]
]
return choice(...table.map(([operator, precedence]) => {
return prec.left(precedence, seq(
field('left', $._expression),
field('operator', operator),
field('right', $._expression)
))
}).concat(
[prec.left(PREC.LOGICAL_NOT, seq(caseInsensitive('\\.not\\.'), $._expression))])
)
},
relational_expression: $ => {
const operators = [
'<',
caseInsensitive('\\.lt\\.'),
'>',
caseInsensitive('\\.gt\\.'),
'<=',
caseInsensitive('\\.le\\.'),
'>=',
caseInsensitive('\\.ge\\.'),
'==',
caseInsensitive('\\.eq\\.'),
'/=',
caseInsensitive('\\.ne\\.')
]
return choice(...operators.map((operator) => {
return prec.left(PREC.RELATIONAL, seq(
field('left', $._expression),
field('operator', operator),
field('right', $._expression)
))
}))
},
_expression: $ => choice(
$.number_literal,
$.complex_literal,
$.string_literal,
$.boolean_literal,
$.array_literal,
$.identifier,
$.derived_type_member_expression,
$.logical_expression,
$.relational_expression,
$.concatenation_expression,
$.math_expression,
$.unary_expression,
$.parenthesized_expression,
$.call_expression
// $.implied_do_loop_expression // https://pages.mtu.edu/~shene/COURSES/cs201/NOTES/chap08/io.html
),
@maxbrunsfeld I can't seem to get this "conflict" fixed. I've copied and pasted the relevant rules above, as well as a code snippet that displays the errors. I was hoping you might have some insights given your much greater experience in this arena.
Changing my main number literal rule to the regexp below fixes the parsing, but it breaks number literals of the form 1. and .1, which are valid in Fortran and are no longer matched:
/\d+(\.\d+)?([eEdD][-+]?\d+)?(_[a-zA-Z_]+)?/
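The trade-off between the two patterns can be checked outside the parser. The sketch below assumes the generated lexer keeps the longest possible match for a token, which is what the reported behaviour suggests; a plain JavaScript regexp instead returns the first alternative that succeeds, so each top-level alternative is tried separately here. The names `original`, `restricted` and `longestMatch` are made up for this illustration.

```javascript
// Shared optional exponent and kind suffix, as in the rules quoted above.
const SUFFIX = '([eEdD][-+]?\\d+)?(_[a-zA-Z_]+)?';

// The two alternatives of the original decimal pattern, tried separately so
// that the longest match can be picked afterwards.
const original = [
  new RegExp('(\\d*\\.)?\\d+' + SUFFIX, 'y'),
  new RegExp('\\d+(\\.\\d*)?' + SUFFIX, 'y'),
];

// The restricted pattern: a decimal point must have digits on both sides.
const restricted = [new RegExp('\\d+(\\.\\d+)?' + SUFFIX, 'y')];

// Returns the longest match anchored at the start of `text`, or null.
function longestMatch(patterns, text) {
  let best = null;
  for (const pattern of patterns) {
    pattern.lastIndex = 0; // sticky flag: match only at position 0
    const m = pattern.exec(text);
    if (m && (best === null || m[0].length > best.length)) best = m[0];
  }
  return best;
}

for (const input of ['1.and.ix', '1.', '.1']) {
  console.log(input, longestMatch(original, input), longestMatch(restricted, input));
}
// prints:
// 1.and.ix 1. 1
// 1. 1. 1
// .1 .1 null
```

The original pattern swallows the dot that belongs to the operator, while the restricted one leaves the dot alone but truncates 1. to 1 and rejects .1 entirely.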
Basically, I need to allow a dangling decimal point only when the literal is not followed by a letter other than [eEdD], which are the letters used for exponential notation.
Since lookahead and lookbehind aren't supported, I think I need to do this via the external scanner.
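The lookahead rule can be prototyped in plain JavaScript before committing to a scanner. This is only a sketch of the decision logic, not scanner code: a real external scanner would be written against the parser's C interface, and the function names here are invented for the example. It also goes one step beyond the rule as stated, by requiring that an [eEdD] after the dot is actually followed by digits, so that 1.eq.2 is not mistaken for an exponent; that refinement is an assumption on my part.

```javascript
const isDigit = (c) => c !== undefined && c >= '0' && c <= '9';
const isLetter = (c) => c !== undefined && /[a-zA-Z]/.test(c);

// Length of a well-formed exponent ([eEdD][-+]?digits) starting at i, else 0.
function exponentLength(text, i) {
  if (!/[eEdD]/.test(text[i] || '')) return 0;
  let j = i + 1;
  if (text[j] === '+' || text[j] === '-') j++;
  const digitsStart = j;
  while (isDigit(text[j])) j++;
  return j > digitsStart ? j - i : 0;
}

// Returns the number literal at the start of `text`, or null if there is none.
function scanNumberLiteral(text) {
  let i = 0;
  while (isDigit(text[i])) i++;
  const hasIntegerPart = i > 0;
  if (text[i] === '.') {
    const next = text[i + 1];
    if (isDigit(next)) {
      i++; // ordinary fractional part
      while (isDigit(text[i])) i++;
    } else if (hasIntegerPart && (!isLetter(next) || exponentLength(text, i + 1) > 0)) {
      i++; // dangling decimal point, e.g. "1." or "1.d0"
    } else if (!hasIntegerPart) {
      return null; // a lone "." starts an operator such as ".and."
    }
  } else if (!hasIntegerPart) {
    return null;
  }
  i += exponentLength(text, i);
  if (text[i] === '_') { // optional kind suffix such as "_dp"
    let j = i + 1;
    while (isLetter(text[j]) || text[j] === '_') j++;
    if (j > i + 1) i = j;
  }
  return text.slice(0, i);
}

console.log(scanNumberLiteral('1.and.ix')); // prints 1
console.log(scanNumberLiteral('1.d0'));     // prints 1.d0
console.log(scanNumberLiteral('.1'));       // prints .1
```

With this logic the dot in 1.and. is left for the operator, while 1., .1 and 1.d0 are still accepted as complete literals.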
This expression does not parse correctly:
if(ix.ge.1.and.ix.le.nx) x = 1
The first number literal is lexed as "1." instead of "1", so the ".and." operator looks like "and.", which isn't a valid token. Even the GitHub highlighting seems to get it wrong.
Demo code:
Expected Output:
Actual Output: