Closed RedNeath closed 2 months ago
Here is a language definition that should allow validation and token splitting of a formula:
formula -> '=' expression
expression -> ({operator} operand {operator} | spechar operand spechar) {expression}
operand -> expression | variable | function | value
function -> function_name '(' function_args ')'
function_args -> expression {',' function_args}
function_name -> 'SUM' | 'RANGE' | 'ABS' | 'AVG' ...
operator -> '+' | '-' | '=' | '%' | ':' | '&' ...
spechar -> '(' | ')' | ''' | '[' | ']' ...
value -> NUMBER | STRING | DATE ...
variable -> STRING (defined in the context)
A recursive descent parser should be used to parse the formula, since this language is defined recursively.
The language should follow this grammar:
grammar ExcelFormulaTest;
formula: '=' expression;
expression:
variable
| value
| function
| '(' expression ')'
| '-' expression
| expression '%'
| expression '^' expression
| expression ('*' | '/') expression
| expression ('+' | '-') expression
| expression '&' expression
| expression comparison expression
;
comparison:
'='
| '<'
| '>'
| '<='
| '>='
| '<>'
;
function:
function_name '(' expression (',' expression)* ')'
;
variable:
'A1'
| 'B1'
| 'C1'
| 'A2'
| 'A3'
;
value:
NUMBER
| BOOLEAN
| STRING
;
function_name:
'ABS'
| 'AVG'
| 'IF'
;
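// On equal-length matches ANTLR picks the lexer rule listed first,
// so BOOLEAN and NUMBER must come before NAME.
BOOLEAN: 'TRUE' | 'FALSE';
NUMBER: [0-9]+ ('.' [0-9]+)?;
NAME: [a-zA-Z0-9_]+;
// Greedy loop, so that a doubled quote ("") stays inside the string.
STRING: '"' ('""' | ~'"')* '"';
// WS already skips \r and \n, so a separate NEWLINE rule would never match.
WS: [ \t\r\n]+ -> skip;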
The formula parser is the piece of code that splits a given formula into tokens and builds the corresponding calculation tree.
Its job is divided into two parts, detailed below.
1 - Token recognition
Based on the lists of all operators and all functions, and on the input context (specifically its identifiers), the formula will be split into tokens using pattern recognition.
If token recognition fails at this stage, the given formula is invalid, and an error must be reported to the user.
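The token recognition step could be sketched as follows. This is only an illustration under assumptions: the `OPERATORS` and `FUNCTIONS` lists, the `tokenize` function, the `FormulaSyntaxError` class and the token kind names are all hypothetical, and identifiers are assumed to be made of letters, digits and underscores.

```python
import re

# Hypothetical lists: the issue says such lists exist, not what they contain.
OPERATORS = ["<=", ">=", "<>", "+", "-", "*", "/", "^", "&", "%", "=", "<", ">"]
FUNCTIONS = ["ABS", "AVG", "IF"]


class FormulaSyntaxError(ValueError):
    """Raised when part of the formula matches no known token pattern."""


def _alternatives(words):
    # Longest first, so that '<=' wins over '<' and 'A10' over 'A1'.
    # '(?!)' never matches; it stands in for an empty word list.
    ordered = sorted(words, key=len, reverse=True)
    return "|".join(re.escape(w) for w in ordered) or r"(?!)"


def tokenize(formula, identifiers):
    """Split a formula into (kind, text) pairs, or raise FormulaSyntaxError."""
    if not formula.startswith("="):
        raise FormulaSyntaxError("a formula must start with '='")

    patterns = [
        ("WS", r"\s+"),
        ("STRING", r'"(?:""|[^"])*"'),
        ("NUMBER", r"\d+(?:\.\d+)?"),
        ("BOOLEAN", r"(?:TRUE|FALSE)\b"),
        # A function name only counts when an opening parenthesis follows.
        ("FUNCTION", rf"(?:{_alternatives(FUNCTIONS)})(?=\s*\()"),
        ("VARIABLE", rf"(?:{_alternatives(identifiers)})\b"),
        ("OPERATOR", _alternatives(OPERATORS)),
        ("PUNCT", r"[(),]"),
    ]
    master = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in patterns))

    tokens = []
    pos = 1  # skip the leading '='
    while pos < len(formula):
        match = master.match(formula, pos)
        if match is None:
            raise FormulaSyntaxError(
                f"unrecognised input at position {pos}: {formula[pos:]!r}"
            )
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, `tokenize("=ABS(A1)+2", ["A1"])` yields a `FUNCTION`, two `PUNCT`, a `VARIABLE`, an `OPERATOR` and a `NUMBER` token, while a formula that uses an identifier missing from the context raises `FormulaSyntaxError`.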
2 - Priority definition
Then, using the priority level of each operator and function, the calculation tree will be built in memory, with each node corresponding to one of the previously parsed tokens.
This step should never fail, as it does not depend on the user input.
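One possible way to build the tree from operator priorities is precedence climbing, a form of recursive descent. The sketch below is an assumption, not the project's implementation: it takes an already validated list of `(kind, text)` tokens (hypothetical kinds `NUMBER`, `STRING`, `BOOLEAN`, `VARIABLE`, `FUNCTION`, `OPERATOR`, `PUNCT`), uses priority values chosen to mirror the order of alternatives in the ANTLR grammar, and represents nodes as nested tuples.

```python
# Binding power of each binary operator (higher binds tighter). The values
# are assumptions that mirror the alternative order of the ANTLR grammar.
BINARY_PRECEDENCE = {
    "=": 1, "<": 1, ">": 1, "<=": 1, ">=": 1, "<>": 1,
    "&": 2,
    "+": 3, "-": 3,
    "*": 4, "/": 4,
    "^": 5,
}
PERCENT_PRECEDENCE = 6  # postfix '%'
NEGATE_PRECEDENCE = 7   # prefix '-'


def build_tree(tokens):
    """Turn a validated list of (kind, text) tokens into a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("END", "")

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_operand():
        kind, text = advance()
        if kind == "FUNCTION":
            advance()  # consume '('
            args = [parse_expression(0)]
            while peek() == ("PUNCT", ","):
                advance()
                args.append(parse_expression(0))
            advance()  # consume ')'
            return ("call", text, args)
        if (kind, text) == ("PUNCT", "("):
            inner = parse_expression(0)
            advance()  # consume ')'
            return inner
        if (kind, text) == ("OPERATOR", "-"):
            return ("neg", parse_expression(NEGATE_PRECEDENCE))
        return (kind.lower(), text)  # NUMBER, STRING, BOOLEAN or VARIABLE

    def parse_expression(min_precedence):
        left = parse_operand()
        while True:
            kind, text = peek()
            if kind != "OPERATOR":
                return left
            if text == "%" and PERCENT_PRECEDENCE >= min_precedence:
                advance()
                left = ("percent", left)
                continue
            precedence = BINARY_PRECEDENCE.get(text, -1)
            if precedence < min_precedence:
                return left
            advance()
            # Left-associative: the right side may only hold tighter operators.
            right = parse_expression(precedence + 1)
            left = (text, left, right)

    return parse_expression(0)
```

With these priorities, the tokens of `1+2*3` produce `("+", ("number", "1"), ("*", ("number", "2"), ("number", "3")))`, so the multiplication sits deeper in the tree and is evaluated first.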