lloyddewit / RInsight

Creates an object model which represents a valid R script.
GNU General Public License v3.0

Token tree potentially incorrect if keywords used within arithmetic expressions #12

Open lloyddewit opened 9 months ago

lloyddewit commented 9 months ago

The test case below passes but the internal token tree that it generates is incorrect.

    100/if(0)10 else 20+30

It should return 2, i.e. 100/(20+30): the condition 0 is false, so R evaluates the else branch, and that branch extends to the end of the statement. However, the token tree structures the expression as (100/20)+30, which would give 35. In practice, this makes no difference in R-Instat because R-Instat does not use the token tree directly; it passes the whole statement to R. It should still be fixed in case the token tree is used for other purposes in the future.
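
To make the difference between the two tree shapes concrete, here is a minimal sketch that evaluates both. The types `Node`, `Num`, `BinOp`, `IfElse`, `Evaluator` and `TreeShapeDemo` are hypothetical and are not part of RInsight. The "reported" shape is an assumption based on the (100/20)+30 description above, and the code assumes C# 9 or later.

```csharp
using System;

// Hypothetical minimal expression tree, for illustration only.
// These types are NOT RInsight's RToken class.
abstract record Node;
record Num(double Value) : Node;
record BinOp(char Op, Node Left, Node Right) : Node;
record IfElse(Node Condition, Node Then, Node Else) : Node;

static class Evaluator
{
    public static double Eval(Node node) => node switch
    {
        Num n => n.Value,
        BinOp { Op: '/' } b => Eval(b.Left) / Eval(b.Right),
        BinOp { Op: '+' } b => Eval(b.Left) + Eval(b.Right),
        // As with if(0) in R, a zero condition selects the else branch.
        IfElse i => Eval(i.Condition) != 0 ? Eval(i.Then) : Eval(i.Else),
        _ => throw new ArgumentException("Unsupported node")
    };
}

static class TreeShapeDemo
{
    // Intended shape: 100 / (if (0) 10 else (20 + 30))
    public static Node BuildIntended() =>
        new BinOp('/', new Num(100),
            new IfElse(new Num(0), new Num(10),
                new BinOp('+', new Num(20), new Num(30))));

    // Shape described in this issue (assumed): (100 / (if (0) 10 else 20)) + 30
    public static Node BuildReported() =>
        new BinOp('+',
            new BinOp('/', new Num(100),
                new IfElse(new Num(0), new Num(10), new Num(20))),
            new Num(30));

    static void Main()
    {
        Console.WriteLine(Evaluator.Eval(BuildIntended())); // prints 2
        Console.WriteLine(Evaluator.Eval(BuildReported())); // prints 35
    }
}
```

Both trees are built from the same lexemes in the same order. They differ only in which node owns the +, so a test that checks only the regenerated script text would not be expected to tell them apart.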

The problem arises because, when RInsight processes the /, it needs to make everything to the right (up to the end of the if statement) the right operand of the /. It relies on the tree structure to do this. But the + is processed after the /, so the tree structure is not yet complete and RInsight does not recognise the + as part of the else branch. It therefore does not include the + in the if statement.

The solution is to first identify all the ends of statements before building the token tree.

lloyddewit commented 8 months ago

Note that the lines below were removed from the RToken constructor in #13. These checks do not account for keywords. We may need to add similar checks again before we start to build the token tree. As we add tokens to the flat list, we may need to check whether the tokens so far already make a valid statement.

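            // A newline does not end the statement if the statement is still empty,
            // brackets are open, or the previous lexeme is an operator other than "~".
            if (!statementContainsElement
                || statementHasOpenBrackets
                || lexemePrev.IsOperatorUserDefined
                || (lexemePrev.IsOperatorReserved && lexemePrev.Text != "~"))
            {
                TokenType = TokenTypes.RNewLine;
            }
            else
            {
                TokenType = TokenTypes.REndStatement;
            }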
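
One possible shape for such a pre-pass is sketched below. It works on a flat list of lexemes and marks which newlines end a top-level statement before any tree is built. It mirrors the removed checks and adds a keyword check. Everything here is an assumption for illustration: `StatementEndPass`, `FindStatementEnds`, the use of plain strings as lexemes, and the deliberately small operator and keyword sets are not RInsight code. It treats everything inside any bracket as part of the enclosing statement, and it assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;

static class StatementEndPass
{
    // Simplified, hypothetical lexeme classes; real R has more operators and keywords.
    static readonly HashSet<string> ReservedOperators =
        new() { "+", "-", "*", "/", "^", "<-", "=", "~", ":" };
    static readonly HashSet<string> ConditionKeywords = new() { "if", "while", "for" };

    static bool IsUserDefinedOperator(string lexeme) =>
        lexeme.Length >= 2 && lexeme[0] == '%' && lexeme[^1] == '%';

    // Returns the indices of the newline lexemes that end a top-level statement.
    public static List<int> FindStatementEnds(IReadOnlyList<string> lexemes)
    {
        var ends = new List<int>();
        // One entry per open bracket; true if it opened a keyword condition, e.g. if (...).
        var openBrackets = new Stack<bool>();
        bool statementContainsElement = false;
        // True directly after `if (...)`, `while (...)`, `for (...)` or `else`.
        bool awaitingBody = false;
        string prev = "";

        for (int i = 0; i < lexemes.Count; i++)
        {
            string lexeme = lexemes[i];
            if (lexeme == "\n")
            {
                bool statementContinues = !statementContainsElement
                    || openBrackets.Count > 0
                    || IsUserDefinedOperator(prev)
                    || (ReservedOperators.Contains(prev) && prev != "~")
                    || awaitingBody; // the added keyword check
                if (!statementContinues)
                {
                    ends.Add(i);
                    statementContainsElement = false;
                    prev = "";
                }
                continue; // a continuation newline leaves prev unchanged
            }

            switch (lexeme)
            {
                case "(": case "[": case "{":
                    openBrackets.Push(lexeme == "(" && ConditionKeywords.Contains(prev));
                    awaitingBody = false;
                    break;
                case ")": case "]": case "}":
                    // Closing a keyword condition means a body must still follow.
                    awaitingBody = openBrackets.Count > 0 && openBrackets.Pop();
                    break;
                default:
                    awaitingBody = lexeme == "else";
                    break;
            }
            statementContainsElement = true;
            prev = lexeme;
        }
        return ends;
    }
}

static class StatementEndDemo
{
    static void Main()
    {
        var oneLine = new[] { "100", "/", "if", "(", "0", ")", "10", "else", "20", "+", "30", "\n" };
        Console.WriteLine(string.Join(",", StatementEndPass.FindStatementEnds(oneLine))); // prints 11

        var multiLine = new[] { "x", "<-", "100", "/", "\n", "if", "(", "0", ")", "\n",
                                "10", "else", "\n", "20", "+", "30", "\n", "y", "<-", "1", "\n" };
        Console.WriteLine(string.Join(",", StatementEndPass.FindStatementEnds(multiLine))); // prints 16,20
    }
}
```

With the statement ends known up front, the tree builder could treat everything between the else and the next statement end as the else branch, regardless of the order in which the operators are processed.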