Open stebanos opened 6 years ago
@stebanos I tried adding the line numbers in the phraseBook
itself and now it looks something like this-
lineNo
corresponds the line number corresponding to the nodes root, giver, object
and line
is the line number of the further sub-nodes they have.
Can this be a start to the potential solution to this issue?
Yes, it's a good start! But it's more consistent to call them either line
or lineNo
.
Ohkay! I will continue working on these lines then.
@stebanos Well, I guess I didn't think this through properly.
The classes Lexer
and Parse
do not yet have access to phraseBook
. Instead they are required to make phraseBook
. So storing the line numbers in phraseBook
is a bad idea I guess.
Instead maybe I can modify parsePhrase()
function so as to give the classes access to the lines?
Not sure. The Parser might need access to the line number, the Lexer not really I think.
@stebanos You can have a look on #42. I provided Lexer
with line numbers in it, though. If it seems unnecessary, I can pass it directly to Parser
@stebanos For fixing the additional two positions in error messages, can we use a parameter in the Lexer
class to indicate that the line being parsed should start its position counter from 2 instead of 0 for lines starting with -
?
That won't be sufficient because a phrase can span multiple lines, and the Lexer isn't aware if it does.
I don't understand what you mean by a phrase spanning multiple lines. Could you please elaborate?
Let me illustrate by an example:
root:
- <h1>Here's an example:</h1>
<p>
Dear {{ giver }p}, thank you for the {{ object }}.
</p>
- Hey {{ giver }}, thank you for the {{ object }}.
root
has two phrases. The first one spanning 4 lines from <h1>
up until </p>
but I made an error on the 4th line (which is the third line of our grouped phrase): {{ giver }p}
. When I run the example I receive the following error:
Line 2: Invalid character } at position 46 in phrase '<h1>Here's an example:</h1><p> Dear {{ giver }p}, thank you for the {{ object }}.</p>'.
Not only is the reported line number wrong (it should be line 4, not line 2), but since it's all grouped the position gets wrong as well (it should be 19 and not 46).
Oh.. Okay. Now I get it.
Considering this, may be we can add a key in the elements of the phrases
array which stores the number of characters after which the lines break, as an array, which is then detected in the Lexer
and the line numbers in the error messages are changed accordingly.
For example, if we consider your illustration, then we can add this key to the object corresponding to root
, stored by the phrases
array
lineBreaks: [27, 3, 53, 4]; //number of characters after which the lines break in root
So the error message should be changed to have the line number as 46 - (27 + 3) = 16
because 46 > (27 + 3) && 46 < (27 + 3 + 53)
Is this a viable solution?
It could be, and it's one that crossed my mind before, but I wonder if there's a more elegant solution. I think it would be more beneficial if each parsed token becomes aware of it's line number and position within that line.
Yes, that would be a better solution as it would provide easy access to the line numbers, which could be helpful for future additional features. So if we are breaking down the phrase into parts depending on the line numbers, we would need a kind of a identifier that helps indicate that all of them belong to the same phrase.
Right now only errors on the global structure of the source code communicate their line numbers. Expand this to the parsing and runtime level.
Also I noticed the current position numbers mentioned in errors are always off by at least 2 positions (due to the the first 2 characters of a line as the necessary indentation).