Closed stbischof closed 1 year ago
Well, actually, there is no potential NPE there. But I still changed the code a little as a reaction to this, but first I'll tell you why the warning is spurious.
In this condition: (position - start > matchLength || returnedType.ordinal() < matchedType.ordinal())
there is an LHS and an RHS. If the LHS is true, then there is no need to evaluate the RHS, right? Or, in other words, we will only potentially hit the NPE if the LHS is false, i.e. position - start
is not greater than matchedLength
.
BUT... for matchedType
to be null, matchedLength
must be zero. (You can look carefully at the code and convince yourself of this. In fact, matchedType
being null and matchedLength
being zero amount to the same thing!) So, if matchedLength
is zero, then the LHS condition must be true, because position - start
is at least 1 at this point (it's the number of characters we have read in to match the returnedType
candidate. Or, in other words, to go to the RHS, matchedLength
must be greater than 0, which in turn means that matchedToken
is necessarily non-null.
So, there is no problem with the code, as it is (orwas) but it is understandable why the checker "thinks" there is (or might be...). But finally, I decided to get rid of the warning by making it more clear that matchedToken
cannot be null. So, I changed line 184 from:
TokenType matchedToken = null;
to:
TokenType matchedToken = TokenType.INVALID;
This, by the way, required a change on line 256, from:
if (matchedToken == null) {
to:
if (matchedToken == TokenType.INVALID) {
It actually is all the same really, but the intent of the code is clearer. I would point out, by the way, that the code line you are bringing up is actually very core logic in terms of the tokenization algorithm. If you run through the active NFA states and you have a newer TokenType candidate, that replaces the one you already matched, matchedType
, if it matched more characters than the existing candidate, i.e. position - start > matchedLength
OR its "ordinal" is lower, i.e. it is the same length but has higher priority. Got it?
So, you see, it's not really so hard to understand finally, right?
This code, on the other hand, is pretty hard to understand! I did come to some understanding of it at some point, but that was so you wouldn't have to! (You should thank me for that... LOL)
where.
https://github.com/congo-cc/congo-parser-generator/blob/08708399b324504961bb1a55fcee58f31e5a1d2c/src/templates/java/Lexer.java.ftl#L213
why: matchedType is not set