Closed intjftw closed 1 year ago
There are actually two kinds of "false positives" in the screenshot in the issue. As it turns out, they are not false at all, in the sense that they do represent existing variables. However, the metadata stored for them in the database is incorrect, for a different reason in each case:
The first kind of "fake" entry is illustrated by the first 5 empty entries in the screenshot. These entries typically have a valid (non-negative) line number, but it is usually higher than any line in the body of the function itself. They are caused by local variables expanded from macros (such as LOG). My guess is that their locations are wrong because they reflect the position of the variable after all preprocessor definitions (excluding the #include-s at the top) have been expanded. In fact, looking at other similar functions in the parsed project, I noticed that the number of such empty entries in the Info Tree correlates directly with the number of LOG macro occurrences in the parsed function body. Since the LOG macro expands to a for loop that defines an extra local variable in its loop initializer statement, that is one extra empty entry per LOG occurrence every time. Note that "parsed" is the keyword here: macros inside lambdas, for example, do not seem to be parsed (#583 might be relevant here), so they don't create such entries either. This might also explain why some variables don't show up at all.
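To make the mechanism concrete, here is a minimal sketch of how a logging macro can introduce a hidden local variable. The macro body, the `log_once_` variable, the `g_log` sink and the `logTwice` function are all hypothetical stand-ins for illustration; this is not the project's actual LOG definition.

```cpp
#include <sstream>
#include <string>

// Hypothetical sink so the output can be inspected; not part of the project.
std::ostringstream g_log;

// Hypothetical stand-in for a LOG-style macro. It expands to a for loop that
// runs exactly once; `log_once_` is declared in the loop initializer and only
// exists in the code after preprocessing.
#define LOG(level) \
    for (bool log_once_ = true; log_once_; log_once_ = false) \
        g_log << "[" #level "] "

int logTwice()
{
    int explicitLocal = 2;      // the only local variable the author typed
    LOG(info) << "first\n";     // expands to a loop declaring one log_once_
    LOG(debug) << "second\n";   // expands to a second, separate log_once_
    return explicitLocal;
}
```

Under these assumptions, a parser that records every variable declaration after macro expansion would see three locals in `logTwice` (one explicit, two from the macro), which matches the "one extra entry per LOG occurrence" pattern described above.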
By adding some dummy functions to the code, I was able to reproduce this behavior with arbitrarily many variables in a macro:
The second kind of "fake" entry is illustrated by the ones located at line "-1". These are actually implicit variables defined by the compiler inside the function body. Range-based for loops typically create 3 such variables per loop. If I had to take a guess: the entry that shows the range expression in the Info Tree is the storage variable for its result, while the other two empty entries are the corresponding begin and end iterators. So far I've only noticed the compiler using such implicit variables in range-based for loops, but it's certainly possible that other language features produce similar ones too.
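For reference, the C++ standard describes a range-based for loop as equivalent to an ordinary loop over three exposition-only variables (a range reference, a begin iterator and an end iterator). The sketch below writes that expansion out by hand; the names `range_`, `begin_` and `end_` are placeholders chosen here, and the names a particular compiler assigns internally may differ.

```cpp
#include <vector>

// What the author writes: no visible helper variables.
int sumRangeFor(const std::vector<int>& values)
{
    int sum = 0;
    for (int v : values)
        sum += v;
    return sum;
}

// Roughly what the loop is defined to mean. The three variables below
// correspond to the implicit ones the compiler creates for each loop.
int sumDesugared(const std::vector<int>& values)
{
    int sum = 0;
    {
        auto&& range_ = values;         // storage for the range expression
        auto begin_ = range_.begin();   // implicit begin iterator
        auto end_ = range_.end();       // implicit end iterator
        for (; begin_ != end_; ++begin_) {
            int v = *begin_;            // the loop variable the author declared
            sum += v;
        }
    }
    return sum;
}
```

Both functions compute the same result; the second only makes the three per-loop helper variables visible, which is consistent with the guess above about which entry is which.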
By adding N for-loops, I was able to get the 3*N extra entries:
As to how we should handle these cases (in my opinion): Local variables expanded from macros do have a place in the local variables section. Maybe we could add a special flag to them so that they still appear in the Info Tree, but grayed out (or something similar) to indicate that they weren't explicitly typed in by the author. If graying out is not an option, we could put them in a separate section for implicit/preprocessor symbols (I do not know to what extent we want or need to delve into preprocessor tokens). As for the compiler-generated variables: since these might be compiler-specific, they are an implementation detail of the compiler rather than of the parsed code base, and the user might not benefit from knowing about them when examining their code. We should either discard these implicit variables completely, or put them in a special section of the UI for compiler-generated symbols. But I might be wrong in my judgement.
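As a rough illustration of how the two kinds could be told apart using only the symptoms described above, here is a heuristic sketch. The `LocalVar` record, the `LocalKind` enum and the `classify` function are hypothetical and do not correspond to existing types in the code base; a real fix would more likely rely on information from the parser than on line numbers.

```cpp
#include <string>

// Hypothetical categories matching the proposal above.
enum class LocalKind { Explicit, FromMacro, CompilerGenerated };

// Hypothetical record for a local variable as stored in the database.
struct LocalVar {
    std::string name;
    int line;   // -1 was observed for compiler-generated variables
};

// Heuristic only: it encodes the observations from this thread
// (line -1, or a line past the end of the function body).
LocalKind classify(const LocalVar& var, int bodyLastLine)
{
    if (var.line < 0)
        return LocalKind::CompilerGenerated;   // candidate for hiding
    if (var.line > bodyLastLine)
        return LocalKind::FromMacro;           // candidate for graying out
    return LocalKind::Explicit;
}
```

With such a classification in place, the UI could decide per category whether to show, gray out, or hide an entry.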
Summary of today's meeting on this topic:
For implicit variables, two approaches were mentioned:
For macro expansions: Pointing to the immediate place of the macro's expansion is ok, but we should look around for how other IDEs do it.
In the C/C++ info tree, the local variables of a method are either incorrectly listed (several false or empty entries), or not listed at all. We should investigate if this is a problem in the web UI or in the backend.