Open graphitemaster opened 3 years ago
This is odd, I run wren in valgrind from time to time, and the issue never showed up. Are you sure GCC is not a little bit optimistic at finding the error, and it is in fact impossible to happen?
It's correct, wrenCompile
starts like
Parser parser;
parser.vm = vm;
parser.module = module;
parser.source = source;
parser.tokenStart = source;
parser.currentChar = source;
parser.currentLine = 1;
parser.numParens = 0;
// Zero-init the current token. This will get copied to previous when
// nextToken() is called below.
parser.next.type = TOKEN_ERROR;
parser.next.start = source;
parser.next.length = 0;
parser.next.line = 0;
parser.next.value = UNDEFINED_VAL;
parser.printErrors = printErrors;
parser.hasError = false;
// Read the first token into next
nextToken(&parser);
Nowhere in here is parser.current
or parser.previous
initialized, the stack allocation of Parser parser;
leaves it uninitialized, the very first call to nextToken
then does, as the first two statements in the function
// Lex the next token and store it in [parser.next].
static void nextToken(Parser* parser)
{
parser->previous = parser->current;
parser->current = parser->next;
// [snip]
nextToken is called twice, which copies next -> current and current -> previous so both should be set but there's no particular reason I didn't initialize them. It'd make sense to do so cos the compiler can't see the order of init there and being explicit is better.
The problem is that reading an uninitialized memory is always UB in C, and thus when nextToken()
executes parser->previous = parser->current;
(but current
is uninitialized), it executes UB, allowing the compiler to remove this line or even the whole program (or format your hard drive).
I already said it's better to initialize them explicitly, which will happen soon, don't worry about your files on your PC until then (hyperbole doesn't make a point stronger in practice fwiw).
I just wanted to note it's not just better, but in fact required to conform the C standard. Sorry if I wasn't explicit enough.
Yeah, reading the uninitialized value invokes UB, it's why ThinLTO miscompiles the code and removes the assignments in nextToken
because the linker optimizations assume code does not invoke UB. There's a free license to remove the code that would / does invoke UB since it shouldn't happen. Basically doesn't matter if eventually it gets cleared out correctly, one read of an uninitialized value is all it takes with aggressive optimizations sadly. Not saying I agree with how the standard works here or anything. Just observing how and why, pedantry aside, it's a simple necessary fix in my case.
Fixed in #1012 :)
The parser fields
previous
andcurrent
are left in an uninitialized state whenwrenCompile
callsnextToken
. This is caught by-Wuninitialized
in gcc, but also breaks Wren when built with ThinLTO as the first two assignments innextToken
are removed from the generated code.This is an amalgamated build but the relevant warning can be seen here