I wonder if things would be easier if that buffer were declared as ‘unsigned char *’.
I think it would make things easier. I'm not sure how much effort it would take to change the ABI and update all the grammars, though.
I'm wondering: could we just independently change the declared type in this scanner (and potentially other scanners) without necessarily changing the library itself?
The runtime never dereferences these pointers itself: it just passes a pointer to `serialize`, `memcpy`s the data into the syntax tree, and then later passes these pointers to `deserialize`. So I'd think it might be ok for the scanner to just interpret the pointer as an `unsigned char *` across the board, even if we don't change the core library.
@philipturnbull You probably have a better understanding than I do of the semantics of that sort of thing; is there anything undefined or implementation-defined about declaring a function's parameter as pointing to two differently-signed types in two different translation units?
I think it's ~~implementation defined~~ undefined behaviour but, as the two types have the same underlying size, it should just work ™️
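To spell out the mismatch being discussed (again with a placeholder language name `example`): the generated parser side declares the scanner entry points with `char *` buffers, roughly, while the scanner's own translation unit would define the same symbol with `unsigned char *`:

```c
// Declaration as it appears on the generated parser side (one translation unit):
unsigned tree_sitter_example_external_scanner_serialize(void *payload, char *buffer);

// Definition in scanner.c (another translation unit), with the signedness changed:
unsigned tree_sitter_example_external_scanner_serialize(void *payload, unsigned char *buffer) {
  (void)payload;
  buffer[0] = 200;  // bytes above 127 stay non-negative when read back as unsigned char
  return 1;
}

// `char *` and `unsigned char *` are not compatible types, so calling through the
// mismatched declaration is formally undefined behaviour; it works in practice
// because the two pointer types have identical size, alignment, and representation.
```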
`buffer[i++]` is a signed `char`, so this cast does not correctly handle negative values. If a custom tag has a length greater than 128 then `buffer[i++]` is negative, and the `(uint16_t)` cast turns it into a large unsigned integer. This causes an out-of-bounds read when reading the tag name. We need to cast `name_length` to a `uint8_t` first, then widen to a `uint16_t`.
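A sketch of the fix; the surrounding function is reconstructed here just for illustration, and only `buffer[i++]`, `name_length`, and the casts come from the actual code:

```c
#include <stdint.h>

// Hypothetical reconstruction of the deserialize logic, to show the cast fix.
void read_tag_name(const char *buffer, unsigned length) {
  unsigned i = 0;

  // Buggy: buffer[i] is a (signed) char. For a length byte of 0x90 (-112 as
  // a signed char), the cast sign-extends and yields 0xFF90 == 65424.
  uint16_t bad_name_length = (uint16_t)buffer[i];

  // Fixed: narrow to uint8_t first so the byte reads as 0..255, then widen.
  uint16_t name_length = (uint16_t)(uint8_t)buffer[i++];

  (void)bad_name_length;
  (void)length;
  // ... copy `name_length` bytes of tag name starting at buffer + i ...
  (void)name_length;
}
```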
/cc @anglinb who helped triage this ✨