Closed (bcoppens closed this 1 month ago)
I think `printGEPExpression` is just very naive about how type casts in C work, so it needs someone to teach it when "more idiomatic" code still needs casts, and also how to correctly print casts for "less idiomatic" code.
Hi, I've touched this function so I'm probably partly responsible for this mess.
It was written before LLVM had opaque pointers, so it could assume that the base pointer had the right type. Now that LLVM has opaque pointers, it will always need to emit a cast of the base pointer.
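To illustrate the point about opaque pointers, here is a minimal hand-written C sketch. It is not actual llvm-cbe output, and `struct l_example` and all function names are made up for this illustration. With an opaque `ptr`, the C value holding the base pointer carries no useful pointee type, so a field access only compiles once the base has been cast to the GEP's source element type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct standing in for a GEP source element type. */
struct l_example {
  uint32_t field0;
  uint8_t field1[16];
};

/* Roughly what a GEP with source type l_example and indices (0, 1, idx)
   has to look like in C when the base is opaque: cast first, then index.
   Writing `&p->field1[idx]` would not compile, because p is `void *`. */
uint8_t *gep_with_cast(void *p, size_t idx) {
  return &((struct l_example *)p)->field1[idx];
}

/* The same address computed from byte offsets, used as a cross-check. */
uint8_t *gep_with_offsets(void *p, size_t idx) {
  return (uint8_t *)p + offsetof(struct l_example, field1) + idx;
}

/* Small self-check: both forms yield the same address. */
void check_gep_example(void) {
  struct l_example e = {0};
  assert(gep_with_cast(&e, 3) == &e.field1[3]);
  assert(gep_with_cast(&e, 3) == gep_with_offsets(&e, 3));
}
```

Calling `check_gep_example()` from a test driver should pass silently if the cast form computes the intended address.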
Producing readable and correct C code is very hard within a single-pass translator, and I gave up on it in the end. To get good C code out of a C backend, it would need to build some kind of IR or AST, not print strings directly.
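As a rough sketch of what "build an AST, then print" could mean, here is a toy two-pass version in C. This is purely illustrative and assumes nothing about how llvm-cbe is structured: `struct expr`, `cast_if_needed`, and `print_expr` are hypothetical names, and types are compared as plain strings to keep the example short. The idea is that the decision to insert a cast is made on the tree, using type information, before any text is emitted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum expr_kind { EXPR_VAR, EXPR_CAST, EXPR_FIELD_ADDR };

/* A tiny expression tree; every node records the C pointer type it has. */
struct expr {
  enum expr_kind kind;
  const char *name;         /* variable or field name; unused for casts */
  const char *type;         /* pointer type of this node, e.g. "struct A *" */
  const struct expr *base;  /* operand; NULL for variables */
};

/* Pass 1: wrap `base` in a cast node only when its type differs. */
struct expr cast_if_needed(const struct expr *base, const char *wanted) {
  if (strcmp(base->type, wanted) == 0)
    return *base;
  struct expr cast = { EXPR_CAST, NULL, wanted, base };
  return cast;
}

/* Appends `s` to the NUL-terminated buffer `out`, truncating if full. */
static void append(char *out, size_t cap, const char *s) {
  size_t used = strlen(out);
  if (used + 1 < cap)
    snprintf(out + used, cap - used, "%s", s);
}

/* Pass 2: print the tree into `out`, which must start NUL-terminated. */
void print_expr(const struct expr *e, char *out, size_t cap) {
  switch (e->kind) {
  case EXPR_VAR:
    append(out, cap, e->name);
    break;
  case EXPR_CAST:
    append(out, cap, "((");
    append(out, cap, e->type);
    append(out, cap, ")");
    print_expr(e->base, out, cap);
    append(out, cap, ")");
    break;
  case EXPR_FIELD_ADDR:
    append(out, cap, "&");
    print_expr(e->base, out, cap);
    append(out, cap, "->");
    append(out, cap, e->name);
    break;
  }
}
```

With a variable `a` of type `struct l_struct_A *` and a field access that expects a `struct l_string *` base, the first pass inserts a cast node and the printer emits `&((struct l_string *)a)->field1`. When the types already match, no cast node is created and the output is the shorter `&a->field0`.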
I have some issues with the `else` branch of `if (!isConstantNull(FirstOp))` in `CWriter::printGEPExpression`.
#include <string>
struct A { std::string s; A() {} };
int main() { A a; return 6; }
However, with clang++/LLVM17, we get something like:
So the code with clang++/LLVM17 immediately accesses `%a` as if it were a `basic_string`, without bothering to cast the `%struct.A %a` to it. Which results in:
Not sure what is going on there, `struct l_unnamed_2 field9` is:
Which indeed doesn't have an array (but its `field0` does). This comes from `b`'s `struct l_unnamed_1`. However, `struct l_struct_struct_OC_B`, which is what the GEP really does use as a type, does have a correct `field9` (`struct l_array_256_uint8_t field9`).
Now, trying to debug this in `CWriter::printGEPExpression`, the `else` branch of `if (!isConstantNull(FirstOp))` seems to be (based on the comments) only there to print 'more idiomatic' C code.
If I just always take the true branch (and thus less idiomatic code is generated), all the above test cases seem to work fine. (The first and second issue because that branch just always first casts to the correct base type of the GEP itself; the third issue I don't quite know why at the moment, but then again I cannot really reproduce it unfortunately.)
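For reference, here is a hand-written C sketch of the difference between the two printing styles for the first case. It is not real generated output: `struct l_string` and `struct l_struct_A` are simplified, hypothetical stand-ins for `basic_string` and `struct A`, and the function names are invented. The GEP's source element type is the inner string type, while the variable is declared with the outer struct type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the generated types. */
struct l_string { uint64_t field0; uint64_t field1; };  /* like basic_string */
struct l_struct_A { struct l_string field0; };          /* like struct A */

/* "Less idiomatic" form: always cast the base to the GEP's source
   element type (l_string here), then apply indices (0, 1). */
uint64_t *gep_always_cast(struct l_struct_A *a) {
  return &((struct l_string *)a)[0].field1;
}

/* "More idiomatic" form: only valid if the printer follows the declared
   type of the base. Printing `&a->field1` directly would be rejected by
   the C compiler, since l_struct_A has no member named field1. */
uint64_t *gep_idiomatic(struct l_struct_A *a) {
  return &a->field0.field1;
}

/* Small self-check: both forms refer to the same object. */
void check_gep_styles(void) {
  struct l_struct_A a = { { 1, 2 } };
  assert(gep_always_cast(&a) == gep_idiomatic(&a));
  assert(*gep_always_cast(&a) == 2);
}
```

The cast is well defined here because a pointer to a struct can be converted to a pointer to its first member, which is why always casting to the GEP's own base type sidesteps the mismatch.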
Given that that entire piece of code is somewhat underdocumented, I was going to propose just dropping the printing of more idiomatic code. Unfortunately, doing so makes `test_empty_array_geps` and `test_empty_array_geps_struct` fail (with errors such as 'dereferencing ‘void *’ pointer' and 'error: request for member ‘field0’ in something not a structure or union'), so clearly the `else` branch is not just for making the output look prettier.
So before trying to look into it myself, I was wondering if anyone else would perhaps immediately know what could cause this?