Closed kquick closed 1 year ago
I'm a bit surprised by this, since my reading of the LLVM .ll
pretty-printing code here is that if a DIDerivedType
defines a dwarfAddressSpace
, then it should always be printed. (LLVM's printInt
function normally does skip printing 0
values, but they explicitly pass false
here to override this behavior.)
In fact, I think the real bug lies in the way that we parse DIDerivedType
s in llvm-pretty-bc-parser
. My understanding of the LLVM bitcode parsing code here is that the bitcode doesn't directly store the DWARF address space value, but rather the value plus one. That is, the bitcode would store 1
for address space 0
, 2
for address space 1
, and so on. Moreover, the bitcode stores a 0
when there is no address space at all, which we represent as Nothing
on the llvm-pretty
side.
Looking at the code llvm-pretty-bc-parser
uses to parse DIDerivedType
s here, none of this is being accounted for. That feels like actual cause of the issue here, and if that were fixed, I suspect that the existing code in llvm-pretty
would be perfectly suited to pretty-print DIDerivedType
s.
Here's a separate reference from the llvm-project: https://github.com/llvm/llvm-project/blob/bbe8cd13335300958b04db5318c31ff52714f96f/llvm/lib/Bitcode/Reader/MetadataLoader.cpp#L1544-L1548
I guess the question is: how do we want to represent this at the AST level. There are really three states involved here: not specified, explicitly no Dwarf Address Space, and a Dwarf Address Space with a value (I don't think we should overload the first two via Nothing
). If we properly want to encompass all of that, we probably need to define a corresponding type with three constructors to place into the AST, and then handle that on both the printing and the parsing side. Currently, we represent that by allowing 0 to have special meaning, and expecting the user of the AST to know that and to know that an actual dwarf address space is (- 1)
. This PR fixes the pretty-printing for the latter case; I can add the triple-constructor datatype instead if you agree that is the right way to handle this, or do you think we should be overloading Nothing
?
There are really three states involved here: not specified, explicitly no Dwarf Address Space, and a Dwarf Address Space with a value
I interpreted the comment that you linked to differently. By my reading, there are only two possible states: either a DIDerivedType
has no DWARF address space, or it does. That is, 0
is a perfectly valid DWARF address space. The confusing part is how they're encoded in bitcode:
I suppose you could argue that prior to LLVM 5 (the version in which dwarfAddressSpace
was introduced), the bitcode doesn't say one way or the other about whether a DIDerivedType
has a DWARF address space. But I'd argue that it's fine to lump that in with the "no DWARF address space" case. That's because (1) the two cases would be handled identically in practically every conceivable way (including pretty-printing), and (2) I don't see any benefit in separating this out into a third case.
OK, canceling as superceded by https://github.com/GaloisInc/llvm-pretty-bc-parser/pull/253
The llvm project does not print this field if the value is zero, causing a difference in the output between this library and llvm, which causes the llvm-pretty-bc-parser round-trip testing to fail. This makes the pretty-printing in this library match the llvm behavior.