GaloisInc / llvm-pretty

An llvm pretty printer inspired by the haskell llvm binding

Int32 used in places where it seems inappropriate #73

Closed by robdockins 4 years ago

robdockins commented 4 years ago

For example, in the definition of types, Int32 is used to describe the length of arrays. clang is happy to compile code that allocates arrays with more than 2^31 - 1 elements, and then llvm-pretty-bc-parser builds a type with a negative array length.

It seems clear that these should be Word32 at least; but maybe they should just be Natural instead. Is there some upper limit on these sizes arising from the LLVM bitcode format itself?
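To make the failure mode concrete, here is a minimal sketch of the wraparound described above. The name `elemCount` and the chosen value are illustrative assumptions, not taken from llvm-pretty or llvm-pretty-bc-parser; the sketch only shows how the three candidate types behave for a count just past the Int32 range.

```haskell
import Data.Int (Int32)
import Data.Word (Word32, Word64)
import Numeric.Natural (Natural)

-- Hypothetical element count: one more than the largest Int32 value.
elemCount :: Word64
elemCount = 2 ^ (31 :: Int)

-- Narrowing to Int32 silently wraps around to a negative number.
asInt32 :: Int32
asInt32 = fromIntegral elemCount

-- Word32 can hold this value, but would itself wrap at 2^32.
asWord32 :: Word32
asWord32 = fromIntegral elemCount

-- Natural is unbounded and cannot be negative.
asNatural :: Natural
asNatural = fromIntegral elemCount

main :: IO ()
main = do
  print asInt32    -- -2147483648
  print asWord32   -- 2147483648
  print asNatural  -- 2147483648
```

Word32 only moves the wraparound point from 2^31 to 2^32, whereas Natural removes it entirely at the cost of an unbounded representation.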

elliottt commented 4 years ago

I don't believe there's an upper limit in LLVM -- changing this to a larger type seems totally reasonable 👍

elliottt commented 4 years ago

For reference, LLVM uses uint64_t for the number of elements in a sequential type. I think it would be totally reasonable to just change this to Int64. Do you want to make that change? I'd be happy to review it.
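As a rough illustration of the range question, the sketch below uses `toIntegralSized` from `Data.Bits` to perform checked conversions from a 64-bit unsigned count. The helper names `toInt32` and `toInt64` are hypothetical and are not part of llvm-pretty; the point is only that a signed 64-bit type still cannot represent the upper half of the `uint64_t` range.

```haskell
import Data.Bits (toIntegralSized)
import Data.Int (Int32, Int64)
import Data.Word (Word64)

-- Checked narrowing: Nothing when the value does not fit the target type.
toInt32 :: Word64 -> Maybe Int32
toInt32 = toIntegralSized

toInt64 :: Word64 -> Maybe Int64
toInt64 = toIntegralSized

main :: IO ()
main = do
  print (toInt32 (2 ^ (31 :: Int)))  -- Nothing
  print (toInt64 (2 ^ (31 :: Int)))  -- Just 2147483648
  print (toInt64 maxBound)           -- Nothing
```

Whether counts above 2^63 - 1 matter in practice is a separate question; an unsigned 64-bit type such as Word64 would match the `uint64_t` range exactly.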

robdockins commented 4 years ago

Yeah, I'm working on some changes in conjunction with llvm-pretty-bc-parser to figure out what should be done.

One curious thing I've noticed... character data seems like it is sometimes being stored using 10-bit fields for each character, even though I haven't seen anything outside the ASCII range. Any idea why that might be happening?

elliottt commented 4 years ago

I can't recall exactly what's happening there, but I remember there being a lot of weird corner cases around the encoding. Character data showing up as 10-bit values seems totally in line with what I would expect from the bitcode format.
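For readers unfamiliar with fixed-width fields, the following sketch shows what it means for ASCII characters to be packed into 10-bit fields. It assumes low bits are emitted first, as in the LLVM bitstream's fixed-width encoding. All function names are hypothetical, and this is not how llvm-pretty-bc-parser is implemented; it does not explain why a 10-bit width is chosen.

```haskell
import Data.Bits (shiftL, testBit, (.|.))
import Data.Char (chr, ord)
import Data.List (unfoldr)

-- Encode a value as a fixed-width field, least significant bit first.
toBits :: Int -> Int -> [Bool]
toBits width n = [testBit n i | i <- [0 .. width - 1]]

-- Split a flat bit list into fixed-width fields; a trailing partial field is dropped.
fieldsOf :: Int -> [Bool] -> [[Bool]]
fieldsOf width = unfoldr step
  where
    step bs
      | length field == width = Just (field, rest)
      | otherwise = Nothing
      where
        (field, rest) = splitAt width bs

-- Rebuild the integer value of one field (least significant bit first).
fieldValue :: [Bool] -> Int
fieldValue = foldr (\b acc -> (acc `shiftL` 1) .|. fromEnum b) 0

-- Round trip: pack each character into 10 bits, then unpack again.
decode10 :: [Bool] -> String
decode10 = map (chr . fieldValue) . fieldsOf 10

main :: IO ()
main = do
  let bits = concatMap (toBits 10 . ord) "Hi"
  print (length bits)    -- 20
  putStrLn (decode10 bits)  -- Hi
```

ASCII values need only 7 bits, so the top 3 bits of each 10-bit field in this example are always zero.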

robdockins commented 4 years ago

Cf. #74

robdockins commented 4 years ago

Closed via #74