sunfishcode / llvm2cranelift

LLVM IR to Cranelift IR translator
Apache License 2.0
36 stars 4 forks source link

Translation is lossy WRT names #7

Closed djg closed 6 years ago

djg commented 6 years ago

The following LLVM IR:

define void @foo(i32 %arg, i32 %arg1) {
bb:
  %tmp = icmp ult i32 %arg, 4
  %tmp2 = icmp eq i32 %arg1, 0
  %tmp3 = or i1 %tmp2, %tmp
  br i1 %tmp3, label %bb4, label %bb6

bb4:
  %tmp5 = call i32 @bar()
  ret void

bb6:
  ret void
}

declare i32 @bar()

Is translated into CTON IR:

function u0:0(i32, i32) native {
    sig0 = () -> i32 native
    fn0 = sig0 u0:1

ebb1(v0: i32, v1: i32):
    v2 = icmp_imm ult v0, 4
    v3 = icmp_imm eq v1, 0
    v4 = bor v3, v2
    brz v4, ebb0
    v5 = call fn0()
    return

ebb0:
    return
}

foo and bar are changed to u0:0 and u0:1. Given the example in the Cretonne docs, the following should be generated:

function %foo(i32, i32) native {
    fn0 = function %bar() -> i32 native

ebb1(v0: i32, v1: i32):
    v2 = icmp_imm ult v0, 4
    v3 = icmp_imm eq v1, 0
    v4 = bor v3, v2
    brz v4, ebb0
    v5 = call fn0()
    return

ebb0:
    return
}

This appears to be an issue with Cretonne. ExternalName doesn't provide anyway to look up user-defined names.

sunfishcode commented 6 years ago

Ah, I need to update the example. Cretonne's ExternalName changed from holding actual byte sequences to holding opaque indices; this is what necessitated llvm2cretonne's string_table.rs. So this means that the Cretonne IR dump won't contain the actual symbol name anymore.

That said, it would be nice to print the name somewhere. Perhaps we could have the llvm2cretonne code that prints the function print a comment before the function.

djg commented 6 years ago

@sunfishcode I started adding an option to cretonne to pass through a closure that is called to resolve ExternalName into a proper name. It works, but from reading the code I'm not sure what the requirements governing cretonne names are?

The docs appear to suggest they're of the %string format but the code calls these testcase strings and limits them to 16 characters.

Surely we need to keep the name if it's an external symbol, such as function bar, that the linker needs to lookup?

sunfishcode commented 6 years ago

The design principle behind this is that the Cretonne codegen crate is meant to be just about codegen, and not linking, so it doesn't need to know anything about symbol names. As such, it's llvm2cretonne's job to keep track of the actual name strings, and pass them to faerie at the appropriate time.

djg commented 6 years ago

Thanks for the clarification.

sunfishcode commented 6 years ago

With https://github.com/sunfishcode/llvm2cretonne/commit/3ca2d2f810d5982b49493afee9f4959f44f84379, llvm2cretonne -p now prints the function names again, and the README is updated.