llir / llvm

Library for interacting with LLVM IR in pure Go.
https://llir.github.io/document/
BSD Zero Clause License
1.19k stars 78 forks source link

reject non-pointer type for global #208

Closed dannypsnl closed 2 years ago

dannypsnl commented 2 years ago

If we define a new global

module.NewGlobal(name, types.I8)

it would cause an error that global variable reference must have pointer type

Thus, I think this check is acceptable.

mewmew commented 2 years ago

Definitely!

This should be part of an irsem package, that is yet to be written :) Along with other semantic analysis validation and error checking.

Cheers, Robin

mewmew commented 2 years ago

Just to clarify, what tool is reporting the error it would cause an error that global variable reference must have pointer type? Is this the LLVM compiler?

dannypsnl commented 2 years ago

yes

dannypsnl commented 2 years ago

BTW, do you think https://github.com/dannypsnl/extend is a good way to do irsem?

mewmew commented 2 years ago

BTW, do you think https://github.com/dannypsnl/extend is a good way to do irsem?

I consider dannypsnl/extend to be closer to irutil/irgen.

For irsem we would do a traversal over an entire module, validating it's semantics, e.g. by calling irsem.Verify.

ref: https://github.com/llir/llvm/issues/65#issuecomment-464938676

// Package sem performs semantic analysis of LLVM IR modules.
package sem

// Verify performs type-checking and semantic analysis of the given module to
// verify consistency and correctness.
func Verify(m *ir.Module) error

Edit: for inspiration, here is a very primitive semantic analysis written for the uC programming language by @sangisos and me. uc/sem. It essentially traverses the AST tree, checking that various invariants hold, that the type checking gives no type errors, and that the semantics are not broken in obvious ways.

dannypsnl commented 2 years ago

https://github.com/llir/irsem solution

mewmew commented 2 years ago

re: https://github.com/llir/irsem/blob/cda51dc46b55de911196e57f5848c5c828edf1a9/verify.go#L11

@dannypsnl, are you sure this semantic is not valid in LLVM IR? I think it is.

package main

import (
    "fmt"
    "log"

    "github.com/llir/llvm/ir"
    "github.com/llir/llvm/ir/constant"
    "github.com/llir/llvm/ir/types"
)

func main() {
    if err := foo(); err != nil {
        log.Fatalf("%+v", err)
    }
}

func foo() error {
    m := ir.NewModule()
    x := m.NewGlobalDef("x", constant.NewInt(types.I32, 42))
    f := m.NewFunc("main", types.I32)
    entry := f.NewBlock("entry")
    result := entry.NewLoad(types.I32, x)
    entry.NewRet(result)
    fmt.Println(m)
    log.Println("x.ContentType:", x.ContentType)
    log.Println("x.Type():", x.Type())
    return nil
}

Outputs the debug message:

x.ContentType: i32
x.Type(): i32*

And the LLVM IR:

@x = global i32 42

define i32 @main() {
entry:
    %0 = load i32, i32* @x
    ret i32 %0
}

The output LLVM IR is valid when run through lli

# lli a.ll ; echo $?
42

As such, the type of *ir.Global.ContentType may be a non-pointer type, and this is still valid. I think the only thing is we have to reference global variables using pointer types, that is, when you load from a global variable, or store to a global variable, then the global has pointer type (and not it's content type).

Could you provide a full test case/example which demonstrates the broken behaviour?

Cheers, Robin

dannypsnl commented 2 years ago
mod := ir.NewModule()
mod.NewGlobal("hello", types.I8)
mod.NewGlobalDef("x", constant.NewInt(types.I32, 42))
mf := mod.NewFunc("main", types.I32)
mb := mf.NewBlock("")
mb.NewRet(constant.NewInt(types.I32, 0))

write to temporary llvm file

@hello = global i8
@x = global i32 42

define i32 @main() {
0:
    ret i32 0
}

compile

llc: error: llc: tmp.ll:2:1: error: global variable reference must have pointer type
@x = global i32 42
^
dannypsnl commented 2 years ago

But have to notice, if we only write

mod.NewGlobal("hello", types.I8)

compile produces

llc: error: llc: tmp.ll:3:1: error: expected value token

so this seems more like an inconsistent error report from llc

mewmew commented 2 years ago

Ah, it's missing external.

-@hello = global i8
+@hello = external global i8
 @x = global i32 42

 define i32 @main() {
 0:
    ret i32 0
 }

Therefore, the @x is read as the initializing value of @hello.

dannypsnl commented 2 years ago

Ah, it's missing external.

-@hello = global i8
+@hello = external global i8
 @x = global i32 42

 define i32 @main() {
 0:
  ret i32 0
 }

Therefore, the @x is read as the initializing value of @hello.

Emmm, should we fix this automatically? Or check in irsem

I think the check would be a global missing initializer should with external

mewmew commented 2 years ago

Emmm, should we fix this automatically? Or check in irsem

The check can be added to irsem. If global.Init == nil and global.Linkage is the zero value (LinkageNone), or something along those lines.

dannypsnl commented 2 years ago

Then please check 30da92f3d2381f91fd7af9edfbb5633b49a647d0, if ok then I close this issue.

mewmew commented 2 years ago

LGTM, with minor comment https://github.com/llir/irsem/commit/30da92f3d2381f91fd7af9edfbb5633b49a647d0#r62298445