llir / llvm

Library for interacting with LLVM IR in pure Go.
https://llir.github.io/document/
BSD Zero Clause License
1.19k stars 78 forks source link

`module.NewTypeDef` has unexpected side effects #226

Closed klvnptr closed 1 year ago

klvnptr commented 1 year ago

The following code has unexpected side effects on the convenience types:

m.NewTypeDef("hello", types.I64)

Because NewTypeDef sets the type's name. Which is okay, but next time if I use types.I64 it will have name hello since, the convenience type is a pointer. To be honest, I don't really have an idea how to solve this. But it took me about 3 hours of debugging till I figured out why my i64 types called hello in LLVM IR.

dannypsnl commented 1 year ago

Do you want a type is an alias of i64 here?

klvnptr commented 1 year ago

Yes. So my first module used hello as an alias of type i64, but in my second module I didn't want that alias. Since I accidentally overwritten the convenience type's name, the second module used hello as well.

dannypsnl commented 1 year ago

@mewmew I guess here have two problems

  1. the builtin IntType shouldn’t get affected by SetName method, it should panic in this case.
  2. And NewTypeDef will need better explanation, I can’t quite sure its semantic right now
dannypsnl commented 1 year ago

Yes. So my first module used hello as an alias of type i64, but in my second module I didn't want that alias. Since I accidentally overwritten the convenience type's name, the second module used hello as well.

Because I64 is a convenience instance of IntType, the renaming applied to instance than every references get affected.

I think the quick fix here is clone the I64 out for now.

klvnptr commented 1 year ago

Yes, this is what I ended up doing:

func clone(typ types.Type) types.Type {
    switch typ := typ.(type) {
    case *types.VoidType:
        return &types.VoidType{}
    case *types.MMXType:
        return &types.MMXType{}
    case *types.LabelType:
        return &types.LabelType{}
    case *types.TokenType:
        return &types.TokenType{}
    case *types.MetadataType:
        return &types.MetadataType{}

    case *types.IntType:
        return &types.IntType{BitSize: typ.BitSize}
    case *types.FloatType:
        return &types.FloatType{Kind: typ.Kind}
    case *types.PointerType:
        return &types.PointerType{ElemType: typ.ElemType, AddrSpace: typ.AddrSpace}

    case *types.ArrayType:
        return &types.ArrayType{Len: typ.Len, ElemType: typ.ElemType}
    case *types.VectorType:
        return &types.VectorType{Scalable: typ.Scalable, Len: typ.Len, ElemType: typ.ElemType}
    case *types.FuncType:
        return &types.FuncType{RetType: typ.RetType, Params: typ.Params, Variadic: typ.Variadic}
    case *types.StructType:
        return &types.StructType{Packed: typ.Packed, Fields: typ.Fields, Opaque: typ.Opaque}
    }

    panic(fmt.Errorf("support for type %T not yet implemented", typ))
}
dannypsnl commented 1 year ago

Cloning is painful in Go, this is also why we have no idea how to provide a general solution in this case.

klvnptr commented 1 year ago

It's not perfect, but gets the job done for now.

mewmew commented 1 year ago

Hi Peter,

Happy to see you exploring the unknowns of LLVM IR in Go : ) Also saw that you came in touch with @alecthomas's participle library, a really cool take on parsing with EBNF grammars.

With regards to cloneType. Yeah, cloning is painful (ref: https://github.com/llir/llvm/issues/115#issuecomment-563259036). We've run into that before, and looked at various packages for deep copy, but each package had their own trade-offs. I don't think we'll find a good solution for this, but I would love to be proven wrong : )

As for this issue. As you've both mentioned, this is an unfortunate side effect of changing aspects of global variables of the types package. The way to create a type alias for i64 without changing the types.I64 convenience global variable would be as follows:

m := ir.NewModule()
i64 := types.NewInt(64)
helloType := m.NewTypeDef("hello", i64)

With regards to the unexpected side effects, I think we can add some documentation to ir.Module.NewTypeDef to help avoid this for other users.

Cheerful regards, Robin

mewmew commented 1 year ago

Added docs in 5d46090a753955ae357e1c39b03dbf511b3ae4bf to help avoid the unexpected side effect in the future. Let me know if the docs are clear enough, or if they need to be rephrased.

Cheers, Robin

dannypsnl commented 1 year ago

wow, I don't even remember there has NewInt(64)......

klvnptr commented 1 year ago

Hey @mewmew

Highly appreciate your feedback. Yes, I'm writing an LLVM frontend for my hobby programming language :) Btw. your library is pretty stunning. I wouldn't even try to guess how much time it takes to map out the entire LLVM IR syntax.

Thanks for the suggestion. I will go with types.NewInt(64).

While we are at it. Whats to way to create an alias of an alias? Similar to this.

typedef i64 hello;
typedef hello world;

Thanks Peter

klvnptr commented 1 year ago

@mewmew Few constructors are missing for types like float or double, there is no NewFloat(...), so I went with creating all types manuall like:

&types.FloatType{Kind: types.FloatKindFloat}

Also as far as I can tell, there is no way to make an alias (TypeDef) to another alias, like:

typedef i64 hello;
typedef hello world;

Because ir.Module's WriteTo() TypeDef declaration I think makes it impossible. But please let me know if I'm wrong. I'm pretty newbie to the library.

    for _, t := range m.TypeDefs {
        // Name=LocalIdent '=' 'type' Typ=OpaqueType
        //
        // Name=LocalIdent '=' 'type' Typ=Type
        fw.Fprintf("%s = type %s\n", t, t.LLString())
    }
mewmew commented 1 year ago

Highly appreciate your feedback. Yes, I'm writing an LLVM frontend for my hobby programming language :) Btw. your library is pretty stunning. I wouldn't even try to guess how much time it takes to map out the entire LLVM IR syntax.

Haha, happy to hear you like the library! Yeah, quite a lot of time went into writing the BNF grammar for LLVM IR, but it was fun too!

Also, is the repo for your hobby language open source? Would be cool to check out.

Because ir.Module's WriteTo() TypeDef declaration I think makes it impossible. But please let me know if I'm wrong. I'm pretty newbie to the library.

Hmm, I think you're right. If you really want it, you could create a custom type which implements the types.Type interface.

Here's a proof of concept, it's not pretty, but it seems to work ^^

package main

import (
    "fmt"

    "github.com/llir/llvm/ir"
    "github.com/llir/llvm/ir/types"
)

func main() {
    m := ir.NewModule()
    i64 := types.NewInt(64)
    helloType := m.NewTypeDef("hello", i64)
    aliasType := &AliasType{
        name: "world",
        Type: helloType,
    }
    m.TypeDefs = append(m.TypeDefs, aliasType)
    fmt.Println(m)
}

type AliasType struct {
    // alias name.
    name string
    // underlying type.
    types.Type
}

func (a *AliasType) Name() string {
    return a.name
}

func (a *AliasType) SetName(name string) {
    a.name = name
}

func (a *AliasType) String() string {
    return "%" + a.name // NOTE: the proper way to encode local identifiers is to use `enc.LocalName`.
}

func (a *AliasType) LLString() string {
    if len(a.Type.Name()) == 0 {
        return a.Type.LLString()
    }
    return "%" + a.Type.Name() // NOTE: the proper way to encode local identifiers is to use `enc.LocalName`.
}

NOTE: the proper way to encode local identifiers is to use enc.LocalName. It's internal as we've had no external users request the functionality thus far. If needed, we can move this to irutil/enc or something. See https://github.com/llir/llvm/blob/5d46090a753955ae357e1c39b03dbf511b3ae4bf/internal/enc/enc.go#L49

Cheers, Robin

klvnptr commented 1 year ago

Genius. I totally overlooked the fact that types.Type is an interface that can be implemented. Thanks so much for the example! Yeah sure, I just open-sourced it here: https://github.com/klvnptr/k I welcome any feedback.

klvnptr commented 1 year ago

@mewmew I tried to use the above suggested AliasType, but I don't think it will work. Simple instructions like NewTrunc can't handle them. For example if AliasType's underlying Type is IntType, then NewTrunc is not able to cast it to another IntType :(

mewmew commented 1 year ago

Hi @klvnptr! Wish you a happy new years : )

@mewmew I tried to use the above suggested AliasType, but I don't think it will work. Simple instructions like NewTrunc can't handle them. For example if AliasType's underlying Type is IntType, then NewTrunc is not able to cast it to another IntType :(

Ah, damn. That sucks. I know we've been thinking of how to make the internal llir/llvm code less fragile to user-defined types. Right now, it relies a lot on type assertions and type switches. A more robust approach would change these to use interfaces capturing the semantics of each specific type (see #59 for the work on rethinking how to handle user-defined types).

So, sorry for the pain. This is work we wish to do. But as with many things in life, time is not always on our side.

Don't wait for a quick change on this one as it will require quite fundamental restructuring of the internal logic of llir/llvm. However, it is definitely a change we wish to see happen. So "at some point in the future" ™ this will work as expected : )

Until then, we'll have to find joys in other parts of life ^^

Wish you a great year to come! Robin

klvnptr commented 1 year ago

hey Robin, no worries, I will work around it :) thanks much.