mewmew / uc

A compiler for the µC language.

ir: Define the data types used to represent LLVM IR #62

Closed mewmew closed 8 years ago

mewmew commented 8 years ago

To identify a relevant subset of LLVM IR capable of representing the actions of the uC test case source files, Clang has been used with the -emit-llvm flag to convert the uC test case source files to LLVM IR (e.g. quiet/semantic/s02.ll).

A grammar for the identified subset of LLVM IR has been expressed in BNF with Gocc syntax at https://github.com/llir/spec/blob/master/gocc/ll.bnf

This issue tracks the work to define the data types required to represent the relevant parts of the identified subset of LLVM IR, which should include:

The high-level representation of a function is as follows: A function consists of one or more basic blocks. A basic block is a sequence of zero or more non-terminating instructions followed by a terminating instruction. The intuition behind basic blocks is that if one of the instructions within the basic block is executed, then all instructions within the basic block are executed. An instruction may be non-terminating (e.g. xor, add) or terminating (e.g. ret, br).

Feel free to discuss how we may wish to represent these concepts. We may explore the benefits and drawbacks of different data type representations (e.g. hash map (from basic block name to basic block) vs. slice (of basic blocks) for keeping track of the basic blocks within a function).
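To make the slice vs. hash map trade-off concrete, here is a minimal sketch of both options. The names SliceFunc, MapFunc and Block are hypothetical and only serve the comparison; they are not part of any proposed API.

```go
package main

import "fmt"

// BasicBlock is a pared-down stand-in for the basic block type under discussion.
type BasicBlock struct {
	Name string
}

// SliceFunc (hypothetical) keeps basic blocks in a slice: source order is
// preserved and the entry block is simply Blocks[0], but lookup by name is a
// linear scan.
type SliceFunc struct {
	Blocks []*BasicBlock
}

// Block returns the basic block with the given name, or nil if absent.
func (f *SliceFunc) Block(name string) *BasicBlock {
	for _, b := range f.Blocks {
		if b.Name == name {
			return b
		}
	}
	return nil
}

// MapFunc (hypothetical) keeps basic blocks in a map: lookup by name is a
// single map access, but Go map iteration order is unspecified, so the entry
// block has to be tracked separately.
type MapFunc struct {
	Entry  string
	Blocks map[string]*BasicBlock
}

// Block returns the basic block with the given name, or nil if absent.
func (f *MapFunc) Block(name string) *BasicBlock {
	return f.Blocks[name]
}

func main() {
	sf := &SliceFunc{Blocks: []*BasicBlock{{Name: "entry"}, {Name: "loop"}, {Name: "exit"}}}
	mf := &MapFunc{Entry: "entry", Blocks: make(map[string]*BasicBlock)}
	for _, b := range sf.Blocks {
		mf.Blocks[b.Name] = b
	}
	fmt.Println(sf.Blocks[0].Name)                    // entry block by position
	fmt.Println(mf.Blocks[mf.Entry].Name)             // entry block by recorded name
	fmt.Println(sf.Block("loop") == mf.Block("loop")) // both find the same block
}
```

Since LLVM IR output has to list basic blocks in a stable order, the slice gives that for free, whereas the map variant would need an extra ordering step when printing.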

mewmew commented 8 years ago

Preliminary proposal, just to get the ball rolling, and to give us a starting point to evaluate the pros and cons of other data type representations against.

type Global struct {
   Name string
   Val Value
}

type Function struct {
   Name string
   Blocks []*BasicBlock
}

type BasicBlock struct {
   Name string
   Insts []Instruction
   Term Terminator
}

type Instruction interface {
   isInst()
}

type Terminator interface {
   isTerm()
}

type Value interface {
   isVal()
}
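As a rough illustration of how these types might be used, the sketch below builds a single-block function and prints it in a simplified, LLVM-IR-like form. AddInst, RetTerm and Dump are hypothetical names introduced only for this example, and operands are plain strings rather than Value to keep it short.

```go
package main

import (
	"fmt"
	"strings"
)

type Function struct {
	Name   string
	Blocks []*BasicBlock
}

type BasicBlock struct {
	Name  string
	Insts []Instruction
	Term  Terminator
}

type Instruction interface{ isInst() }
type Terminator interface{ isTerm() }

// AddInst is a hypothetical non-terminating instruction: Dst = X + Y.
type AddInst struct{ Dst, X, Y string }

// RetTerm is a hypothetical terminator returning the operand X.
type RetTerm struct{ X string }

// The unexported marker methods decide which interface a type satisfies.
func (*AddInst) isInst() {}
func (*RetTerm) isTerm() {}

func (i *AddInst) String() string { return fmt.Sprintf("%s = add i32 %s, %s", i.Dst, i.X, i.Y) }
func (t *RetTerm) String() string { return fmt.Sprintf("ret i32 %s", t.X) }

// Dump renders f in a simplified textual form; the i32 return type is
// hard-coded for the sake of the example.
func Dump(f *Function) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "define i32 @%s() {\n", f.Name)
	for _, b := range f.Blocks {
		fmt.Fprintf(&sb, "%s:\n", b.Name)
		for _, inst := range b.Insts {
			fmt.Fprintf(&sb, "\t%v\n", inst)
		}
		fmt.Fprintf(&sb, "\t%v\n", b.Term)
	}
	sb.WriteString("}\n")
	return sb.String()
}

func main() {
	f := &Function{
		Name: "f",
		Blocks: []*BasicBlock{{
			Name:  "entry",
			Insts: []Instruction{&AddInst{Dst: "%1", X: "1", Y: "2"}},
			Term:  &RetTerm{X: "%1"},
		}},
	}
	fmt.Print(Dump(f))
}
```

With this layout, putting a terminator into Insts (or a non-terminating instruction into Term) is rejected at compile time, since *RetTerm does not implement Instruction and *AddInst does not implement Terminator.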

Pros:

Cons:

@sangisos, what do you think?

sangisos commented 8 years ago

The use of a slice should be fine if we change the implementation of the IR code generation to be recursive, which I agree we should.

mewmew commented 8 years ago

> The use of a slice should be fine if we change the implementation of the IR code generation to be recursive, which I agree we should.

Sounds like a plan. We can always evaluate the implementation against other designs later and redefine it if we happen to discover a better™ representation.
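For reference, one possible shape of recursive code generation on top of a slice of instructions is sketched below. Everything here is hypothetical (the Expr, IntLit and BinaryExpr nodes, the gen helper, and strings standing in for instructions); it only illustrates why appending to a slice works well when operands are lowered before the instruction that uses them.

```go
package main

import "fmt"

// Expr is a hypothetical AST expression node.
type Expr interface{}

// IntLit is an integer literal.
type IntLit struct{ Val int }

// BinaryExpr is an addition of two operands.
type BinaryExpr struct{ X, Y Expr }

// BasicBlock collects instructions in emission order; plain strings stand in
// for real instruction values.
type BasicBlock struct {
	Insts []string
}

// gen tracks the block being emitted into and the last temporary handed out.
type gen struct {
	cur  *BasicBlock
	next int
}

// expr lowers e recursively and returns the operand holding its result.
// Operands are lowered first, so their instructions are appended to the
// slice before the instruction that uses them.
func (g *gen) expr(e Expr) string {
	switch e := e.(type) {
	case *IntLit:
		return fmt.Sprint(e.Val)
	case *BinaryExpr:
		x := g.expr(e.X)
		y := g.expr(e.Y)
		g.next++
		dst := fmt.Sprintf("%%%d", g.next)
		g.cur.Insts = append(g.cur.Insts, fmt.Sprintf("%s = add i32 %s, %s", dst, x, y))
		return dst
	default:
		panic(fmt.Sprintf("support for expression %T not yet implemented", e))
	}
}

func main() {
	g := &gen{cur: &BasicBlock{}}
	// (1 + 2) + 3
	res := g.expr(&BinaryExpr{X: &BinaryExpr{X: &IntLit{Val: 1}, Y: &IntLit{Val: 2}}, Y: &IntLit{Val: 3}})
	for _, inst := range g.cur.Insts {
		fmt.Println(inst)
	}
	fmt.Println("result in", res)
}
```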

sangisos commented 8 years ago

I have already started making the little code I had naïvely recursive. Do we have any way to translate uc types to ir types?

mewmew commented 8 years ago

> Do we have any way to translate uc types to ir types?

They should map fairly straightforwardly onto llvm/ir/types. During the translation, we may use something along the lines of map[uctypes.Type]irtypes.Type, which we populate with the corresponding types required for the translation of a given unit. E.g. something along the lines of:

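typeMap := make(map[uctypes.Type]irtypes.Type)
...
switch n := node.(type) {
case *ast.Expr:
   ucType := n.Type()
   irType, ok := typeMap[ucType]
   if !ok {
      // Translate on first use. Plain assignment (=) is used so that the
      // outer irType is set rather than shadowed.
      var err error
      irType, err = translateType(ucType)
      if err != nil {
         return errutil.Err(err)
      }
      // Cache the result for later lookups.
      typeMap[ucType] = irType
   }
}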

func translateType(ucType uctypes.Type) (irtypes.Type, error) {
    switch ucType := ucType.(type) {
    case *uctypes.Basic:
        switch ucType.Kind {
        case uctypes.Char:
            return irtypes.NewInt(8)
        case uctypes.Int:
            // TODO: Assume 32-bit platform?
            return irtypes.NewInt(32)
        default:
            panic(fmt.Sprintf("support for translating basic type kind %v not yet implemented.", ucType.Kind))
        }
    case *uctypes.Array:
        elem, err := translateType(ucType.Elem())
        if err != nil {
            return nil, errutil.Err(err)
        }
        return irtypes.NewArray(elem, ucType.Len())
    default:
        panic(fmt.Sprintf("support for translating type %T not yet implemented.", ucType))
    }
}

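A self-contained variant of the same idea, for experimenting outside the compiler, could look as follows. The UC* and IR* types, the translator struct and its misses counter are stand-ins invented for this sketch; they are not the real uctypes or irtypes APIs, and the 32-bit int follows the TODO above.

```go
package main

import "fmt"

// Stand-ins for the uC types (hypothetical).
type UCType interface{}
type UCBasic struct{ Kind string } // "char" or "int"
type UCArray struct {
	Elem UCType
	Len  int
}

// Stand-ins for the IR types (hypothetical).
type IRType interface{ String() string }
type IRInt struct{ Bits int }
type IRArray struct {
	Elem IRType
	Len  int
}

func (t *IRInt) String() string   { return fmt.Sprintf("i%d", t.Bits) }
func (t *IRArray) String() string { return fmt.Sprintf("[%d x %v]", t.Len, t.Elem) }

// translator caches translated types. Keys are pointers, so caching is by
// identity: two distinct but structurally equal uC types are translated twice.
type translator struct {
	typeMap map[UCType]IRType
	misses  int // number of cache misses, for demonstration only
}

func newTranslator() *translator {
	return &translator{typeMap: make(map[UCType]IRType)}
}

// irType returns the cached IR type for t, translating it on first use.
func (tr *translator) irType(t UCType) (IRType, error) {
	if ir, ok := tr.typeMap[t]; ok {
		return ir, nil
	}
	ir, err := tr.translate(t)
	if err != nil {
		return nil, err
	}
	tr.typeMap[t] = ir
	return ir, nil
}

// translate converts t without consulting the cache for t itself; element
// types go through irType so that they are cached as well.
func (tr *translator) translate(t UCType) (IRType, error) {
	tr.misses++
	switch t := t.(type) {
	case *UCBasic:
		switch t.Kind {
		case "char":
			return &IRInt{Bits: 8}, nil
		case "int":
			return &IRInt{Bits: 32}, nil // assumes a 32-bit int
		}
		return nil, fmt.Errorf("basic type kind %q not supported", t.Kind)
	case *UCArray:
		elem, err := tr.irType(t.Elem)
		if err != nil {
			return nil, err
		}
		return &IRArray{Elem: elem, Len: t.Len}, nil
	}
	return nil, fmt.Errorf("type %T not supported", t)
}

func main() {
	tr := newTranslator()
	arr := &UCArray{Elem: &UCBasic{Kind: "int"}, Len: 10}
	for i := 0; i < 2; i++ {
		ir, err := tr.irType(arr)
		if err != nil {
			panic(err)
		}
		fmt.Println(ir, "misses:", tr.misses)
	}
}
```

Returning errors for unsupported types instead of panicking is a choice made for this sketch only; the snippet above panics for cases that are not yet implemented.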
mewmew commented 8 years ago

The LLVM IR library is feature complete with regard to the requirements of the uC compiler as of llir/llvm@9af80ed60e023b18309c4c67033119932519adbe. Closing this issue.