llir / llvm

Library for interacting with LLVM IR in pure Go.
https://llir.github.io/document/
BSD Zero Clause License

irutil: add stdlib package for C standard library declarations #188

Open mewmew opened 3 years ago

mewmew commented 3 years ago

As suggested in https://github.com/llir/llvm/pull/187#issuecomment-860148771:

> I would say the biggest difficulty was the C standard library - it would be super cool to have a stdlib package, with bindings to the standard library. In the compiler/builtins.go file I wrote out the function signatures for a bit of the stdlib, but it would be awesome to automatically have bindings to the whole C stdlib.

Add a new irutil/stdlib package containing function (and global variable) declarations for interacting with the C standard library.

We should consider generating these automatically, perhaps by using the LLVM compiler to parse the C standard library headers and emit LLVM IR function (and global variable) declarations.

Then, we could parse the LLVM IR output using llir/llvm/asm to get the llir/llvm/ir representation to interact with.

This will require some experimentation to find an approach that works well and is easy to work with.

Edit: related issues #22, #178, #180.

dannypsnl commented 3 years ago

@mewmew Would you like to create the issue in llir/irutil?

dannypsnl commented 3 years ago

Maybe we will also need LLVM built-in variables/functions? The C API is a bit dangerous: its symbols are added by the linker, so they will not always be there.

mewmew commented 3 years ago

> @mewmew Would you like to create the issue in llir/irutil?

Let's keep all issues in one tracker for now (in llir/llvm/issues). It's easier to get an overview of the entire llir project that way.

> Maybe we will also need LLVM built-in variables/functions? The C API is a bit dangerous: its symbols are added by the linker, so they will not always be there.

We can add two packages: irutil/stdlibc for C function declarations (e.g. printf) and irutil/intrinsic for official LLVM intrinsic functions (e.g. @llvm.pow.f32).

Nv7-GitHub commented 3 years ago

I have implemented some of the most used stdlibc functions at https://github.com/Nv7-GitHub/bpp/blob/ff2d32542a2b493cc5eaa7e75f349371d9e99111/old/compiler/builtins.go#L46 if that would be any help

mewmew commented 3 years ago

> I have implemented some of the most used stdlibc functions at https://github.com/Nv7-GitHub/bpp/blob/ff2d32542a2b493cc5eaa7e75f349371d9e99111/old/compiler/builtins.go#L46 if that would be any help

Definitely helpful. I think we can evaluate a few different approaches before settling on the one to use. I would also like us to consider the maintenance of stdlibc as those headers are updated and, e.g., new functions are added. It may be possible to automatically generate LLVM IR code by parsing the standard header files. Or perhaps that would be a crazy idea. Still, it is worth investigating, as that would help with maintenance as well.

Cheers, Robin

Nv7-GitHub commented 3 years ago

Can clang produce LLVM from headers?

mewmew commented 3 years ago

> Can clang produce LLVM from headers?

Yes, at least when using a tiny dummy C file that includes the headers.

Input (dummy) C file:

#include <stdio.h>
#include <string.h>

void *foo1 = printf;
void *foo2 = memcpy;

Run:

clang -S -emit-llvm -o a.ll a.c

Output LLVM IR:

; ModuleID = 'a.c'
source_filename = "a.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

@foo1 = dso_local global i8* bitcast (i32 (i8*, ...)* @printf to i8*), align 8
@foo2 = dso_local global i8* bitcast (i8* (i8*, i8*, i64)* @memcpy to i8*), align 8

declare i32 @printf(i8*, ...) #0

; Function Attrs: nounwind
declare i8* @memcpy(i8*, i8*, i64) #1

attributes #0 = { "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0, !1, !2}
!llvm.ident = !{!3}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"PIC Level", i32 2}
!2 = !{i32 7, !"PIE Level", i32 2}
!3 = !{!"clang version 12.0.1"}

The output LLVM IR contains the function declarations for memcpy and printf.

Nv7-GitHub commented 3 years ago

Perhaps we could get a list of stdlib functions and apply this approach to all of them. That would give us the full set of function declarations, which we could then parse with this library.

But how would the API look? A map of function name to function? A bunch of global variables with the functions? A function to get a stdlib fn by name?

mewmew commented 3 years ago

> But how would the API look? A map of function name to function? A bunch of global variables with the functions? A function to get a stdlib fn by name?

Good question. I'm not quite sure what the best API would be. How would you envision using it, @Nv7-GitHub? Perhaps that could help guide the API design.

Cheers, Robin

dannypsnl commented 3 years ago

I'm pretty sure a map of name to function would be the best; we could simply modify Module to do so.
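To make the API options from this thread concrete, here is a small sketch of the map-based variant with a lookup function on top. Everything in it is hypothetical: `Func` is a plain stand-in for whatever function type the package would expose, and `decls` and `Lookup` are invented names, not existing API. The signatures are copied from the clang output earlier in the thread.

```go
package main

import "fmt"

// Func is a hypothetical stand-in for a function declaration, kept as plain
// strings so that the sketch has no dependencies.
type Func struct {
	Name string
	Sig  string // LLVM function type, written as text
}

// decls is the "map of function name to function" option.
var decls = map[string]*Func{
	"printf": {Name: "printf", Sig: "i32 (i8*, ...)"},
	"memcpy": {Name: "memcpy", Sig: "i8* (i8*, i8*, i64)"},
}

// Lookup is the "function to get a stdlib fn by name" option, layered on the
// map. It reports false for names that have no declaration.
func Lookup(name string) (*Func, bool) {
	f, ok := decls[name]
	return f, ok
}

func main() {
	if f, ok := Lookup("printf"); ok {
		fmt.Printf("%s: %s\n", f.Name, f.Sig)
	}
	if _, ok := Lookup("strlcpy"); !ok {
		fmt.Println("strlcpy: not declared")
	}
}
```

One possible argument for the lookup function over exposing the map directly is that callers cannot mutate the shared table by accident, though that trade-off is a design guess rather than something settled here.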