nlvm (the nim-level virtual machine?) is an LLVM-based compiler for the Nim programming language.
From Nim's point of view, it's a backend just like C or JavaScript - from LLVM's point of view, it's a language frontend that emits IR.
Questions, patches, improvement suggestions and reviews welcome. When you find bugs, feel free to fix them as well :)
Fork and enjoy!
Jacek Sieka (arnetheduck on gmail point com)
nlvm works as a drop-in replacement for nim with the following notable differences:
- no intermediate C compiler step
- gdb/lldb debug information with source stepping, type information etc
- wasm32 support with no extra tooling
- integrated linker (lld)
- just-in-time execution (nlvm r) using the LLVM ORCv2 JIT
Most things from nim work just fine (see the porting guide below!):
- the command line is mostly compatible (switch from nim to nlvm!)
- C header files are not used - the declaration in the .nim file needs to be accurate
Test coverage is not too bad either:
How you could contribute:
- improve platform support (osx and windows should be easy, arm would be nice)
- help nlvm generate better IR - optimizations, builtins, exception handling..
- help make more code nlvm-compatible
nlvm does not:
- generate C - as a consequence, header, emit and similar pragmas will not work - neither will the fancy importcpp/C++ features - see the porting guide below!
To do what I do, you will need:
- a C compiler (gcc most of the time)
Start with a clone:
cd $SRC
git clone https://github.com/arnetheduck/nlvm.git
cd nlvm && git submodule update --init
We will need a few development libraries installed, mainly due to how nlvm
processes library dependencies (see dynlib section below):
# Fedora
sudo dnf install pcre-devel openssl-devel sqlite-devel ninja-build cmake
# Debian, ubuntu etc
sudo apt-get install libpcre3-dev libssl-dev libsqlite3-dev ninja-build cmake
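Before starting the build, it can help to confirm that the basic tools are on the PATH. The following is a minimal sketch, not part of nlvm: `missing_tools` is a hypothetical helper, and the tool names passed to it are assumptions based on the packages above (the ninja binary may be called `ninja-build` on some distributions).

```shell
# Print each of the given commands that cannot be found on PATH.
# missing_tools is a hypothetical helper, not something nlvm ships.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed tool names - adjust to your distribution.
missing=$(missing_tools git make cmake ninja "${CC:-cc}")
if [ -n "$missing" ]; then
  echo "missing tools:" $missing
else
  echo "all build tools found"
fi
```

This only checks for executables - the development libraries themselves still have to be installed with the package manager.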
Compile nlvm (if needed, this will also build nim and llvm):
make
Compile with itself and compare:
make compare
Run test suite:
make test
make stats
You can link statically to LLVM to create a stand-alone binary - this will use a more optimized version of LLVM as well, but takes longer to build:
make STATIC_LLVM=1
If you want a faster nlvm, you can also try the release build - it will be called nlvmr:
make STATIC_LLVM=1 nlvmr
When you update nlvm from git, don't forget the submodule:
git pull && git submodule update
To build a docker image, use:
make docker
To run the built nlvm docker image, use:
docker run -v $(pwd):/code/ nlvm c -r /code/test.nim
On the command line, nlvm is mostly compatible with nim.
When compiling, nlvm will generate a single .o file with all code from your project and link it using $CC - this helps it pick the right flags for linking with the C library.
cd $SRC/nlvm/Nim/examples
../../nlvm/nlvm c fizzbuzz
If you want to see the generated LLVM IR, use the -c option:
cd $SRC/nlvm/Nim/examples
../../nlvm/nlvm c -c fizzbuzz
less fizzbuzz.ll
You can then run the LLVM optimizer on it:
opt -Os fizzbuzz.ll | llvm-dis
... or compile it to assembly (.s):
llc fizzbuzz.ll
less fizzbuzz.s
Apart from the code of your .nim files, the compiler will also mix in the compatibility library found in nlvm-lib/.
Generally, the nim compiler pipeline looks something like this:
nim --> c files --> IR --> object files --> linker --> executable
In nlvm, we remove one step and bunch all the code together:
nim --> single IR file --> built-in LTO linker --> executable
Going straight to the IR means it's possible to express nim constructs more clearly, allowing llvm to understand the code better and thus do a better job at optimization. It also helps keep compile times down, because the c-to-IR step can be avoided.
The practical effect of generating a single object file is similar to clang -fwhole-program -flto - it is a bit more expensive in terms of memory, but results in slightly smaller and faster binaries. Notably, the IR-to-machine-code step, including any optimizations, is repeated in full for each recompile.
nim uses a runtime dynamic library loading scheme to gain access to shared libraries. When compiling, no linking is done - instead, when running your application, nim will try to open anything the user has installed.
nlvm does not support the {.dynlib.} pragma - instead, use {.passL.} to link the library through normal system linking.
# works with `nim`
proc f() {. importc, dynlib: "mylib" .}
# works with both `nim` and `nlvm`
{.passL: "-lmylib".}
proc f() {. importc .}
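When porting many dynlib declarations, the library name has to be turned into a linker flag by hand. The sketch below shows one way to derive the passL pragma from a Linux-style shared library name. `dynlib_to_passl` is a hypothetical helper, and it assumes the usual `lib<name>.so<version>` naming - libraries that do not follow that convention need manual attention.

```shell
# Hypothetical helper: turn a shared library name into a passL pragma.
# Assumes the conventional Linux naming scheme lib<name>.so<version>.
dynlib_to_passl() {
  name=${1##*/}        # drop any directory part
  name=${name#lib}     # drop the conventional "lib" prefix
  name=${name%%.so*}   # drop ".so" and any version suffix
  printf '{.passL: "-l%s".}\n' "$name"
}

dynlib_to_passl libsqlite3.so.0
```

The printed pragma can then be pasted next to the importc declaration in place of the dynlib argument.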
When nim compiles code, it will generate c code which may include other c code, from headers or directly via emit statements. This means nim has direct access to symbols declared in the c file, which can be both a feature and a problem.
In nlvm, {.header.} directives are ignored - nlvm looks strictly at the signature of the declaration, meaning the declaration must exactly match the c header file or subtle ABI issues and crashes ensue!
# When `nim` encounters this, it will emit `jmp_buf` in the `c` code without
# knowing the true size of the type, letting the `c` compiler determine it
# instead.
type C_JmpBuf {.importc: "jmp_buf", header: "<setjmp.h>".} = object
# nlvm instead ignores the `header` directive completely and will use the
# declaration as written. Failure to correctly declare the type will result
# in crashes and subtle bugs - memory will be overwritten or fields will be
# read from the wrong offsets.
#
# The following works with both `nim` and `nlvm`, but requires you to be
# careful to match the binary size and layout exactly (note how `bycopy`
# sometimes helps to further nail down the ABI):
when defined(linux) and defined(amd64):
  type
    C_JmpBuf {.importc: "jmp_buf", bycopy.} = object
      abi: array[200 div sizeof(clong), clong]
# In `nim`, `C` constant defines are often imported using the following trick,
# which makes `nim` emit the right `C` code so that the value from the header
# can be read (no writing of course, even though it's a `var`!)
#
# assuming a c header with: `#define RTLD_NOW 2`
# works for nim:
var RTLD_NOW* {.importc: "RTLD_NOW", header: "<dlfcn.h>".}: cint
# both nlvm and nim (note how these values often can be platform-specific):
when defined(linux) and defined(amd64):
  const RTLD_NOW* = cint(2)
To deal with emit, the recommendation is to put the emitted code in a C file and {.compile.} it.
proc myEmittedFunction() {.importc.}
{.compile: "myemits.c".}
/* myemits.c */
void myEmittedFunction() {
  /* ... */
}
Similar to {.emit.}, {.asm.} functions must be moved to a separate file and included in the compilation with {.compile.} - this works both with .S and .c files.
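To get an overview of how much porting work a project needs, the pragmas discussed above can be located with a plain text search. This is a rough sketch only: `find_unported_pragmas` is a hypothetical helper, the word list is taken from this guide, and a word match will also report comments and unrelated identifiers. It relies on the `-w` option, which GNU and BSD grep provide.

```shell
# Hypothetical helper: list lines in .nim files under a directory that
# mention pragmas or statements this guide says need attention under nlvm.
# Usage: find_unported_pragmas path/to/project
find_unported_pragmas() {
  dir=${1:-.}
  # /dev/null makes grep print file names even when given a single file;
  # "|| true" keeps "no matches" from being reported as a failure.
  find "$dir" -name '*.nim' -exec \
    grep -nwE 'header|emit|dynlib|importcpp|asm' /dev/null {} + || true
}
```

Each reported line has the form file:line:text, so the output can be fed to an editor or counted with wc -l.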
Use --cpu:wasm32 --os:standalone --gc:none to compile Nim to (barebones) WASM.
You will need to provide a runtime (i.e. WASI) and use manual memory allocation, as the garbage collector hasn't yet been ported to WASM and the Nim standard library lacks WASM / WASI support.
To compile wasm files, you will thus need a panicoverride.nim - a minimal example looks like this and discards any errors:
# panicoverride.nim
proc rawoutput(s: string) = discard
proc panic(s: string) {.noreturn.} = discard
After placing the above code in your project folder, you can compile .nim code to wasm32:
# myfile.nim
proc adder*(v: int): int {.exportc.} =
  v + 4
nlvm c --cpu:wasm32 --os:standalone --gc:none --passl:--no-entry myfile.nim
wasm2wat -l myfile.wasm
Most WASM-compiled code ends up needing WASM extensions - in particular, the bulk memory extension is needed to process data.
Extensions are enabled by passing --passc:-mattr=+feature,+feature2, for example:
nlvm c --cpu:wasm32 --os:standalone --gc:none --passl:--no-entry --passc:-mattr=+bulk-memory
Passing --passc:-mattr=help will print available features (only works while compiling, for now!)
To use functions from the environment (with importc), compile with --passl:-Wl,--allow-undefined.
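Since the wasm32 command line grows quickly, a small wrapper can assemble it. The sketch below only prints the command built from the flags shown above, it does not run nlvm. `nlvm_wasm_cmd` is a hypothetical helper, and the feature names are passed through unchecked.

```shell
# Hypothetical helper: print the nlvm command line for a wasm32 build.
# Usage: nlvm_wasm_cmd FILE [FEATURE...]
nlvm_wasm_cmd() {
  file=$1
  shift
  cmd="nlvm c --cpu:wasm32 --os:standalone --gc:none --passl:--no-entry"
  if [ "$#" -gt 0 ]; then
    attrs=""
    for feature in "$@"; do
      # Join features as +a,+b for -mattr.
      attrs="${attrs:+$attrs,}+$feature"
    done
    cmd="$cmd --passc:-mattr=$attrs"
  fi
  printf '%s %s\n' "$cmd" "$file"
}

nlvm_wasm_cmd myfile.nim bulk-memory
```

Printing instead of executing keeps the helper easy to inspect - pipe the output to sh once the command looks right.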
nlvm supports directly running Nim code using just-in-time compilation:
# Compile and run `myfile.nim` without creating a binary first
nlvm r myfile.nim
This mode can also be used to run code directly from the standard input:
$ nlvm r
.......................................................
>>> log2(100.0)
stdin(1, 1) Error: undeclared identifier: 'log2'
candidates (edit distance, scope distance); see '--spellSuggest':
(2, 2): 'low' [proc declared in /home/arnetheduck/src/nlvm/Nim/lib/system.nim(1595, 6)]
...
>>> import math
.....
>>> log2(100.0)
6.643856189774724: float64