skx / assembler

Basic X86-64 assembler, written in golang
GNU General Public License v2.0


Assembler

This repository contains a VERY BASIC x86-64 assembler, which is capable of reading assembly-language input and generating a statically linked ELF binary as output.

It is more a proof-of-concept than a useful assembler, but I hope to take it to the state where it can compile the kind of x86-64 assembly I produce in some of my other projects.

Currently the assembler will generate a binary which looks like this:

$ file a.out
a.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV)
       statically linked, no section header
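
The fields that file reports come from the first few bytes of the ELF header. As a rough illustration, here is a minimal Go sketch that decodes them by hand. The describeELF helper and the in-memory sample header are assumptions made for this example and are not part of this repository; in practice you would read the first bytes of a.out instead.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// describeELF is a hypothetical helper (not part of this repository).
// It decodes the handful of ELF header fields that file(1) reports.
func describeELF(hdr []byte) (string, error) {
	if len(hdr) < 20 {
		return "", errors.New("header too short")
	}
	if hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F' {
		return "", errors.New("missing ELF magic")
	}

	class := "32-bit"
	if hdr[4] == 2 { // EI_CLASS: 2 means ELFCLASS64
		class = "64-bit"
	}

	// EI_DATA: 1 means little-endian (LSB), 2 means big-endian (MSB).
	order, byteOrder := "MSB", binary.ByteOrder(binary.BigEndian)
	if hdr[5] == 1 {
		order, byteOrder = "LSB", binary.LittleEndian
	}

	// e_type lives at offset 16, e_machine at offset 18.
	kind := fmt.Sprintf("type %d", byteOrder.Uint16(hdr[16:]))
	if byteOrder.Uint16(hdr[16:]) == 2 { // ET_EXEC
		kind = "executable"
	}
	machine := fmt.Sprintf("machine %d", byteOrder.Uint16(hdr[18:]))
	if byteOrder.Uint16(hdr[18:]) == 62 { // EM_X86_64
		machine = "x86-64"
	}

	return fmt.Sprintf("ELF %s %s %s, %s", class, order, kind, machine), nil
}

// sampleHeader builds the start of a 64-bit little-endian x86-64
// executable header in memory, so the example runs without a.out.
func sampleHeader() []byte {
	hdr := make([]byte, 20)
	copy(hdr, []byte{0x7f, 'E', 'L', 'F', 2, 1, 1, 0})
	binary.LittleEndian.PutUint16(hdr[16:], 2)  // ET_EXEC
	binary.LittleEndian.PutUint16(hdr[18:], 62) // EM_X86_64
	return hdr
}

func main() {
	desc, err := describeELF(sampleHeader())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(desc) // ELF 64-bit LSB executable, x86-64
}
```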

Why? I've written a couple of toy projects that generate assembly-language programs and then pass them through an assembler.

The code in this repository was born out of experimenting with generating an ELF binary directly, which was a necessary learning process.

Limitations

We don't support anywhere near the complete instruction-set which an assembly language programmer would expect. Currently we support only things like this:

Note that we really only support the 64-bit registers, which means rax is supported but eax, ax, ah, and al are specifically not supported:

There is some support for the extended registers r8-r15, but this varies on a per-instruction basis and should not be relied upon.
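
One reason extended-register support can differ between instructions is that r8-r15 need extra bits in the REX prefix, so every instruction encoder has to handle them explicitly. The following Go sketch shows the standard x86-64 encoding for register-to-register instructions of the form "op r/m64, r64". The encodeRegReg helper and the regNum table are hypothetical names chosen for this example; they are not this repository's API.

```go
package main

import "fmt"

// regNum maps 64-bit register names to their hardware numbers.
// (Hypothetical table for this sketch, not the repository's own.)
var regNum = map[string]byte{
	"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
	"rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7,
	"r8": 8, "r9": 9, "r10": 10, "r11": 11,
	"r12": 12, "r13": 13, "r14": 14, "r15": 15,
}

// encodeRegReg is a hypothetical helper that encodes "op dst, src" for
// opcodes of the form "op r/m64, r64" (0x01 add, 0x31 xor, 0x89 mov).
func encodeRegReg(opcode byte, dst, src string) ([]byte, error) {
	d, ok := regNum[dst]
	if !ok {
		return nil, fmt.Errorf("unsupported register %q", dst)
	}
	s, ok := regNum[src]
	if !ok {
		return nil, fmt.Errorf("unsupported register %q", src)
	}

	rex := byte(0x48)    // REX.W: 64-bit operand size
	rex |= (s >> 3) << 2 // REX.R: high bit of the source register
	rex |= d >> 3        // REX.B: high bit of the destination register

	// mod=11 selects register-direct addressing; the low three bits of
	// each register number fill the reg and r/m fields.
	modrm := 0xC0 | (s&7)<<3 | (d & 7)

	return []byte{rex, opcode, modrm}, nil
}

func main() {
	b, _ := encodeRegReg(0x01, "rbx", "rcx") // add rbx, rcx
	fmt.Printf("% x\n", b)                   // 48 01 cb

	b, _ = encodeRegReg(0x01, "r8", "rcx") // add r8, rcx
	fmt.Printf("% x\n", b)                 // 49 01 c8
}
```

Only the REX byte changes between the two outputs, which is why an encoder that hard-codes 0x48 works for rax-rdi but silently breaks for r8-r15.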

There is support for storing fixed data within our program, and locating it. See hello.asm for an example.

We also have some other (obvious) limitations:

Installation

If you have this repository cloned locally you can build the assembler like so:

cd cmd/assembler
go build .
go install .

If you wish to fetch and install via your existing toolchain:

go get -u github.com/skx/assembler/cmd/assembler

You can repeat for the other commands if you wish:

go get -u github.com/skx/assembler/cmd/lexer
go get -u github.com/skx/assembler/cmd/parser

Of course these binary names are very generic, so it is perhaps better to work locally!

Example Usage

Build the assembler:

 $ cd cmd/assembler
 $ go build .

Compile the sample program, and execute it showing the return-code:

 $ cmd/assembler/assembler test.asm && ./a.out ; echo $?
 9

Or run the hello.asm example:

 $ cmd/assembler/assembler hello.asm && ./a.out
 Hello, world
 Goodbye, world

You'll note that the \n character was correctly expanded into a newline.
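
For illustration, escape expansion of this kind can be done in a single pass over the string literal. This is a minimal Go sketch under stated assumptions: expandEscapes is a hypothetical helper, and the set of escapes it handles (\n, \t, and \\) is a guess. Only \n is confirmed by the example above, and the repository's lexer may well do this differently.

```go
package main

import (
	"fmt"
	"strings"
)

// expandEscapes is a hypothetical helper (not the repository's code).
// It rewrites backslash escapes in s into the characters they stand for.
func expandEscapes(s string) string {
	var out strings.Builder
	for i := 0; i < len(s); i++ {
		// Copy ordinary bytes, and a trailing lone backslash, unchanged.
		if s[i] != '\\' || i+1 == len(s) {
			out.WriteByte(s[i])
			continue
		}
		i++ // look at the character after the backslash
		switch s[i] {
		case 'n':
			out.WriteByte('\n')
		case 't':
			out.WriteByte('\t')
		case '\\':
			out.WriteByte('\\')
		default:
			// Unknown escape: keep it verbatim.
			out.WriteByte('\\')
			out.WriteByte(s[i])
		}
	}
	return out.String()
}

func main() {
	// The raw string holds a backslash followed by 'n', as in the source file.
	fmt.Print(expandEscapes(`Hello, world\n`))
}
```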

Internals

The core of our code consists of a small number of simple packages:

In addition to the package modules we also have a couple of binaries:

These commands, located beneath cmd, each operate the same way: each takes a single argument, which is a file containing assembly-language instructions.

For example here is how you'd build and test the parser:

$ cd cmd/parser
$ go build .
$ ./parser ../../test.asm
&{{INSTRUCTION xor} [{REGISTER rax} {REGISTER rax}]}
&{{INSTRUCTION inc} [{REGISTER rax}]}
&{{INSTRUCTION mov} [{REGISTER rbx} {NUMBER 0x0000}]}
&{{INSTRUCTION mov} [{REGISTER rcx} {NUMBER 0x0007}]}
&{{INSTRUCTION add} [{REGISTER rbx} {REGISTER rcx}]}
&{{INSTRUCTION mov} [{REGISTER rcx} {NUMBER 0x0002}]}
&{{INSTRUCTION add} [{REGISTER rbx} {REGISTER rcx}]}
&{{INSTRUCTION int} [{NUMBER 0x80}]}
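
The parsed instructions above also explain the return code of 9 shown in the usage example. On Linux, int 0x80 uses the 32-bit system-call convention, where rax=1 selects exit and rbx carries the exit status. The Go sketch below traces the register values by hand. The op type and the run function are hypothetical and exist only for this walkthrough; they are unrelated to the repository's parser types.

```go
package main

import "fmt"

// op is a hypothetical, simplified instruction: a mnemonic, a destination
// register, and either a source register or an immediate value.
type op struct {
	name string
	dst  string
	src  string // source register; empty when imm is used
	imm  uint64
}

// sampleProgram mirrors the parser output for test.asm, leaving out the
// final "int 0x80" which hands control to the kernel.
func sampleProgram() []op {
	return []op{
		{name: "xor", dst: "rax", src: "rax"},
		{name: "inc", dst: "rax"},
		{name: "mov", dst: "rbx", imm: 0x0000},
		{name: "mov", dst: "rcx", imm: 0x0007},
		{name: "add", dst: "rbx", src: "rcx"},
		{name: "mov", dst: "rcx", imm: 0x0002},
		{name: "add", dst: "rbx", src: "rcx"},
	}
}

// run executes the ops over a register file and returns the final values.
func run(prog []op) map[string]uint64 {
	regs := map[string]uint64{}
	for _, o := range prog {
		val := o.imm
		if o.src != "" {
			val = regs[o.src]
		}
		switch o.name {
		case "xor":
			regs[o.dst] ^= val
		case "inc":
			regs[o.dst]++
		case "mov":
			regs[o.dst] = val
		case "add":
			regs[o.dst] += val
		}
	}
	return regs
}

func main() {
	regs := run(sampleProgram())
	// rax=1 is the exit system call; rbx=9 becomes the exit status.
	fmt.Printf("rax=%d rbx=%d\n", regs["rax"], regs["rbx"]) // rax=1 rbx=9
}
```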

Adding New Instructions

This is how you might add a new instruction to the assembler, for example jmp 0x00000 or some similar instruction:
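
Whatever the exact steps, a new instruction eventually needs a function that turns its operands into bytes. As a rough sketch of that last step, here is how a near jump could be encoded in Go using the standard x86-64 form: opcode 0xE9 followed by a 32-bit displacement. The encodeJmp helper and its parameters are hypothetical and do not reflect how this repository structures its code generation.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodeJmp is a hypothetical helper (not part of this repository).
// It encodes a near jump placed at address 'at' that lands on 'target'.
func encodeJmp(at, target uint64) ([]byte, error) {
	const size = 5 // one opcode byte plus a 32-bit displacement

	// The displacement is relative to the address of the NEXT instruction.
	disp := int64(target) - int64(at+size)
	if disp < math.MinInt32 || disp > math.MaxInt32 {
		return nil, fmt.Errorf("target %#x out of range for a near jump", target)
	}

	out := make([]byte, size)
	out[0] = 0xE9
	binary.LittleEndian.PutUint32(out[1:], uint32(int32(disp)))
	return out, nil
}

func main() {
	// Jump from 0x4000b0 forward to 0x4000c0.
	b, err := encodeJmp(0x4000b0, 0x4000c0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("% x\n", b) // e9 0b 00 00 00
}
```

Because the displacement depends on where the instruction sits, an assembler has to know the final address of both the jump and its target before it can emit these bytes.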

Debugging Generated Binaries

Launch the binary under gdb:

$ gdb ./a.out

Start it:

(gdb) starti
Starting program: /home/skx/Repos/github.com/skx/assembler/a.out

Program stopped.
0x00000000004000b0 in ?? ()

Disassemble:

(gdb) x/5i $pc

Or show string-contents at an address:

(gdb) x/s 0x400000

Bugs?

Feel free to report them. This is more a proof of concept than a robust tool, so bugs are to be expected.

Specifically we're missing support for many instructions, but I hope the code generated for those that are present is correct.

Steve