mobarski / morty

Morty programming language, Morty virtual machine and MortyVM assembler
GNU General Public License v3.0
2 stars 1 forks source link
concatenative language stack-machine virtual-machine

Morty

TODO

Status

This is pre-alpha work in progress!

Status:

The system is now able to compile VM ASM and to execute it. It's ready for running experiments "how feature X should be compiled and how it affects the performance".

First benchmarks are looking good - VM requires ~5 CPU instructions per VM instruction.

Index

Idea

Why:

Target usage:

Design principles:

Target VM implementations:

Morty Philosophy

  1. Start from scratch.
  2. Add parts one by one observing how they interact.
  3. Remove parts that can be cheaply built using other parts.
  4. Keep parts that are expensive to build but make building other parts easier.
  5. A state should be achieved where removing any single part should have drastic consequences.
  6. This is the system's core.
  7. Now add parts that will make the system more friendly / usable / performant.
  8. Try to cover 80% of requirements with 20% of features.

Morty Programming Language

Morty language is similar to Morty VM instruction set. Almost all VM instructions can be used directly.

In addition the language provides some high-level mechanisms:

Language Examples

def fib (n--f) :n
    n 2 lt then 1 ret do
    n 2 sub fib (f2)
    n 1 sub fib (f2 f1) add (f)
end
def total-energy (m v h -- e) :h :v :m
    m v kinetic-energy   (ek)
    m h potential-energy (ek ep) add
end

def kinetic_energy (m v -- e)
    dup (m v v) mul mul
end

def potential-energy (m h -- e)
    g (m h g) mul mul
end

The Great Win32 Computer Language Shootout:

Other:

Comments

Morty has three kinds of comments:

Named local variables

Morty allows capturing values into named local variables. To capture a value prefix target variable name with a colon (ie :my-var). Use the variable name to push its value onto the stack.

( y = a*x*x + b*x + c )

def poly2 (x a b c -- y) :c :b :a :x (capture params)
    x x mul a mul (axx)
    x b mul (axx bx) add
    c add
end

Local variables are stored on the return stack. On function end or early return they will be automatically discarded.

Conditionals

Morty supports only the simplest conditional statement:

distance 10 or-less then collision-warning do

Case requires defining new function:

def age-range (n--) :age
    age 18 or-more  then adult         ret  do
    age 13 or-more  then teen          ret  do
    age  7 or-more  then gradeschooler ret  do
    age  4 or-more  then preschooler   ret  do
    age  1 or-more  then toddler       ret  do
                         baby
end

The support for "else" might be added later.

Loops

There are two counted loop types and one indefinite type.

Counted loops:

( loop 5 times from 0 to 4 )
5 times
    i dot
loop

( loop from 2 to 9 with step 1 )
2 10 1 for
    i dot
loop

Indefinite loops:

begin
    beep
    500 ms sleep
loop

Loops are counted up.
In every loop "break" and "continue" can be used.
Minimum number of loop executions is zero.

Structures

TODO

struct point   .x .y   
struct circle  .r .x .y

def dist-to-circle (p c -- d) :c:circle :p:point 
    p.x c.x sub dup mul
    p.y c.y sub dup mul
    add sqrt               (distance to the center)
    c.r sub
end

Struct definition cannot be longer than 1 line. This is on purpose to keep the structures simple. Field names are prefixed with a dot to make them easier to grep and highlight.

Macros

There will be no macros in Morty.

Lambda functions

TODO

Global variables

TODO

{ 42 } $LAST
def gen-random (max--n)
    LAST get 3877 mul 29573 add 139968 mod LAST set
    (max) LAST get mul 139968 div (n)
end

Arrays

TODO

def .swap ( a x y -- ) :y :x :a
    a[x] .get (X)
    a[y] .get (X Y) a[x] .set
    (X) a[y] .set
end

Strings

TODO

Morty Compiler

TODO

Morty Vitual Machine

VM Architecture

VM:

Instruction encoding

Current:

Other encoding schemes that will be tested:

Instruction Set

Morty VM instructions are divided into core instructions and extensions. Only the core instruction set must be provided by the VM. The rest can be assembled by the compiler from the core instructions. If the extension instruction is available in the VM the compiler should use it for better performance. Translation from the extended set into the core set is done via simple expansion (ie "neg" -> "0 swap sub"). Translation from the core set into extended set is done by peephole optimization (ie "0 swap sub" -> "neg").

branching

name effect morty core info
jz X (v--) jz @label yes set I to X (next instruction cell) if v==0
goto X (--) yes set I to X (next instruction cell)
call X (--)(=fr) x yes call procedure at X (next instruction cell)
ret (fr*=) yes return from procedure call
qcall (a--) call ??? quick call to address from the stack without changing the frame pointer
qret (r=) ??? quick return without changing the frame pointer

stack manipulation

asm effect morty core info
push X (--x) x yes push X onto the stack
pusha X (--x) x yes push X onto the stack (X is cell address)
stor (a--)(=a) >R yes pop top of data stack onto the return stack
rtos (--a)(a=) R> yes pop top of return stack onto data stack
rget X (--a) yes get loop variable X
dup (a--aa) yes duplicate top item
drop (a--) yes drop top item
swap (ab--ba) yes swap two top items
over (ab--aba) yes
rot (abc--bca) yes rotate three items

memory access

asm effect morty core info
get (a--b) yes get value from memory cell a
set (va--) yes set memory cell a to value v
allot (n--a) yes allot n cells on the heap and return address to the first allocated cell
vget X (--n) x yes get local variable X
vset X (n--) :x yes set local variable X
gget X (--n) x yes get global variable X
gset X (n--) ??? yes set global variable X
rget X (--n) yes get loop variable X
geti X (a--b) N/A get value from memory cell a+X
seti X (va--) N/A set memory cell a+X to value v

primitive output

asm effect morty core info
emit (c--) ??? print single character
dot (n--) print number from top of the stack and one space
echo (w--) print word from top of the stack

Word is a short text (0-6 characters) encoded as an integer value. Words are intended mainly for VMs without propper string support. Each letter in the word is encoded on 5 bits. Only uppercase letters are available, terminator and 5 selected characters (TBD).

debugging

asm effect morty core info
vminfo (--) ??? print information about VM registers and show time in ms since last vminfo call or start of the program

I/O - virtual devices

I/O is handled via vectored execution (similar to ioctl, fcntl).

asm effect morty core info
ioget (dk--v) yes get value (v) for the key (k) from the device (d)
ioset (vdk--) yes set key (k) to value (v) on the device (d)

TODO: name: set vs put TODO: name: io vs dev TODO: api: dk vs kd

ALU - arithmetic

asm effect morty core info
add (ab--c) yes a + b
sub (ab--c) yes a - b
mul (ab--c) yes a * b
div (ab--c) yes a / b
muldiv (abc--d) ??? a * b / c
mulshr (abc--d) ??? a * b >> d
mod (ab--c) yes a % b
neg (a--b) yes -a
abs (a--b) yes -a if a<0 else a

ALU - bits

asm effect morty core info
and (ab--c) yes a & b
or (ab--c) yes a | b
xor (ab--c) yes a ^ b
inv (a--b) yes ~a (binary inversion)
shl (ab--c) yes a << b
shr (ab--c) yes a >> b
ushr (ab--c) unsigned shift right

ALU - comparators

asm effect morty core info
nz (a--x) bool ??? 1 if a!=0 else 0
eqz (a--x) 0= ??? 1 if a==0 else 0
gtz (a--x) 0> ??? 1 if a>0 else 0
ltz (a--x) 0< ??? 1 if a<0 else 0
eq (ab--x) == yes 1 if a == b else 0
ne (ab--x) != yes 1 if a != b else 0
lt (ab--x) below yes 1 if a < b else 0
gt (ab--x) above yes 1 if a > b else 0
le (ab--x) or-less yes 1 if a <= b else 0
ge (ab--x) or-more yes 1 if a >= b else 0
xeq (ab--abx) 1 if a == b else 0
xne (ab--abx) 1 if a != b else 0
xle (ab--abx) 1 if a <= b else 0
xge (ab--abx) 1 if a >= b else 0
xlt (ab--abx) 1 if a < b else 0
xgt (ab--abx) 1 if a > b else 0
min (ab--x) yes a if a < b else b
max (ab--x) yes a if a > b else b
pick (abc--x) yes a if c != 0 else b

Morty VM Assembler

Assembler

TODO

Instructions consist of operation (op) and argument (arg). All instruction must contain op and arg separated by the dot (.). Labels end with the colon (:). Label's addresses are referenced with the at (@) character.

Stack based labels:

    goto.@start
    inc:
        push.1 add.0 ret.0
    start:
        push.2 call.@inc
    halt.0

References

Title URL
Generic stack machine https://users.ece.cmu.edu/~koopman/stack_computers/sec3_2.html
The Pancake stack machine https://people.ece.cornell.edu/land/courses/ece5760/DE2/Stack_cpu.html
Permacomputing http://viznut.fi/texts-en/permacomputing.html
Starting Forth https://www.forth.com/starting-forth/
P-code machine https://en.wikipedia.org/wiki/P-code_machine
Stack machine https://en.wikipedia.org/wiki/Stack_machine
Thinking Forth http://thinking-forth.sourceforge.net/thinking-forth-ans.pdf
Fabris https://github.com/mobarski/fabris
UCBLogo https://en.wikipedia.org/wiki/UCBLogo
Uxn https://100r.co/site/uxn.html
Threaded code https://en.wikipedia.org/wiki/Threaded_code
Forth threading http://www.complang.tuwien.ac.at/forth/threading/
Forth threaded code http://www.complang.tuwien.ac.at/forth/threaded-code.html
Interpreters http://realityforge.org/code/virtual-machines/2011/05/19/interpreters.html
Cozy design spaces https://www.lexaloffle.com/bbs/?tid=31634
Another World https://fabiensanglard.net/another_world_polygons/
Mako https://github.com/JohnEarnest/Mako
Blacklight https://github.com/acook/blacklight/wiki
Quark https://github.com/henrystanley/Quark
Array programming https://en.wikipedia.org/wiki/Array_programming
NGA virtual machine https://github.com/crcx/nga/blob/master/Nga.md
Parable (mem slices) https://github.com/crcx/parable
Vaporisation (mem) https://slightknack.dev/blog/vaporization/
Custom Allocators https://slembcke.github.io/2020/10/12/CustomAllocators.html
Global optimizer https://www.researchgate.net/publication/213890050_A_portable_machine-independent_global_optimizer_-_design_and_measurements