corth-lang / Corth

A self-hosted stack based language like Forth
MIT License

Corth

A stack-based programming language that I designed, based on Forth and on Porth from the Porth programming language series on the Tsoding Daily channel. The language is quite similar to Porth, but there are several differences. The compiler was first written in Python, and I then rewrote it in Corth itself, so the language is now self-hosted.

The repo contains the compiler source code in Corth, a precompiled and assembled compiler, the standard library, and examples.

Requirements

How to use the compiler?

To compile a Corth program to an ELF64 executable, cd to the directory that contains the corth executable and run:

./corth compile <file-name> -i ./std/

This will create an executable called output which can be directly run.

The compiler can be bootstrapped using the bootstrap subcommand:

./corth bootstrap ./compiler/ --std ./std/

This will compile the compiler source code and place it at ./corth.

Quick start to the Corth language

First program:

// From ./examples/hello_world.corth

include "linux_x86/stdio.corth"

proc main
  int int -> int
in let argc argv in
  "Hello, World!\n" puts
end 0 end 

Numbers:

// In this document, the top of the stack (where items are pushed and popped) is shown on the right.
// This should make the stack and the 'let' keyword easier to follow.
34        // Stack = { 34 }
0b101001  // Stack = { 34, 41 }
0o205126  // Stack = { 34, 41, 68182 }
0x5729da  // Stack = { 34, 41, 68182, 5712346 }
'a'       // Stack = { 34, 41, 68182, 5712346, 97 }
'\n'      // Stack = { 34, 41, 68182, 5712346, 97, 10 }
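
Python happens to use the same `0b`, `0o`, and `0x` prefixes, so the values in the stack comments above can be cross-checked quickly. This is only a check of the arithmetic, written in Python rather than Corth; it says nothing about how the Corth compiler parses literals, and the name `literals` is just for illustration.

```python
# Cross-check of the literal values shown above, using Python's matching prefixes.
# ord() gives the character code, which is what a Corth character literal pushes
# according to the stack comments.
literals = [34, 0b101001, 0o205126, 0x5729da, ord("a"), ord("\n")]

if __name__ == "__main__":
    print(literals)
```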

Strings:

"This is a string"   // Stack = { 0x648a15, 16 }

"This also is a string.
 In Corth, multi-line strings are supported."  // Stack = { 0x648a15, 16, 0x648a26, 71 }

Arithmetic operators:

34 35 + // Stack = { 69 }
69 27 - // Stack = { 69, 42 }
68 inc  // Stack = { 69, 42, 69 }
43 dec  // Stack = { 69, 42, 69, 42 }
3 23 *  // Stack = { 69, 42, 69, 42, 69 }
85 2 /  // Stack = { 69, 42, 69, 42, 69, 42 }

Include:

include "str.corth"

Stack operations:

1 2 3 // Stack = { 1, 2, 3 }
swp   // Stack = { 1, 3, 2 }
drop  // Stack = { 1, 3 }
dup   // Stack = { 1, 3, 3 }
rot   // Stack = { 3, 3, 1 }
arot  // Stack = { 1, 3, 3 }
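
The trace above can be replayed with a small Python model, which may help when reasoning about longer sequences. This is a sketch of the stack effects only, not how the Corth compiler implements them. The function names mirror the Corth words, and `rot` is assumed to follow the Forth convention (a b c becomes b c a), which is consistent with the trace.

```python
# Python model of the stack words shown above; the top of the stack is the end of the list.
# Assumption: rot maps (a b c) to (b c a) and arot maps (a b c) to (c a b).

def swp(s):
    # Exchange the two newest items.
    s[-1], s[-2] = s[-2], s[-1]

def drop(s):
    # Discard the newest item.
    s.pop()

def dup(s):
    # Copy the newest item.
    s.append(s[-1])

def rot(s):
    # Move the third-newest item to the top.
    s.append(s.pop(-3))

def arot(s):
    # Move the newest item below the next two (inverse of rot).
    s.insert(-2, s.pop())

def run_trace():
    # Replays: 1 2 3 swp drop dup rot arot, recording the stack after each word.
    stack = [1, 2, 3]
    trace = [list(stack)]
    for word in (swp, drop, dup, rot, arot):
        word(stack)
        trace.append(list(stack))
    return trace

if __name__ == "__main__":
    for state in run_trace():
        print(state)
```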

I/O:

"Hello, world!\n" puts                      // Prints 'Hello, world!' to the standard output.
34 35 + eputu " is a nice number.\n" eputs  // Prints '69 is a nice number.' to the standard error.

Procedures:

proc arithmetic-average  // The name of the procedure is 'arithmetic-average'.
  // Procedure takes two integers as arguments, and returns a single integer.
  // The leftmost type is the oldest item in the stack.
  int int -> int
in
  // This is where the code is located.
  + 2 /
end

// This procedure will be run.
proc main
  // Right now, only 'int int -> int' argument layout is allowed for the main procedure.
  int int -> int
in let argc argv in
  "The arithmetic average of 53 and 31 is " puts 53 31 arithmetic-average putu ".\n" puts
end 0 end  // Program exits with exit code 0.

Macros:

macro sayHi 
  // The name of the macro is 'sayHi'. Wherever the compiler sees 'sayHi' in the code, it replaces it with the macro body.
  // Takes a string (the name) from the stack and prints a welcome message.

  "Hi, " puts puts "!\n" puts
endmacro

proc main
  int int -> int
in let argc argv in
  "Josh" 
  sayHi  // The compiler expands this to:
  // "Hi, " puts puts "!\n" puts
end 0 end

Name scopes:

Comments:

// This is a line comment, it can also come after code
34 35 + putu  // Just like that

/*
  This is a block comment, aka a multi-line comment.
  Block comments can span several lines.
*/

Control flow:

include "linux_x86/stdio.corth"

proc main
  int int -> int
in let argc argv in
  2 2 + 5 = if
    "Well, math is broken. Nice.\n" puts // Hopefully won't be printed.
  end

  3 4 > if
    "Your computer has virus\n" puts     // Hopefully won't be printed.
  else
    "Your computer is alright\n" puts    // Hopefully will be printed.
  end

  // Count from 0 to 9.
  "First 10 numbers from 0 are,\n" puts
  0 while dup 10 < do                    // Duplicate the number, and check if it is less than 10.
    dup putu "\n" puts                   // Print the number.
  inc end drop                           // Increase the number.
end 0 end

Let:

 // The rightmost variable collects the newest item in the stack. So x = 3 and y = 4.
 // From now on, x will be replaced with 3 and y will be replaced with 4.
 // If 'let' variables are compile-time constants, the variables will be directly replaced for optimization reasons.
 // Otherwise, the stack values will be stored in local memory and loaded inside the structure.
 3 4 let x y in
   x y * x + y -
 end
 // Stack = { 11 }

Peek:

 // The rightmost variable collects the newest item in the stack. So x = 3 and y = 4.
 // From now on, x will be replaced with 3 and y will be replaced with 4.
 3 4 peek x y in
   x y * x + y -
 end
 // Stack = { 3, 4, 11 }
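
The only difference between 'let' and 'peek' is whether the bound values stay on the stack. The Python sketch below models that difference as described in the comments above and evaluates the body `x y * x + y -` step by step. The helpers `bind`, `binop`, and `body` are hypothetical names for illustration and are not part of Corth.

```python
import operator

# Model of 'let' and 'peek': both bind the newest stack items to names
# (rightmost name = newest item); 'let' pops them, 'peek' leaves them in place.

def bind(stack, names, keep):
    values = stack[-len(names):]
    if not keep:
        del stack[-len(names):]
    return dict(zip(names, values))

def binop(stack, fn):
    # Pop two items and push fn(older, newer), as a stack arithmetic word does.
    newer = stack.pop()
    older = stack.pop()
    stack.append(fn(older, newer))

def body(stack, env):
    # Models the Corth body: x y * x + y -
    stack.extend([env["x"], env["y"]])
    binop(stack, operator.mul)
    stack.append(env["x"])
    binop(stack, operator.add)
    stack.append(env["y"])
    binop(stack, operator.sub)

def with_let():
    stack = [3, 4]
    body(stack, bind(stack, ["x", "y"], keep=False))
    return stack

def with_peek():
    stack = [3, 4]
    body(stack, bind(stack, ["x", "y"], keep=True))
    return stack

if __name__ == "__main__":
    print(with_let())   # 'let' consumed 3 and 4
    print(with_peek())  # 'peek' left 3 and 4 below the result
```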

Static memory management:

include "core/memory.corth"

// This is a global variable.
// The size must be a compile-time constant, because the memory is allocated at compile time.
memory count sizeof(int) end

proc increase -> in
  // Reads the value of 'count', adds one and writes back.
  count @64 inc count !64
end

proc main
  int int -> int
in let argc argv in
  // Set the value of 'count' to 0.
  0 count !64

  // Increase the value of 'count'.
  increase

  // Print the value stored at 'count'. Should print '1'.
  count @64 putu putnl

  memory x sizeof(int) in
  memory y sizeof(int) in // Size of the variable 'x' is equal to the size of an integer (8 bytes). Same with variable 'y'.

    0 x !64
    x @64 puti " is before saving 'x'\n" puts
    420 x !64
    x @64 puti " is after saving 'x'\n" puts

    0 y !64
    y @64 puti " is before saving 'y'\n" puts
    69 y !64
    y @64 puti " is after saving 'y'\n" puts

  end end
end 0 end

Dynamic memory management:

include "dynamic/malloc.corth"

// Assume this is inside a procedure.
100 malloc let buffer in  // Allocate 100 bytes of memory and save the object pointer as a constant named `buffer`.
  // Loop through every byte in `buffer`.
  buffer while dup buffer 100 + < do peek address in
    0x67 address !8  // Set the byte to 0x67.
  end inc end drop

  buffer mfree drop // Deallocate the space.
end

Collections: