FlatAssembler / AECforWebAssembly

A port of ArithmeticExpressionCompiler from x86 to WebAssembly, so that the programs written in the language can run in a browser. The compiler has been rewritten from JavaScript into C++.
https://flatassembler.github.io/AEC_specification
MIT License
32 stars 2 forks source link
emscripten javascript javascript-virtual-machine wasm webassembly

AEC for WebAssembly

Logo

This is my attempt to port the ArithmeticExpressionCompiler language to WebAssembly (the JavaScript bytecode). The compiler has been rewritten from scratch in C++ (the original compiler was written in a combination of C and JavaScript, using the Duktape framework). Right now, it's about as powerful as more powerful than the original compiler (targeting x86 assembly), as it includes support for more data types, including the support for structures. The specification for the ArithmeticExpressionCompiler language is (hopefully) available on my website.

Compiling instructions

More detailed compiling instructions, as well as compiling instructions for the original ArithmeticExpressionCompiler, are available on my website. They are somewhat Linux-specific, but I do not think that is a problem. Anyway, here are some less detailed compiling instructions...

While this project includes a CMAKE build script, this is primarily to make it easier to open this project in various IDEs and for automated testing on GitHub and GitLab. You should be able to compile this project simply as follows:

g++ -o aec AECforWebAssembly.cpp -std=c++11 -O3
#-std=c++11 is only needed for older versions of GCC.

Or with:

clang++ -o aec AECforWebAssembly.cpp -O3

At least some compilers are known to miscompile the program. Namely, GCC 4.8.5 (that comes with recent versions of Oracle Linux) compiles the code successfully, but the program exits with the message that your C++ compiler appears not to support regular expressions. GCC 4.9.2 (that comes with recent versions of Debian) already seems to work. CLANG 10 on Oracle Linux again miscompiles the program, but apparently that happens only on Oracle Linux. Oddly enough, CLANG 9 appears to work fine on Oracle Linux. Honestly, supporting misbehaving C++ compilers makes way less sense than supporting misbehaving browsers. C++ compilers, unlike browsers, are used only by users who can easily install another one. Also, newest versions of GCC, unlike the latest browsers, run on Windows 98 without problems. (UPDATE: I've mitigated that problem by avoiding complicated regular expressions, which some C++ compilers fail to compile. Besides, those complicated regular expressions appear to slow down the program significantly and make it less legible.)

To make sure compiling my compiler is not the problem, I've included some binary files in the releases section on GitHub, and they are also available on SourceForge (but not on GitLab). There is also a WebAssembly file compiled with Emscripten, which should work basically wherever NodeJS works.

Usage instructions

Linux:

./aec name_of_the_aec_program.aec
#If everything is fine, it should produce WebAssembly text file named
# "name_of_the_aec_program.wat".
wat2wasm name_of_the_aec_program.wat

Windows:

aec name_of_the_aec_program.aec
wat2wasm name_of_the_aec_program.wat

Of course, you need to have wat2wasm from WebAssembly Binary Toolkit installed. The simplest way to install it is probably to globally install the wabt package from the Node Package Manager (NPM), assuming you already have NodeJS installed. After you run the command above, you'll get a WASM file, which nearly all modern JavaScript environments support (but notice that the WASM files produced by this compiler assume WebAssembly.Global exists, which is not true in some older JavaScript environments which otherwise support WebAssembly, such as Firefox 52).

Note that browsers will generally not allow you to run a WASM executable stored on your file system, you need to use some web-server (I use php -S 127.0.0.1:8080) to run it in a browser. It's for security reasons. When running a program stored on the Internet, the JavaScript Virtual Machine can be sandboxed from the file system. When running a program stored on your computer, the JavaScript Virtual Machine can't be sandboxed from the file system (it needs to read the program that's stored on your file system). Modern browsers generally allow JavaScript programs on your file system to run, because the JavaScript compiler (V8, SpiderMonkey, ChakraCore...) is trusted not to produce WebAssembly code which will cause the JavaScript Virtual Machine to read private files stored on the computer or modify files. An untrusted WASM file might contain some obscure piece of WebAssembly which will cause a particular JavaScript Virtual Machine to misbehave.

WASM files can be run in NodeJS, as you can see in this example. NodeJS is (unlike browsers) targeted at tech-savvy users, and it trusts you to verify that the WASM file is not malicious before you run it. Well, when compiling an AEC program, you can check both the source code of that program and the source code of the compiler, so that shouldn't be a problem here.

Shell script

So, to summarize how to compile and use my compiler, I've written a shell script to download the compiler, compile and assemble the Analog Clock example program, and run it in NodeJS:

if [ $(command -v git > /dev/null 2>&1 ; echo $?) -eq 0 ]
then
  git clone https://github.com/FlatAssembler/AECforWebAssembly.git
  cd AECforWebAssembly
elif [ $(command -v wget > /dev/null 2>&1 ; echo $?) -eq 0 ]
then
  mkdir AECforWebAssembly
  cd AECforWebAssembly
  wget https://github.com/FlatAssembler/AECforWebAssembly/archive/refs/heads/master.zip
  unzip master.zip
  cd AECforWebAssembly-master
else
  mkdir AECforWebAssembly
  cd AECforWebAssembly
  curl -o AECforWebAssembly.zip -L https://github.com/FlatAssembler/AECforWebAssembly/archive/refs/heads/master.zip # Without the "-L", "curl" will store HTTP Response headers of redirects to the ZIP file instead of the actual ZIP file.
  unzip AECforWebAssembly.zip
  cd AECforWebAssembly-master
fi
if [ $(command -v g++ > /dev/null 2>&1 ; echo $?) -eq 0 ]
then
  g++ -std=c++11 -o aec AECforWebAssembly.cpp # "-std=c++11" should not be necessary for newer versions of "g++". Let me know if it is, as that probably means I disobeyed some new C++ standard (say, C++23).
else
  clang++ -o aec AECforWebAssembly.cpp
fi
cd analogClock
../aec analogClock.aec
npx -p wabt wat2wasm analogClock.wat
if [ "$OS" = "Windows_NT" ] # https://stackoverflow.com/a/75125384/8902065
                            # https://www.reddit.com/r/bash/comments/10cip05/comment/j4h9f0x/?utm_source=share&utm_medium=web2x&context=3
then
  node_version=$(node.exe -v)
else # We are presumably running on an UNIX-like system, where storing output of some program into a variable works as expected.
  node_version=$(node -v)
fi
# "node -v" outputs version in the format "v18.12.1"
node_version=${node_version:1} # Remove 'v' at the beginning
node_version=${node_version%\.*} # Remove trailing ".*".
node_version=${node_version%\.*} # Remove trailing ".*".
node_version=$(($node_version)) # Convert the NodeJS version number from a string to an integer.
if [ $node_version -lt 11 ]
then
  echo "NodeJS version is lower than 11 (it is $node_version), you will probably run into trouble!"
fi
node analogClock

If everything is fine, the Analog Clock program should print the current time in the terminal.

Limitations

Notes for Contributors

If you don't know about compiler theory and don't want to read about such complicated stuff in English, you can read the paper about compiler theory I've published in Osječki Matematički List. It's in Croatian, and I hope it helps. (UPDATE on 23/10/2020: I've published a YouTube video in English with approximately the same content as the paper. In case your browser cannot play streamed MP4 videos, you can download the MP4 file and try opening it in VLC or something like that.)

To format the code, I generally prefer to use ClangFormat for languages that it supports (JavaScript, C++), because I find the code it produces nicer. For languages it doesn't support (HTML, CSS), I use Prettier. The code Prettier outputs isn't particularly nice, but something is better than nothing. It's also interesting that clang-format, if you supply it with code in this version of AEC and tell it to format the code, it will format the code assuming it's a C-like language, but (as far as I can tell) it won't break it. The result isn't particularly nice, but maybe some automated formatting is better than nothing.