jnz / q3vm

Q3VM - Single file (vm.c) bytecode virtual machine/interpreter for C-language input
GNU General Public License v2.0

RFC: Build tinycc or LLVM backend #11

Open SirJson opened 6 years ago

SirJson commented 6 years ago

The C in RFC I mean literally.

I want to improve the C compiler situation, since for me writing C89 code can be confusing at times. We also have to remember the special licence for LCC. So I was looking into solutions for that problem; after all, q3asm doesn't know too many ASM commands anyway.

So I came up with two solutions:

  1. Write a code generator for tinycc. The project still seems to be active and the licensing would fit this project better. Also, the "tiny" in the name already suggests that this is, like LCC, a small and simple compiler. The bad part is that it's a compiler that skips the assembler output stage, which means I don't know if it would be a good fit for outputting Q3VM assembler.

  2. Write an LLVM backend. That would theoretically improve language support dramatically. In theory we would also get a lot of optimizations for free, but I'm not sure about it. The bad part would be that writing an LLVM backend doesn't look like a cakewalk.
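To make option 1 more concrete, here is a rough sketch of the "assembler output" stage that a compiler generating machine code directly would have to gain. It is not tinycc code: the tuple-based expression tree and the function names `emit_expr` and `compile_return` are made up for illustration, and the mnemonics (`CNSTI4`, `ADDI4`, ...) follow lcc's bytecode naming as an assumption that should be checked against real lcc output.

```python
# Hypothetical sketch: walk a tiny expression tree and produce stack-machine
# assembly text. The mnemonics are assumed to match lcc's bytecode naming;
# verify them against real lcc output before relying on them.

BINARY_OPS = {"+": "ADDI4", "-": "SUBI4", "*": "MULI4"}


def emit_expr(node, out):
    """Append instructions for `node` to `out` in post-order.

    A node is either an int constant or a tuple (op, left, right).
    """
    if isinstance(node, int):
        out.append(f"CNSTI4 {node}")
        return
    op, left, right = node
    emit_expr(left, out)         # operands are pushed first ...
    emit_expr(right, out)
    out.append(BINARY_OPS[op])   # ... then the operator consumes them


def compile_return(expr):
    """Emit text that evaluates `expr` and returns the result."""
    out = []
    emit_expr(expr, out)
    out.append("RETI4")
    return "\n".join(out)


if __name__ == "__main__":
    # (2 + 3) * 4
    print(compile_return(("*", ("+", 2, 3), 4)))
```

The point of the sketch is only that a text-emitting backend for a stack machine can be a plain post-order walk; whether tinycc's internals expose something comparable to this tree at code generation time is exactly the open question.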

So the question is: What do you think would make the most sense in the context of this project?

EDIT: I should never ever write something on my phone first.

HarryR commented 6 years ago

There's also:

Another option would be a WebAssembly to QVM translator - which would be interesting as you would get LLVM and every other WebAssembly target for free.

Whereas an LLVM backend would probably take a lot more work, and would be a significant chunk of code that must be built before anything can be compiled to QVM.
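As a rough idea of what the translator route could look like, the sketch below maps a tiny subset of WebAssembly i32 instructions onto QVM-style instructions. The WebAssembly byte values are taken from the WebAssembly specification; the QVM names (`OP_CONST`, `OP_ADD`, `OP_SUB`, `OP_MULI`) are assumptions modelled on the Quake 3 VM opcode list and would need to be checked against vm.c. The function names are hypothetical.

```python
# Hypothetical sketch of a WebAssembly -> QVM translator for a few i32
# instructions. QVM opcode names are assumptions; check them against vm.c.

WASM_TO_QVM = {
    0x6A: "OP_ADD",   # i32.add
    0x6B: "OP_SUB",   # i32.sub
    0x6C: "OP_MULI",  # i32.mul
}
I32_CONST = 0x41      # i32.const, followed by a signed LEB128 immediate


def read_sleb128(code, pos):
    """Decode a signed LEB128 integer; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = code[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):         # last byte of the encoding
            if byte & 0x40:           # sign bit of the last 7-bit group
                result -= 1 << shift
            return result, pos


def translate(code):
    """Translate a flat wasm instruction byte string to (opcode, operand) pairs."""
    out = []
    pos = 0
    while pos < len(code):
        opcode = code[pos]
        pos += 1
        if opcode == I32_CONST:
            value, pos = read_sleb128(code, pos)
            out.append(("OP_CONST", value))
        elif opcode in WASM_TO_QVM:
            out.append((WASM_TO_QVM[opcode], None))
        else:
            raise NotImplementedError(f"unsupported wasm opcode 0x{opcode:02X}")
    return out


if __name__ == "__main__":
    # i32.const 2, i32.const 3, i32.add
    print(translate(bytes([0x41, 0x02, 0x41, 0x03, 0x6A])))
```

Both machines are stack-based, so simple arithmetic lines up almost one to one. The harder parts, which this sketch leaves out entirely, would be WebAssembly's structured control flow (blocks and branches versus plain jumps), locals, and memory access.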

SirJson commented 6 years ago

ShivyC looks very promising to me, especially since I've really started to love Python. That's something I will definitely look into. Thanks!

... looking at ShivyC more closely also reveals that it's not quite finished yet. There seem to be a lot of missing features, like:

Implement function definition

or

Full extern/static/global variable implementation

I would also miss typedefs to be honest.

I feel like there is still a lot of work to do before this compiler could output something that we can use, even though setting something up in Python would be way faster. I guess 8cc is the next best fit.

jnz commented 6 years ago

I agree. LCC is not my favourite compiler either. I've been told that this could be a helpful project: https://pp.ipd.kit.edu/firm/index.html

HarryR commented 6 years ago

What about the Amsterdam Compiler Kit?

Although LCC is nice for having a relatively small chunk of code which does the 'C compiler' bit - as is 8cc, or even the Plan9 compilers?

HarryR commented 6 years ago

Ah, there's also this - https://github.com/rswier/swieros/blob/master/root/bin/c.c

But, it emits opcodes directly rather than going through an assembler.

SirJson commented 6 years ago

I think we should continue the discussion from https://github.com/SirJson/rq3vm/issues/2 here:

The licence looks worse than the original.

The authors of this software are Christopher W. Fraser, David R. Hanson, and Jacob Navia.

Christopher Fraser and David Hanson wrote the original lcc compiler. Jacob Navia has added several enhancements to the sources, and wrote all the rest of the compiler system: assembler, linker, resource compiler, editor, debugger, etc.

The copyright is then shared between two holders:

  1. The original lcc compiler's copyright (Addison Wesley)
  2. The compiler system/lcc enhancements copyright (Jacob Navia)

Permission to use, and modify this software for any purpose, subject to the provisions described below, without fee is hereby granted, provided that this entire notice is included in all copies of any software that is or includes a copy or modification of this software and in all copies of the supporting documentation for such software.

THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT ANY EXPRESS OR IMPLIED WARRANTY. IN PARTICULAR, NEITHER THE AUTHORS NOR AT&T MAKE ANY REPRESENTATION OR WARRANTY OF ANY KIND CONCERNING THE MERCHANTABILITY OF THIS SOFTWARE OR ITS FITNESS FOR ANY PARTICULAR PURPOSE.

lcc is not public-domain software, shareware, and it is not protected by a `copyleft' agreement, like the code from the Free Software Foundation.

lcc is available free for your personal research and instructional use under the `fair use' provisions of the copyright law.

You may not sell lcc or any product derived from it in which it is a significant part of the value of the product. Using the lcc front end to build a C syntax checker is an example of this kind of product.

Note for CDROM vendors: Including lcc-win32 in a CD needs explicit agreement from the authors. You may NOT include lcc-win32 in a CDROM or in other products that will be sold without explicit agreement from the authors.

Programs compiled with lcc-win32 remain the property of their authors. If they are sold, a professional copy of lcc-win32 should be bought. All this copyright statement applies only to lcc-win32, NOT to the programs compiled with it.

For more information concerning the licenses for the system contact

Jacob Navia
Logiciels/Informatique
41 rue Maurice Ravel
93430 Villetaneuse
France
Tel: 33-1-48-23-51-44

Chris Fraser / cwfraser@microsoft.com
David Hanson / drh@microsoft.com
Jacob Navia / jacob@jacob.remcomp.fr

After some experiments I have to say:

lccwin32_test.zip

jnz commented 6 years ago

Thanks for the detective work. But without the compiler source this is a dead end. To quote John Carmack: "The tools necessary for building mods will all be freely available: a modified version of LCC and a new program called q3asm."

I am not sure what exactly has been modified. But the double data type stuff that I've mentioned in another issue is part of the change. Q3 LCC never emits 8-byte opcodes.

jnz commented 6 years ago

Now, in the other thread Harry was against looking at every word of the license and in favour of just going for it. But I still think it is worth the time to briefly look at the text:

lcc is available free for your personal research and instructional use under the `fair use' provisions of the copyright law. You may not sell lcc or any product derived from it in which it is a significant part of the value of the product. Using the lcc front end to build a C syntax checker is an example of this kind of product.

No one wants to sell lcc. And lcc is used as a standalone compiler here. They don't want people to enhance their own products with lcc code. And nowadays this doesn't matter anyway, as there is LLVM. It is pretty much still the Quake 3 use case. And it was apparently OK back then with a high-profile AAA game title. Writing bytecode plug-ins with lcc for a commercial platform has been OK for the last 20 years.

Programs compiled with lcc-win32 remain the property of their authors. If they are sold, a professional copy of lcc-win32 should be bought.

The key word here is "should". So even for lcc-win32 I wouldn't see a problem. But lcc-win32 is abandonware. The lcc GitHub page is somewhat active, and one of the original authors committed stuff to GitHub in 2014. https://github.com/drh/lcc

SirJson commented 6 years ago

I think the short licence on the lcc-win32 website trips him:

License: This software is not freeware, it is copyrighted by Jacob Navia. It's free for non-commercial use, if you use it professionally you have to have to buy a licence.

Professional use is:

Related to business (e.g you use it in a corporation)

If you sell your software.

But really getting mad over lcc is a waste of time.

I tend to always gravitate back to LLVM as a solution. A really stupid and easy one would be taking libclang and just transpiling everything to Q3VM bytecode. But then we would lose all the optimizations that a real compiler would have made.

I'm still not convinced that writing an LLVM backend is so hard in the end. Sure LLVM is huge and complex but also very popular. Maybe there is enough material out there to get somewhere.

I also saw an LLVM Squirrel JIT modification the other day with a very unfortunate name. If that implementation works, it would mean that getting at least JIT back would be very easy. Those guys did that by approaching it from the frontend side of things, and they got away with just three new files.

I wonder if that code works, but I first have to rebase it against the latest version of Squirrel 3 because C++ changed a bit in the last 10 to 15 years. :thinking:

jnz commented 6 years ago

What I like about the Quake 3 VM is that it is so simple. You can just compile it in a few seconds and there are no dependencies. I have Q3VM running here effortlessly on a microcontroller (STM32F429) with just a few kilobytes of RAM. A printf hello world call (with VM_Call overhead and everything) takes 80µs. With LLVM things get better but also bigger and more complex.

jnz commented 6 years ago

But that Squirrel LLVM stuff is a good find. I've learned now that Valve is using Lua and Squirrel in their engines.

SirJson commented 6 years ago

Why would LLVM make it bigger and more complex at runtime? Sure, it would be more complex to set up the compiler, but this can be automated like they do with YCM for Vim. Also, who really understands lcc right now?

The goal for me is still to run the same Q3VM instructions that lcc generates inside the same Q3VM that exists right now. Otherwise I would just integrate a WebAssembly interpreter / JIT and call it a day. But that defeats the point.

Back in the day I tried Lua, Squirrel, even AngelScript inside my C++ games. And after a while I always ended up using some framework or helper, because binding your API can get clunky real quick.

If I were still making games, this project would have been huge for me, because all I wanted was something that is as close to C as possible but still a scripting language and easy to integrate. As you said, there is nothing easier out there right now. (Not that I know of; Mono was actually not too bad either.)

In the end C89 is not the end of the world. If I really want more modern compiler features, I might follow through with my stupid idea and see how far I can go with it, or finally pick up a compiler book and learn how all of this works under the hood.

SirJson commented 6 years ago

OK I think I have enough input at least for me. My takeaways from this are:

After all this talking I decided to give it a shot by using a divide and conquer approach. I will base the compiler off tinycc and the project can be found here in the q3vm branch.

My approach will be to basically delete everything that I think won't be needed, to get a clearer picture of what is going on. And if I get some understanding, I will try to implement a code generator / assembler output mode in that branch.

I also don't plan on contributing anything back upstream, because I think the end result will be butchered. And help from someone with experience in writing a compiler is always welcome and will be really appreciated, since I'm starting from almost zero.

So no guarantees that there will be any results from my side :wink:

jnz commented 6 years ago

That was badly written by me. I don't mean LLVM would make it bigger at runtime. Surely not. What I mean is that you can just git clone the whole q3vm repo and compile everything (including the compiler) in a few seconds and bring it to a new platform. So that has a certain charm.

I would love to have an LLVM backend. But my observation is that LLVM is a dynamic project, so what works now might not work in the future. That could include target definitions as well as the IR bytecode. Downloading and compiling clang + LLVM is also not exactly lightweight.

Regarding your tinycc endeavours: godspeed! Let me know if I can help you.

mingodad commented 2 years ago

How about using https://github.com/vnmakarov/mir ?