larmel / lacc

A simple, self-hosting C compiler
MIT License

C89 code generator backend #27

Closed cesss closed 2 years ago

cesss commented 3 years ago

I think it would be great if lacc had a special backend mode that generated C89 code from the IR and then invoked another C89 compiler for the actual compilation. This would make it possible to compile C99 and C11 code on platforms that only have a C89 compiler, and could be even more useful for future C versions (C2x and whatever comes next).

I'm not thinking of using lacc as a general C-to-C89 translator, because the C89 code generated by the backend wouldn't be easy on the eyes. Besides, I guess it would be non-portable, because it comes from the IR, which in turn comes from parsing system headers (system headers are platform-dependent, so the resulting C89 code would be too).

In other words, the goal I have in mind is just to invoke another C89 compiler. The generated C89 code doesn't need to be useful for anything other than being passed to another compiler running on the same system.

Another cool benefit of this backend is that it would let lacc run on any platform and any CPU, with perhaps the only requirement being that the sizes of the basic types are defined the same way as in the target C89 compiler (I guess lacc should use the same type sizes as the C89 compiler that will be invoked).
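To illustrate the type-size requirement, here is a minimal sketch of how generated code could verify that the target compiler agrees with the sizes the front end assumed. Everything here is hypothetical and not taken from lacc: the `SIZE_CHECK` macro, the `type_sizes` struct and the `sizes_match` function are names invented for this example.

```c
#include <stddef.h>

/*
 * C89 has no _Static_assert. A typedef of an array with a negative
 * size is a common substitute: it fails to compile if the condition
 * is false. SIZE_CHECK is a hypothetical helper, not part of lacc.
 */
#define SIZE_CHECK(name, type, expected) \
    typedef char size_check_##name[(sizeof(type) == (expected)) ? 1 : -1]

/* Always true in C, so this line compiles with any conforming compiler. */
SIZE_CHECK(char_is_one_byte, char, 1);

/* Hypothetical record of the sizes the front end assumed. */
struct type_sizes {
    size_t size_short;
    size_t size_int;
    size_t size_long;
    size_t size_pointer;
};

/*
 * Runtime variant: returns 1 if the compiler building this file
 * agrees with the assumed sizes, 0 otherwise.
 */
static int sizes_match(const struct type_sizes *assumed)
{
    return assumed->size_short == sizeof(short)
        && assumed->size_int == sizeof(int)
        && assumed->size_long == sizeof(long)
        && assumed->size_pointer == sizeof(void *);
}
```

A backend could emit one `SIZE_CHECK` line per basic type at the top of each generated file, so that a mismatch is reported by the target compiler instead of producing silently wrong code.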

I also tend to believe it could be useful for testing the optimizer, because the target C89 compiler could be invoked with -O0 (so that only lacc's optimizations apply) or with -O3 (to see how much room for optimization is left).

Now the obvious question: how much work would this feature require? From looking at the code, I tend to believe it should be doable without too much effort, but I haven't yet been able to understand the complete workflow and the complete IR representation. I'm asking because you are the author and know in detail what this feature would require.

larmel commented 3 years ago

I think it would be pretty straightforward to emit C code, at least if you accept the use of goto and other primitive constructs instead of any kind of high-level readable code (which would be no problem in the use case you outline). The IR in lacc is not much different from a very small subset of C, so you can probably get far by just mapping each operation to the corresponding C expression.
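As a rough illustration of that one-to-one mapping, the sketch below emits C89 text from a tiny three-address IR. The IR here (`enum ir_op`, `struct ir_insn`) and the functions `emit_insn` and `emit_function` are made up for this example and are not lacc's actual data structures. The emitter itself uses `snprintf`, so it needs a C99 library, but the text it produces is plain C89.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical three-address IR, NOT lacc's real representation. */
enum ir_op {
    IR_CONST, IR_ADD, IR_SUB, IR_MUL,
    IR_LABEL, IR_JUMP, IR_BRANCH_NZ, IR_RETURN
};

struct ir_insn {
    enum ir_op op;
    int dst; /* destination temporary, or label number */
    int a;   /* first operand temporary, or the constant for IR_CONST */
    int b;   /* second operand temporary */
};

/* Write the C89 statement for one instruction; returns snprintf's result. */
static int emit_insn(char *buf, size_t size, const struct ir_insn *in)
{
    switch (in->op) {
    case IR_CONST:
        return snprintf(buf, size, "    t%d = %d;\n", in->dst, in->a);
    case IR_ADD:
        return snprintf(buf, size, "    t%d = t%d + t%d;\n", in->dst, in->a, in->b);
    case IR_SUB:
        return snprintf(buf, size, "    t%d = t%d - t%d;\n", in->dst, in->a, in->b);
    case IR_MUL:
        return snprintf(buf, size, "    t%d = t%d * t%d;\n", in->dst, in->a, in->b);
    case IR_LABEL:
        /* The empty statement keeps a label right before '}' valid in C89. */
        return snprintf(buf, size, "L%d:;\n", in->dst);
    case IR_JUMP:
        return snprintf(buf, size, "    goto L%d;\n", in->dst);
    case IR_BRANCH_NZ:
        return snprintf(buf, size, "    if (t%d) goto L%d;\n", in->a, in->dst);
    case IR_RETURN:
        return snprintf(buf, size, "    return t%d;\n", in->a);
    }
    return -1;
}

/* Account for len characters written; returns -1 if output was truncated. */
static int advance(size_t *used, size_t size, int len)
{
    if (len < 0 || (size_t)len >= size - *used)
        return -1;
    *used += (size_t)len;
    return 0;
}

/*
 * Emit a complete function returning int. Temporaries are declared at
 * the top of the body, as C89 requires. Returns the number of
 * characters written, or -1 if buf is too small.
 */
static int emit_function(char *buf, size_t size, const char *name,
                         const struct ir_insn *code, size_t n, int ntemps)
{
    size_t used = 0, i;
    int k;

    if (advance(&used, size, snprintf(buf, size, "int %s(void)\n{\n", name)))
        return -1;
    for (k = 0; k < ntemps; k++)
        if (advance(&used, size,
                    snprintf(buf + used, size - used, "    int t%d;\n", k)))
            return -1;
    for (i = 0; i < n; i++)
        if (advance(&used, size, emit_insn(buf + used, size - used, &code[i])))
            return -1;
    if (advance(&used, size, snprintf(buf + used, size - used, "}\n")))
        return -1;
    return (int)used;
}
```

Calling `emit_function` with the four instructions `t0 = 6`, `t1 = 7`, `t2 = t0 * t1`, `return t2` and the name `answer` fills the buffer with a small C89 function that can be handed to any C89 compiler. Control flow needs nothing beyond labels and `goto`, which is why the output is unreadable but easy to generate.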