ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
35.22k stars, 2.57k forks

Stage 2 Proposal: Standardise a binary format for ZIR, and enable compilation to and from this representation #5635

Open ghost opened 4 years ago

ghost commented 4 years ago

EDITED -- original text is quoted below

There is a huge variety of hardware architectures in the world, both professionally designed and hobbyist. It would be impossible to actively support all of them in upstream Zig; however, we can provide facilities for end users to support them themselves.

The simplest way to do this would be to standardise a binary representation of ZIR, add a command line option to emit this representation, and allow users to develop their own backends consuming this. To allow other language projects to leverage Zig's infrastructure, we could also add a command line option to compile from this representation to native code, and allow users to develop their own frontends as well. If we wanted to go really insane, we could provide these compilation stages as libraries, permitting tighter integration with a host or guest project.
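To make the idea of a standardised binary representation concrete, here is a minimal sketch in Python of a versioned container that a frontend could emit and a third-party backend could consume. Everything in it is an assumption for illustration: the `TOYR` magic, the header layout, the three opcodes, and the `encode`/`decode` names are hypothetical and do not reflect the actual ZIR instruction set or any format Zig defines.

```python
import struct

# Hypothetical container layout (NOT Zig's actual ZIR):
#   header:      4-byte magic, u16 version, u32 instruction count
#   instruction: u8 opcode, u32 operand_a, u32 operand_b
# "<" selects little-endian with no padding, so sizes are fixed.
MAGIC = b"TOYR"
VERSION = 1
HEADER = struct.Struct("<4sHI")
INSTRUCTION = struct.Struct("<BII")

# Made-up opcodes, purely for illustration.
OPCODES = {"const": 0, "add": 1, "ret": 2}
NAMES = {code: name for name, code in OPCODES.items()}


def encode(instructions):
    """Serialise (name, a, b) tuples into the toy binary format."""
    out = bytearray(HEADER.pack(MAGIC, VERSION, len(instructions)))
    for name, a, b in instructions:
        out += INSTRUCTION.pack(OPCODES[name], a, b)
    return bytes(out)


def decode(blob):
    """Parse the toy binary format back into (name, a, b) tuples."""
    magic, version, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a toy IR file")
    if version != VERSION:
        raise ValueError(f"unsupported version {version}")
    offset = HEADER.size
    result = []
    for _ in range(count):
        opcode, a, b = INSTRUCTION.unpack_from(blob, offset)
        result.append((NAMES[opcode], a, b))
        offset += INSTRUCTION.size
    return result
```

The explicit version field is the part that matters for this proposal: a backend maintained outside the compiler can reject a file it does not understand instead of silently miscompiling it, which leaves the upstream format some room to evolve.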

We'd want to wait a while before attempting this, and even then we'd only want to expose a very high-level view of the compilation process, to avoid nailing things down too hard and preventing future optimisations or improvements. Once we've navigated all of that, though, this could be a killer feature.

This might already be the plan, but I couldn't find any mention of it, so here's an issue.

Much like it would be impractical to provide special consideration for every hobbyist OS, we can't possibly hope to actively support every architecture that people might create. Even for the ones that we do support, maybe there's some alternative backend that produces faster, more efficient code. Let's enable users to support their own platforms however they choose.

This would consist of three parts: allowing userland code to override certain private standard library functions (such as the various syscalls), allowing build.zig to register a user-defined function per compilation unit to perform code generation, and (once #1535 is implemented) somehow teaching the linker about the allowed relocations in the object format. (I won't give possible examples here, as I don't know enough about the design of the compiler to know what would make sense.)

This would require standardisation of in-memory ZIR at the ABI level, so it probably won't be feasible for some time, but in the long term I think we should definitely make it a goal.

SpexGuy commented 4 years ago

I think we could at least do this eventually:

Supporting incremental compilation through this process is a very different idea, requiring a much more complex interaction between the compiler and the code generator. I'm not sure that standardizing that interface is a good idea in the long term. IMO new backends are best represented as a fork that integrates the backend into the compiler (and hopefully gets upstreamed once it's stable). This would allow the backend to be tuned to its needs, whatever they may be.

For example, a backend may need a specialized data structure to be built to accelerate code generation, but the most efficient place to generate that data structure is in the parser. If the backend is part of the compiler, the parser can be modified to check which backend is in use and generate the needed structure. Without that, we need to perform an extra processing step in the backend to rebuild data that was available when the parser ran. This increases both complexity and run time.
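A toy illustration of that trade-off, sketched in Python rather than taken from the compiler: the "parser" below can build a backend-specific index during its single pass if it knows which backend is in use, while a backend on the far side of a fixed interface has to rebuild the same index with a second pass. The token shape, the `needs_call_index` backend name, and both function names are hypothetical.

```python
def parse(tokens, backend=None):
    """Toy 'parser': returns declarations plus an optional call index.

    When the (hypothetical) backend "needs_call_index" is selected, the
    index is built during the same pass that produces the declarations.
    """
    decls = []
    call_index = {} if backend == "needs_call_index" else None
    for position, (kind, name) in enumerate(tokens):
        decls.append((kind, name))
        if call_index is not None and kind == "call":
            call_index.setdefault(name, []).append(position)
    return decls, call_index


def rebuild_call_index(decls):
    """What a separated backend must do: a second full pass over the IR."""
    index = {}
    for position, (kind, name) in enumerate(decls):
        if kind == "call":
            index.setdefault(name, []).append(position)
    return index
```

Both routes produce the same index; the difference is that the separated backend walks the whole program again to recover information the parser already had in hand.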

We should certainly keep an updated document listing the things that need to change in the compiler in order to add a new backend, and we should devote design effort to keeping this list small and to keeping it from changing often. But I don't really see much value in being able to add a backend from a program's build.zig, or even in having a strict binary separation between the backend and the rest of the compiler.

ghost commented 4 years ago

Yes, you're probably right.

ghost commented 4 years ago

Updated to be more realistic.