ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

`zig cc`, `zig c++`, `zig translate-c` and other subcommands without a clang/llvm dependency in the compiler binary #20875

Open andrewrk opened 2 months ago

andrewrk commented 2 months ago

This solves the last remaining blocker for #16270, which was proving tricky to find a satisfactory solution to.

What users experience right now is these subcommands being provided via a single binary distribution available from ziglang.org/download, as a zip or tarball.

Without a compiled version of LLVM and Clang inside the same executable as Zig, the equation shifts. Some proposed solutions are:

We can imagine combinations of these approaches as well, such as supporting prefetched packages in the zig installation directory and providing downloads that contain clang already prefetched in it.

The fundamental conflict comes down to:

Having typed this out, I think the solution is clear: both!

A separately maintained project should depend on Zig, Clang, and LLVM, and implement the extra utility on top of those dependencies. ziglang.org will continue to post only zig binaries; those binaries will no longer link against Clang and LLVM. This separate project, which may or may not have its own website, will post binaries whose invocations will look like zig-extras cc, zig-extras c++, zig-extras translate-c. The project could provide a strict superset of subcommands, making it a drop-in replacement for zig that supports those extra commands. I'm sure at least two dozen geniuses will suggest to name this project "zag".

Meanwhile, serious people will be 100% fine with the zig binary that does not depend on LLVM and Clang because they will have full access to these features via the all-powerful build system, and they already know that zig cc is a silly gimmick that is completely unnecessary for real world projects. Even so, those subcommands will work with this LLVM-less build of zig, because they will use the build system to fetch and cache the package that provides the necessary functionality.

alexrp commented 2 months ago

Potentially crazy/silly idea: What if, when invoking zig <subcommand> and <subcommand> is not known to the compiler, the compiler searches a standard location on the file system (let's just say $HOME/.zig for the sake of argument) for a file named <subcommand>/zig-<subcommand>(.exe) and simply executes it with the arguments after <subcommand>?

Some advantages I see:

One downside is the potential for people to misunderstand such a 'plugin command' as being part of Zig proper. (Anecdotally, though, I have not seen this to be the case in the .NET world, where dotnet <subcommand> works similarly if the .NET CLI doesn't recognize <subcommand>; people seem to understand well enough that they're invoking a third-party tool because they actually had to take the step to install it with dotnet tool install -g <name>.)
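The lookup described in this comment can be sketched roughly as follows. This is a minimal illustration in Python, not how Zig implements or would implement it: the function name `plugin_command`, the rejection of names containing path separators, and the choice to return the command line instead of executing it are all assumptions made for the example. The plugin root stands in for the `$HOME/.zig` location the comment uses "for the sake of argument".

```python
from __future__ import annotations

import os
from pathlib import Path


def plugin_command(argv: list[str], plugin_root: Path) -> list[str] | None:
    """Map `zig <subcommand> args...` to a plugin command line.

    Meant to be called only after the compiler has failed to recognize
    argv[0] as a built-in subcommand. Returns None when no plugin exists,
    in which case the caller would report "unknown command".
    """
    if not argv:
        return None
    subcommand = argv[0]
    # Assumption: refuse names that could escape the plugin directory.
    if "/" in subcommand or "\\" in subcommand or subcommand.startswith("."):
        return None
    suffix = ".exe" if os.name == "nt" else ""
    # Layout from the comment: <root>/<subcommand>/zig-<subcommand>(.exe)
    candidate = plugin_root / subcommand / f"zig-{subcommand}{suffix}"
    if not candidate.is_file():
        return None
    # Everything after <subcommand> is passed through unchanged.
    return [str(candidate), *argv[1:]]
```

A caller would pass the result to something like `subprocess.run` and forward the exit code. Because the function is only consulted for unknown subcommands, built-in commands always take precedence over plugins.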

nektro commented 2 months ago

i feel that has no benefit given zig cc vs zig-cc is the same amount of characters and would not only add complexity to the zig binary but also give the false impression that a command is by ZSF. this pathway also creates a name squatting race i dont think we should go down. im aware git and cargo and others do this but i dont think we should as well.

edit: it also steals that namespace away from the official zig binary making it harder for ZSF to add proper subcommands in the future

alexrp commented 2 months ago

i feel that has no benefit given zig cc vs zig-cc is the same amount of characters

I can already see myself mistyping zig cc and getting an unknown command: cc error countless times in the future because my muscle memory is what it is now. Same for zig-extras cc or zag cc or whatever.

this pathway also creates a name squatting race i dont think we should go down.

Is there any evidence that name squatting has actually been a problem in these ecosystems? Also, note that we would probably not have a dotnet tool install equivalent in Zig world, as we don't have a central package repository. (Or if we did, you'd at least have to point it to the specific place you want to install from.) So if some plugin falls out of favor, people can simply stop using it and a new one can take its place with the same command name, perhaps even with CLI compatibility.

also give the false impression that a command is by ZSF.

I don't think this would be a real problem. You'd have to manually install the plugin command, making it quite obvious that it has nothing to do with ZSF. My experience in .NET world is also that this confusion doesn't exist there, even with dotnet tool install. (I think that in itself is remarkable, because .NET developers have a rather annoying tendency to rely way too much on Microsoft to provide support for everything. So you'd expect this confusion to exist in .NET world, if it exists anywhere.)

im aware git and cargo and others do this but i dont think we should as well.

.NET as well (just edited my post to note that).

it also steals that namespace away from the official zig binary making it harder for ZSF to add proper subcommands in the future

This I'll concede is a valid concern, though. My best suggestion off the top of my head would be that we should culturally encourage people to not pick too generic names for their plugin commands unless it's actually warranted. Zig could also reserve some command names that we reasonably expect it to gain in the future.

ikskuh commented 2 months ago

does this issue affect the ability of zig build-exe to compile/link C code?

If so, how's the plan with mixed-code modules? They are currently such a killer feature and i'd cry if they go away.

Ashet OS uses them extensively to compile and link larger C libraries like lwIP and it's really nice to build modular mixed-code projects

I was considering making SDL.zig use the same methods, so people just need to link a sdl2 module and get a ready-to-compile system.

floooh commented 2 months ago

How does this idea of a separate zig-cc toolchain affect Zig projects which don't need zig cc or zig translate-c as command line features, but only need to build C libraries and maybe run a TranslateC step in the build system? Will this be possible with the 'vanilla' zig toolchain's package management or do I need to tell my users to use the zig-cc toolchain instead? (which frankly wouldn't be great)...

I was thinking of something more modular, but maybe you already thought about that and discarded the idea for a reason:

  1. A GitHub project which is a vanilla Clang toolchain bundled with the same cross-compilation headers and libraries that are currently bundled in the Zig toolchain. This GitHub project wouldn't have any Zig parts in it. It would in spirit be similar to the wasi-sdk (https://github.com/WebAssembly/wasi-sdk) but instead of being specialized for WASI it would be a general cross-compilation solution for C/C++/ObjC projects (same as zig cc is now). This project could be used in the same way that people are currently using the zig cc feature, but instead of zig cc they would simply run clang - the important part is the bundled cross-compilation sysroot headers and libraries. I would think that this project could create new releases fully automatically via GitHub CI on new Clang/MUSL/mingw2 releases, e.g. ideally it would require very little manual intervention once the automatic test-and-release CI pipeline is set up.

  2. A Zig package zig-clang which integrates the above cross-platform Clang toolchain project with the Zig build system. It would add the TranslateC step, a ClangCompileStep (for building C/C++/ObjC code into libraries) and maybe some sort of LLVMCompileStep (for using the LLVM optimizer and code generation backend instead of Zig's) to the Zig build system.

This would mean that I could just add this zig-clang package as dependency to my build.zig.zon, and I could compile C libraries and run TranslateC with only the vanilla Zig toolchain installed.

Does that make sense?

PS: as a side-effect of such an endeavour I'm hoping that this would also provide a blueprint for better integration of the Emscripten SDK and the above mentioned WASI SDK into the Zig build system by 3rd-parties (since those SDKs are just slightly differently specialized Clang toolchains).

rohlem commented 2 months ago

Meanwhile, serious people will be 100% fine with the zig binary that does not depend on LLVM and Clang because they will have full access to these features via the all-powerful build system [...]

How does this idea of a separate zig-cc toolchain affect Zig projects which don't need zig cc or zig translate-c as command line features, but only need to build C libraries and maybe run a TranslateC step in the build system?

@floooh I understand this to mean that the solution proposed in https://github.com/ziglang/zig/issues/16270#issuecomment-1616115039 is still the way forward. The zig build system includes the zig package manager, which will provide a way to provide C features via a clang/LLVM (or other C compiler integration) package. To me option 2. in your comment reads like it describes the same approach.

floooh commented 2 months ago

Yeah somehow my brain skipped this part:

...because they will have full access to these features via the all-powerful build system...

...tbh though, if a self-contained cross-platform Clang toolchain (with bundled cross-compilation headers and libraries) exists as 'base', I wonder how useful an additional zig cc toolchain built on top of that even would be (as opposed to just running Clang from that new 'integrated cross-platform toolchain' directly). The only missing feature would be being able to run translate-c directly on the command line.

i11010520 commented 2 months ago

Take V8, for example: building V8 depends on one specific version of Clang (pinned down to a commit hash) that is included in its source tree. How could zig cc/c++ replace clang in such scenario?

alexrp commented 2 months ago

How could zig cc/c++ replace clang in such scenario?

It doesn't sound like this would work even today, and quite intentionally so on the part of the V8 developers? It was also never a goal of zig cc to perfectly imitate an exact version of Clang in such a way.

i11010520 commented 2 months ago

It was also never a goal of zig cc to perfectly imitate an exact version of Clang in such a way.

There should be many scenarios that need specific exact version of Clang, right?

Or, put another way: how can feature parity between versions of zig cc and clang be guaranteed?

ikskuh commented 2 months ago

There should be many scenarios that need specific exact version of Clang, right?

Hopefully not, otherwise your code is horribly broken beyond repair

floooh commented 2 months ago

Hopefully not, otherwise your code is horribly broken beyond repair

There are quite a few reasons to pin the toolchain to a specific version, especially in bigger projects and projects which need deterministic builds (I think the pinning to any version is the important part for projects like V8 - so that the same version of V8 is always built with the same version of Clang, what specific Clang version that is should mostly be irrelevant).

But I don't see a reason why this pinning shouldn't be possible with a Clang wrapper toolchain or a Zig package which wraps a Clang toolchain (FWIW, the Zig package manager doesn't even have a 'non-pinning mode').
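Pinning of this kind usually comes down to recording a content hash next to the toolchain's download location and refusing anything that does not match. The sketch below illustrates the idea with a plain SHA-256 check in Python. It is an assumption-laden example: the function names are hypothetical, and it does not reproduce the hash format the Zig package manager actually stores in `build.zig.zon`.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large toolchain archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(path: Path, expected_sha256: str) -> None:
    """Raise ValueError unless the archive matches the pinned hash."""
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"hash mismatch for {path.name}: expected {expected_sha256}, got {actual}"
        )
```

With a check like this, the same project revision always builds against the same toolchain bytes, regardless of which Clang version those bytes happen to contain.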

ikskuh commented 2 months ago

There are quite a few reasons to pin the toolchain to a specific version, especially in bigger projects and projects which need deterministic builds

Yeah, true, i forgot about deterministic and reproducible builds. But it doesn't mean the code requires a specific version of clang/llvm/compiler to be compiled correctly; as you said, it needs any fixed/hermetic toolchain to create reproducible build results