ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

some projects use .s extension for preprocessed assembly source files, incompatible with zig's extension-based detection #20655

Open GalaxyShard opened 4 months ago

GalaxyShard commented 4 months ago

Currently, projects with .S "Assembler with C Preprocessor" files must use addCSourceFile[s] to add the files to the compilation, while regular .s assembly files can be added with either addAssemblyFile or addCSourceFile[s], which is a bit confusing.

There is also a second problem with the current design: addCSourceFile[s] has no way of specifying the language of each file, so the language is determined by file extension. This means (1) files with non-standard extensions cannot be compiled, and (2) the only distinction between preprocessed assembly and regular assembly is an uppercase or lowercase file extension, which doesn't play well with case-insensitive filesystems and operating systems (example in practice: my project zig-nds has issues with the assembler preprocessor).
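A minimal sketch of the problem, assuming a purely extension-based classifier. The `Lang` enum and `classify` function are hypothetical and written only for illustration; they are not Zig's actual detection code.

```zig
const std = @import("std");

// Hypothetical language tags, for illustration only.
const Lang = enum { assembly, assembly_with_cpp, c, unknown };

// Case-sensitive classification by file extension:
// ".S" and ".s" map to different languages.
fn classify(path: []const u8) Lang {
    const ext = std.fs.path.extension(path);
    if (std.mem.eql(u8, ext, ".S")) return .assembly_with_cpp;
    if (std.mem.eql(u8, ext, ".s")) return .assembly;
    if (std.mem.eql(u8, ext, ".c")) return .c;
    return .unknown;
}

pub fn main() void {
    // Same base name, different language, decided only by letter case.
    std.debug.print("{s}\n", .{@tagName(classify("crt0.S"))});
    std.debug.print("{s}\n", .{@tagName(classify("crt0.s"))});
    // A non-standard extension cannot be classified at all.
    std.debug.print("{s}\n", .{@tagName(classify("crt0.asm_pp"))});
}
```

With a scheme like this, a project that ships preprocessed assembly as `.s` has no way to override the result short of renaming the file.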

Proposed solution

  1. Replace addAssemblyFile and addCSourceFile[s] with addForeignSourceFile[s] (not necessary).

  2. Add a .language field in CSourceFile[s] to specify the language of the file[s].

andrewrk commented 4 months ago

I don't agree that these proposed names are better. C is more than a source file format, it's a particular compilation model ("The C Compilation Model") that involves one file per compilation unit, include directories, the existence of the preprocessor, and "C flags".

Let's focus on solving your use case, which is the issue with preprocessed assembly files. Can you explain the problem statement in more detail?

KilianHanich commented 4 months ago

As I read #20630, the long-term plan is to move support for languages like C out of the compiler and into the build system.

While it goes further than what is needed here, I could see this making it possible for the build system to be a generalized build system (and with it a full-blown CMake and Make replacement) if the opportunity is taken to make the chosen language a plugin. (Before somebody asks, I know people who use CMake to build LaTeX documents and PowerPoint presentations (I don't know how either), but also more sane things like Java.)

So your package has a dependency on a plugin in the ZON file, you register the plugin in the build system and then you say e.g. via a .language field what kind of source files you have (and with it which plugin needs to handle it) in a addForeignSourceFiles function. Obviously some plugins can be available by default if it's wanted.

Sure, this goes quite a lot further than the problem described here, but could tackle quite a few things at once at the downside of it being quite complex (as plugins always are).

Also, it would make it possible to have different plugins for different C compilers, and maybe even one which tackles C++ modules. (TL;DR: you can't handle a project which makes use of them internally in a Make-like way; you need to dynamically parse them, or use the protocol described in P1689R5, which luckily all major C++ compilers support these days, and then build up a dependency tree.) These plugins can then move independently of each other (besides breaking changes).

But as I am entirely an outsider here, these are just my 2 cents.

GalaxyShard commented 4 months ago

> I don't agree that these proposed names are better. C is more than a source file format, it's a particular compilation model ("The C Compilation Model") that involves one file per compilation unit, include directories, the existence of the preprocessor, and "C flags".

Yea that makes sense, I was thinking along the lines of "FFI" rather than the "C Compilation Model".

> Let's focus on solving your use case, which is the issue with preprocessed assembly files. Can you explain the problem statement in more detail?

I am trying to compile BlocksDS/libnds with Zig, and the main issue is that assembly files use lowercase .s file extensions, which Zig assumes to be "regular" assembly, without the C preprocessor. As Zig, unlike Clang/GCC, has no way to force which language a file is identified as, there is no way to compile these without changing the file extension to .S.

I made an issue about this upstream but the developers had a few important points as to why this isn't a great solution in the first place:

  1. Many developers, at least in the Nintendo DS programming community, assume .s files do have a preprocessor (which is true on GCC/Clang with -x assembler-with-cpp).

  2. More importantly, Windows does not differentiate between uppercase/lowercase file extensions, which could easily cause confusing issues if Zig would error on .s files and be fine with .S files.

mlugg commented 4 months ago

Zig's low-level CLI usage (zig build-exe etc) does support -x assembler-with-cpp. It sounds like we ought to integrate this with the build system somehow.

andrewrk commented 4 months ago

In general, we need to transition to having non-Zig compilation units orchestrated by the build system, for a few reasons:

  • C/C++ compilation will be provided by an external package.

  • Proper parallelization. For example, currently my ffmpeg project does not start compiling its C source files until nasm is built and all the assembly files are compiled, because those objects are passed to the zig build-lib invocation. Instead, the C source files should be built in parallel to nasm and the assembly files.

  • Zig build system needs to be able to drive a system C/C++ toolchain in order to satisfy the use case of replacing existing build systems for projects that are packaged into Linux distributions.

KilianHanich commented 4 months ago

That's actually why I brought up my point, especially because of the last point. A surprisingly high number of system packages do not just have C, C++ and Assembly, but also code in e.g. Fortran (especially in scientific computing) or script files which belong to them (games especially like to do this).

GalaxyShard commented 4 months ago

> Zig's low-level CLI usage (zig build-exe etc) does support -x assembler-with-cpp. It sounds like we ought to integrate this with the build system somehow.

Yea, I took advantage of that in my fork of Zig (#20687), which (since I updated it today) simply adds a .language field to CSourceFile and AddCSourceFilesOptions and then pushes -x to the zig_args if it is present.
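The idea described above can be sketched roughly as follows. This is a standalone model, not the code from the fork: `Language`, `SourceFile`, and `buildArgs` are hypothetical stand-ins for the real std.Build types, and only the `-x` language names (`c`, `assembler`, `assembler-with-cpp`) are taken from GCC/Clang.

```zig
const std = @import("std");

// Hypothetical stand-in for a per-file language override.
const Language = enum {
    c,
    assembler,
    assembler_with_cpp,

    // Spelling that GCC/Clang expect after -x.
    fn xArg(self: Language) []const u8 {
        return switch (self) {
            .c => "c",
            .assembler => "assembler",
            .assembler_with_cpp => "assembler-with-cpp",
        };
    }
};

// Hypothetical stand-in for CSourceFile with the extra field.
const SourceFile = struct {
    path: []const u8,
    // null means: fall back to extension-based detection.
    language: ?Language = null,
};

// Writes the arguments for one file into buf and returns the used part.
// "-x <lang>" is emitted only when a language was set explicitly.
fn buildArgs(buf: *[3][]const u8, file: SourceFile) []const []const u8 {
    var n: usize = 0;
    if (file.language) |lang| {
        buf[n] = "-x";
        n += 1;
        buf[n] = lang.xArg();
        n += 1;
    }
    buf[n] = file.path;
    n += 1;
    return buf[0..n];
}

pub fn main() void {
    var buf: [3][]const u8 = undefined;
    const args = buildArgs(&buf, .{ .path = "crt0.s", .language = .assembler_with_cpp });
    for (args) |arg| std.debug.print("{s} ", .{arg});
    std.debug.print("\n", .{});
}
```

Leaving `language` as null keeps the existing behavior, so build scripts that rely on file extensions would not need to change.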