ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
34.52k stars 2.52k forks source link

build system terminology update: package, project, module, dependency #14307

Open andrewrk opened 1 year ago

andrewrk commented 1 year ago

Status Quo

The Problem

We have been using the word "package" already, including in build system APIs:

https://github.com/ziglang/zig/blob/d813cef42af43b499fc5dc465e34f873008aaad0/lib/std/build.zig#L1479-L1483

https://github.com/ziglang/zig/blob/d813cef42af43b499fc5dc465e34f873008aaad0/lib/std/build/LibExeObjStep.zig#L977-L980

However, people are 100% going to say "package" when they mean "project". This is unavoidable, inescapable, and we should not fight against it, but rather embrace it. That leaves us with the following lingo:

Ambiguity strikes ruthlessly.

The Solution

Our hero is the word "module". The word "project" is no longer needed.

I would also like to rename LibExeObjStep to CompilationArtifact while we're at it.

This issue is to do the following things:

kuon commented 1 year ago

Just to clarify, would addPackage become addModule as now module are what is being @imported?

andrewrk commented 1 year ago

Yes, addPackage will become addModule.

kuon commented 1 year ago

Ok. I think this is a good naming, but this leads to one question.

As addModule is not aware of the package it comes from, how do you plan to handle name clash?

If I understand it correctly, a package like png could have a compress and decompress module, which would be imported with @import("compress"); (very generic name) and not @import("png").compress ?

andrewrk commented 1 year ago

@kuon I believe this question is answered by #14278. I have just now edited it for the terminology update.

kuon commented 1 year ago

Ok, so this in the app build.zig:

// "clap" here is the string used when calling b.addModule (see code snippet below)
const clap_module = clap_dep.module("clap");

// "clap" here is the name used for `@import` within src/main.zig
exe.addModule("clap", clap_module);

would be in my example:

const png_compress_mod = png_dep.module("compress");
exe.addModule("png:compress", png_compress_mod);

And then in my wherever_I_use_png.zig

const png_compress = @import("png:compress"); // name defined in `build.zig` and specific to the app

Here I use : but it could be whatever your heart desire. Maybe a convention for it?

andrewrk commented 1 year ago

Oops, I got the comments backwards (and therefore so did you from copy-pasting). It should be:

// "clap" here is the string used when calling b.addModule (see code snippet below)
const clap_module = clap_dep.module("clap");

// "clap" here is the name used for `@import` within src/main.zig
exe.addModule("clap", clap_module);

But then you got it right with your png example :+1:

kuon commented 1 year ago

Yeah I was confused by the comment, but your former explanation was clear enough for me to get it right. I have updated my post to reflect the fixed code.

Now that I get a full circle understanding of the concepts, I fully agree with your terminology.

Btw, I am happy you are not introducing a special name like crate for rust. It always confuses me, as rust has crate, package, workspace, module and namespace. Effectively reducing that to package and module is a nice simplification.

mlugg commented 1 year ago

What would the thing currently called Module in the compiler become? I know it doesn't matter too much since that's more of an implementation detail, but I can't think of a great term for it (although Module was itself never a great name for it anyway imo)

marler8997 commented 1 year ago

What would the thing currently called Module in the compiler become? I know it doesn't matter too much since that's more of an implementation detail, but I can't think of a great term for it (although Module was itself never a great name for it anyway imo)

Maybe the term "file" is good enough? @import("foo.zig") imports a "file", whereas @import("foo") imports a module, which, is also importing a file just through an alias.

I notice some ambiguity with regards to "root source file". The term root could mean what @import("root") returns, but, that's different from the "root source file" for each module ("module" from new terminology). Maybe instead of every module having a "root source file", we call it the "module index file", and leave the term "root" for the file at the root of the "module tree"?

Definitely going to need to retrain my brain on the terms module/package for Zig :)

P.S. if we pick a good term it could be used to standardize on a filename, i.e. index.zig instead of what we have now with lib.zig, root.zig, main.zig, etc

mlugg commented 1 year ago

Maybe the term "file" is good enough?

No, that's not what the term means - the compiler has a type Module which basically represents a single compilation of Zig code (so you have one Module to compile any amount of Zig code, and none if you're only compiling C/C++). It is only an implementation detail, it's just something that should be renamed if we overload the term "module" for the language itself to prevent confusion.

silversquirl commented 1 year ago

Unit or similar could be a decent term, since it represents a Zig compilation unit

marler8997 commented 1 year ago

No, that's not what the term means

It's possible the term "module" (current terminology) is currently used to represent a single compilation of Zig code I haven't seen it used that way. Is this from looking at the codebase or from a comment somewhere? Every instance of the term "module" I've seen is that it's referring to a file. i.e. "importing a module", the "builtin" module, the "root" module, these are all single files that get combined into a single compilation. The term module also appears in the Zig docs in 2 places that refer to it being a file, i.e. ModuleScope which is "file scope".

silversquirl commented 1 year ago

mlugg is talking about an internal datastructure used in the compiler, that's not related to the informal usages of the word "module" you're mentioning

(edit: it's also not related to the documentation's usage of "module scope", which should probably be reworded for clarity)

marler8997 commented 1 year ago

Oh I see src/Module.zig, yes that's definitely confusing (i.e. this conversation as evidence!) Your suggestion of something like Unit could clear up this ambiguity.

mlugg commented 1 year ago

Oh yeah, Unit isn't bad - maybe it could be ZigUnit (or Zunit (zoo-nit) if we want to be fun) for clarity

andrewrk commented 1 year ago

With this naming convention, technically "the zig standard library" should instead be called "the zig standard module".

rohlem commented 1 year ago

EDIT: I realized since the issue is labeled "build system terminology", and this is more of a general naming question, it's probably off-topic, but it doesn't seem important enough for its own issue either.

Question about the new module term: Can/Should I call something module (in code and documentation) which acts like it and is structured like it, or only once it can be imported with module-import syntax import("foo") as opposed to relative-file-import syntax import("foo.zig")?

For example, the Zig standard (library) module has, besides the root source file lib/std/std.zig, a nested structure of folders with their own "nested root" file, f.e. lib/std/zig and lib/std/zig.zig. Are/Were these subdirectories considered "modules":

If it's more of an organizational term, would it be incorrect to keep referring to non-exported source files as modules /submodules of the parent module they are used in / provide functionality for?

andrewrk commented 1 year ago

Please allow me to complete the intern-pool-2 branch before renaming Module (from src/Module.zig) to ZigUnit.

sagehane commented 1 year ago

In this model, what would be the name/treatment of files that are neither Zig source files (covered by modules), headers (tbh, I forgot how these are currently handled too, std.Build.LibExeObjStep.installHeader?) nor artifacts?

I'm trying to compile some stuff for a RISC-V32 platform that requires a custom linker script and I wasn't sure how I could advertise that from the dependency's side.

tisonkun commented 1 year ago

Instead of a manually specify "root source file", it is possible we use a convention-over-configuration strategy to pin the filename as lib.zig or mod.zig?

melroy89 commented 1 year ago

Could somebody add the addModule to the documentation with an example? Since this is fully missing. Eg.

const some_module = b.createModule(.{
    .source_file = .{ .path = "deps/zig-module/src/main.zig" },
});
exe.addModule("zig-module", some_module);

Also explain when or why to use installArtifact to the documentation.

And also add an example of the addRunArtifact to the documentation.. (and explain when/when to use)?:

const run_cmd = b.addRunArtifact(exe);
run_cmd.step.dependOn(b.getInstallStep());

const run_step = b.step("run", "Run the app");
run_step.dependOn(&run_cmd.step);
mlugg commented 9 months ago

Adding as a comment here for clarity/tracking: it has been decided to rename the compiler's internal Module type (as in src/Module.zig) to Zcu, standing for "Zig Compilation Unit". Similarly, instances of this type should be renamed from mod to zcu.

The transition to these new names began in fe87bae. New code should follow these naming schemes where possible (importing Module.zig under the name Zcu).

notcancername commented 9 months ago

Would this cause confusion with the concept of a compilation unit in C?

mlugg commented 9 months ago

The use of the term "compilation unit" is intentional because it's referring to exactly the same concept: a Zcu represents a collection of Zig sources being compiled into a single object file. The difference compared to C is that in the latter, each source file generates a separate object, and hence each C source file is a separate compilation unit.

slonik-az commented 9 months ago

and hence each C source file is a separate compilation unit.

Sorry to nitpick, but C pre-processor #include macro can muddy the waters and merge several *.c files into one compilation unit. Famous examples being "amalgamated builds" where the entire code-base is processed as a single CU, SQLite is famous for it.

Tythos commented 8 months ago

The isomorphism that exists within Python is one of the more outstanding and adoption-friendly features. Specifically, a file is a module and a folder of modules is a package. Do not underestimate the utility of defining a relationship between file structures and logical scope; its absence is by far one of the weakest features of the C family. (For what it's worth, I avoid subpackages--eg subfolders within a package--but understand there are usecases where they make sense. Identifying build artifacts exclusively at the top level is great and a feature of zig build scripts, but the dependency-to-import process remains needlessly onerous.

slonik-az commented 8 months ago

Specifically, a file is a module and a folder of modules is a package.

Not quite. In Python, package is a directory with __init__.py file in it. This is analogous to Zig module's root.