Closed andrewrk closed 1 year ago
Ok, this isn't quite as bad as I was expecting. Though of course we'd have to provide a reference parser to be taken seriously, and if we write that in Zig, we'd have to make sure it itself has no dependencies.
Also I think removing the dot from dotbrace really would cause parsing problems. Is {}
an empty tuple or a void
constant? Is { key = "value" }
a struct-like tuple or an assignment?
Maybe I missed something but what is the main profit of using this own DSL (not supported by any tool)?
In yaml:
name: libffmpeg
version: 5.1.2
dependencies:
libz:
url: https://github.com/andrewrk/libz/archive/f0e53cc2391741034b144a2c2076ed8a9937b29b.tar.gz
hash: c9b30cffc40999d2c078ff350cbcee642970a224fe123c756d0892f876cf1aae
libmp3lame:
url: https://github.com/andrewrk/libmp3lame/archive/497568e670bfeb14ab6ef47fb6459a2251358e43.tar.gz
hash: 9ba4f49895b174a3f918d489238acbc146bd393575062b2e3be33488b688e36f
@deflock In short, no existing tool satisfies all our requirements. For more detail, see the linked PR.
YAML is too complicated and has unacceptable footguns.
@EleanorNB I've scrolled one more time and I see why json/csv do not satisfy, s-expressions in imperative language is a bit weird :) So remaining candidates are toml, yaml, own format. I see only one profit of using own format - if it will be used directly within build.zig
so you don't need additional file for declaring dependencies, but it's not possible bc of declarative reasons.
@deflock that's a nice point in favor of this proposal that I hadn't considered - similar to how in node.js you can require("foo.json")
, it could make sense to support @import("foo.zon")
in Zig which would give you comptime access to the data.
Another point in favor of this proposal over the TOML one: #14291. Note how line diffs against mirrors field (an array) would be annoying in the TOML case due to both elements being on the same line.
I think this is much better than TOML subset.
Upsides:
@import
as it is valid zig@import
, like this.{
.url = "https://github.com/andrewrk/libz/archive/f0e53cc2391741034b144a2c2076ed8a9937b29b.tar.gz",
.hash = "c9b30cffc40999d2c078ff350cbcee642970a224fe123c756d0892f876cf1aae",
}
.{
.name = "libffmpeg",
.version = "5.1.2",
.dependencies = .{
.libz = @import("libz.lulz"),
.libmp3lame = .{
.url = "https://github.com/andrewrk/libmp3lame/archive/497568e670bfeb14ab6ef47fb6459a2251358e43.tar.gz",
.hash = "9ba4f49895b174a3f918d489238acbc146bd393575062b2e3be33488b688e36f",
},
},
}
build.zig
could have a package
function which would return a struct like this, this can be used to generate the .lulz file with zig code, which leads to the next pointzig
.Downsides:
.{
but maybe that dot could be removed, even if I personally like itPlease, don't use YAML, every time I work with that, I lose my sanity over some weird thing
why do we have this
.dependencies = .{
.libz = .{
.url = "https://github.com/andrewrk/libz/archive/f0e53cc2391741034b144a2c2076ed8a9937b29b.tar.gz",
.hash = "c9b30cffc40999d2c078ff350cbcee642970a224fe123c756d0892f876cf1aae",
},
instead of this?
.dependencies = .{
.{
.name = "libz",
.url = "https://github.com/andrewrk/libz/archive/f0e53cc2391741034b144a2c2076ed8a9937b29b.tar.gz",
.hash = "c9b30cffc40999d2c078ff350cbcee642970a224fe123c756d0892f876cf1aae",
},
does the former actually work in Zig if there's no struct with a libz: Dependency
field defined in it? We don't have hashmap literals in Zig, right?
I would also like to vote against the name "lulz", as that word is often associated with mob harassment and racism.
I would also like to vote against the name "lulz", as that word is often associated with mob harassment and racism.
Alright fine, I changed the proposal to the much more conservative and obvious choice of .zon
.
Alright, enough deliberation. Moving forward with this one; rejecting #14289.
This reminds me of https://github.com/ron-rs/ron.
I don't know if the parsing would be problematic or not though, would need to investigate.
That wouldn't be a problem as ZON is a data format, and we only need .{}
(a struct/tuple) to be distinguished from a {}
(block) in Zig, as we can execute code that might break data from the block.
We don't need to execute that here. But imho, for the sake of consistency, i'd prefer .{
over {
, so i can copy-paste the code into Zig.
@thejoshwolfe: I actually prefer .libz = .{
over .{ .name = "libz"
, as the key syntax implies that the names have to be unique, due to fields have to be unique. The second syntax doesn't convey that information.
I proposed https://github.com/ziglang/zig/pull/14265#issuecomment-1380915434, this proposal alignes to my needs, more or less idea I envisioned. I upvote for them.
The only thing I dislike is build.zig.zon
. If we already have custom format called Zig Object Notation just build.zon
is better.
build.zig
and build.zon
❤️ perfect match.
It looks and feels very much like improved JSON format. Very structered and understandable, but with comments.
Zon of a gun, this is brilliant ❤️
Will comments be //
?
Another point in favor of this proposal over the TOML one: https://github.com/ziglang/zig/issues/14291. Note how line diffs against mirrors field (an array) would be annoying in the TOML case due to both elements being on the same line.
With the ini format you could have duplicate url
keys though, which would achieve the same.
@thejoshwolfe: I actually prefer
.libz = .{
over.{ .name = "libz",
as the key syntax implies that the names have to be unique, due to fields have to be unique. The second syntax doesn't convey that information.
While I agree that the syntax is nicer, I also believe that if a Zig data format is chosen, it makes sense that the schema can be described by a Zig type and that the data can be loaded directly into said type. Using the key syntax to describe a dynamic map kinda loses both of those properties
At first I thought this was a new format with a subset of features of normal Zig but as far as I understand it it's just a regular Zig file with a single struct expression that can be imported, which I like very much.
But why do we have to call it build.zig.zon
or build.zon
? It looks kind of convoluted when we could just call it packages.zig
? This could be the standard name for the file containing a struct expression with the packages listed.
build.zig.zon
or more recently build.zon
indicates to me it is a new separate format (Zig Object Notation) where I can't expect all features from Zig to work, even though that's not the case because it is in fact a normal Zig file (or did I get that wrong?).
build.zon
sounds like build.zig
; a file that builds my project, but written in "Zig Object Notation"...? Seems rather confusing.
IMO build.zig
and packages.zig
would be a much clearer and simpler naming.
If however this is actually supposed to be a new format with only a subset of features of normal Zig, why? Why do we need to create a new format when a normal Zig file could also contain this struct expression listing the packages? It seems like this way there will always be new requests to port features from normal Zig over to this format, duplicating a lot of work.
@r00ster91 I assume it's because zig build --build-file <path>
which means you can have multiple build files in the same directory.
Edit: to clarify, commenting on the presumed reason for the naming standard
To clarify, this already works today:
packages.zig
.{
.name = "libffmpeg",
.version = "5.1.2",
.dependencies = .{
.libz = .{
.url = "https://github.com/andrewrk/libz/archive/f0e53cc2391741034b144a2c2076ed8a9937b29b.tar.gz",
.hash = "c9b30cffc40999d2c078ff350cbcee642970a224fe123c756d0892f876cf1aae",
},
.libmp3lame = .{
.url = "https://github.com/andrewrk/libmp3lame/archive/497568e670bfeb14ab6ef47fb6459a2251358e43.tar.gz",
.hash = "9ba4f49895b174a3f918d489238acbc146bd393575062b2e3be33488b688e36f",
},
},
}
build.zig
const packages = @import("packages.zig");
// ...
So can't we go from here?
So, yes, you can give the packages.zig
a fancy name such as "Zig Object Notation" but it's still simply a struct and works in regular Zig so I don't see the point of giving it a different file extension.
I assume it's because zig build --build-file
which means you can have multiple build files in the same directory.
I'm not quite sure how this relates to the need of a new object notation.
I'm not quite sure how this relates to the need of a new object notation.
It doesn't, was just commenting on chosen file naming standard, which allows the zon file to be found automatically even with the build file flag + multiple build files in the same directory.
To clarify, this already works today:
packages.zig
.{ .name = "libffmpeg", .version = "5.1.2", .dependencies = .{ .libz = .{ .url = "https://github.com/andrewrk/libz/archive/f0e53cc2391741034b144a2c2076ed8a9937b29b.tar.gz", .hash = "c9b30cffc40999d2c078ff350cbcee642970a224fe123c756d0892f876cf1aae", }, .libmp3lame = .{ .url = "https://github.com/andrewrk/libmp3lame/archive/497568e670bfeb14ab6ef47fb6459a2251358e43.tar.gz", .hash = "9ba4f49895b174a3f918d489238acbc146bd393575062b2e3be33488b688e36f", }, }, }
build.zig
const packages = @import("packages.zig"); // ...
So can't we go from here?
So, yes, you can give the
packages.zig
a fancy name such as "Zig Object Notation" but it's still simply a struct and works in regular Zig so I don't see the point of giving it a different file extension.I assume it's because zig build --build-file which means you can have multiple build files in the same directory.
I'm not quite sure how this relates to the need of a new object notation.
Why not having full-blown language has been answered in original post, see this https://github.com/ziglang/zig/pull/14265, and my proposal https://github.com/ziglang/zig/pull/14265#issuecomment-1380915434
Antlr4 grammar for status quo: https://github.com/cryptocode/zon-grammar (assuming initial dot, will adjust grammar as the proposal progresses)
I actually prefer
.libz = .{
over.{ .name = "libz"
, as the key syntax implies that the names have to be unique, due to fields have to be unique. The second syntax doesn't convey that information.
Do keys support "unusual" characters?
.lib-z@2.x = .{
.url = "",
}
.{
.name = "lib-z@2.x",
.url = "",
}
const libz2_dep = b.dependency("lib-z@2.x", .{});
See arbitrary identifier syntax @""
:
https://ziglang.org/documentation/0.10.0/#Identifiers
Assuming this feature is part of the subset, "unusual" identifiers shouldn't be an issue.
@r00ster91
To clarify, this already works today:
const packages = @import("packages.zig");
const std = @import("std");
pub fn main() !void {
std.debug.print("name: {s}\n", .{packages.name});
}
$ stage3/bin/zig run test-zon.zig
test-zon.zig:5:46: error: root struct of file 'packages' has no member named 'name'
std.debug.print("name: {s}\n", .{packages.name});
~~~~~~~~^~~~~
Looks like a parsing bug to me. I expected not even zig fmt
to succeed on packages.zig
.
Okay, I suppose it does need to be a pub const
, at which point I'm thinking why do we even need a separate file (format) for that when we could just list our dependencies within build.zig
as pub const packages = .{};
or similar (and from there the user can still choose to put only the package list in a separate Zig file and then import that, maybe as pub const packages = @import("packages.zig").list;
), so build.zig
just has to contain a public constant called packages
and then zig build
will use that name to find the list of packages to install.
It seems like a flexible solution where the user can choose to list their dependencies in their build.zig
or in a separate file (if maybe the community decides that would be the standard way to do it).
YAML is too complicated and has unacceptable footguns.
Take it from me, I implemented a basic YAML parser just for the sake of parsing Apple's TBD files and the parser got quite complicated while only caring about machine-generated dylib stubs. FWIW I am in favour of this proposal.
Looks like a parsing bug to me. I expected not even zig fmt to succeed on packages.zig.
It's being parsed as the type expression of a top-level tuple field.
For reference and why this is better than YAML or TOML, you can read those if you are interested:
I hope we can learn from all this to ensure zon
will not have those problems.
Okay, I suppose it does need to be a
pub const
, at which point I'm thinking why do we even need a separate file (format) for that when we could just list our dependencies withinbuild.zig
aspub const packages = .{};
or similar [...]
This has been discussed a couple of times: the issue is that we want to be able to resolve transitive dependencies without actually performing semantic analysis on build.zig. This is for performance reasons: if we depend on package foo, and foo depends on bar, we want to start fetching bar as soon as possible. If we use zon, we can simply grab this file, parse it, and begin any necessary downloads; if we use a decl in build.zig, we need to actually run the compiler on that file, which could get complex.
With this being said, I'm not exactly sure how major of an issue that would be in practice. Semantic analysis is fast, especially with build.zig files generally being pretty small, and we obviously don't need to go through the actual codegen stage for this. The only case where this seems like it would matter is really big dependency hierarchies, and those will probably saturate the network connection with downloads fairly quickly anyway. However, I could be missing some other advantage of knowing the data without running Zig.
what a magnificent first paragraph :star_struck:
@r00ster91
quoting myself:
There is a lot of value to be gained in declarative information for certain things. In particular, it's much faster to fetch all transitive dependencies if you can learn what network URLs to go download, or the file system paths to check the cache for, without running a compiler.
Furthermore, third party tools such as package indexes want to understand information about packages without executing untrusted code. Even first-party tools such as zig itself will use this declarative information to provide package management tooling.
The manifest file describes the set of potential dependencies, and the build.zig logic can still pick and choose which ones to use depending on build configuration.
@andrewrk
Regarding speed of fetching transitive dependencies, would that really be a major time sink in any practical use case? build.zig scripts are small and quick to compile, and we only need to go through semantic analysis before we have the info required from that decl. A moment extra before fetching dependencies only really matters for huge dep trees, where you'll quickly end up bottlenecked by the network anyway.
Regarding sandboxing, wouldn't that be solved by #14286?
Consider the following two scenarios:
Repeat until all dependecies fetched.
Scenario 2 is unacceptable, not only from a performance perspective, but also because of:
It's more complicated to do it without a declarative file, and gives worse results. There's no reason to do it whatsoever.
I used to work on NuGet, .NET's package manager. NuGet originally put the entire dependency tree in its packages.config
file. This was a huge mistake. We later migrated to a new format, PackageReference
, where you only list your project's direct dependencies. This migration made a significant difference in developer satisfaction (measured through HaTS).
more complicated logic for fetching
Very true, but, you're offloading this complexity to the developer. .NET developers complained that packages.config
was unmanageable. It was difficult to tell if a dependency could be safely removed. Also, you'd need to understand your dependency tree if you wanted to update a single dependency. Excellent IDE tooling was not enough to offset this pain.
not only from a performance perspective
Performance was a top concern for NuGet's migration: NuGet runs in every .NET build, including Visual Studio's design time builds that power Intellisense. We were able to make this fast by:
@loic-sharma, perhaps your comment was intended for this issue: #14309
At first I thought this was a new format with a subset of features of normal Zig but as far as I understand it it's just a regular Zig file with a single struct expression that can be imported, which I like very much. But why do we have to call it
build.zig.zon
orbuild.zon
? It looks kind of convoluted when we could just call itpackages.zig
? This could be the standard name for the file containing a struct expression with the packages listed.build.zig.zon
or more recentlybuild.zon
indicates to me it is a new separate format (Zig Object Notation) where I can't expect all features from Zig to work, even though that's not the case because it is in fact a normal Zig file (or did I get that wrong?).
build.zon
sounds likebuild.zig
; a file that builds my project, but written in "Zig Object Notation"...? Seems rather confusing.IMO
build.zig
andpackages.zig
would be a much clearer and simpler naming.If however this is actually supposed to be a new format with only a subset of features of normal Zig, why? Why do we need to create a new format when a normal Zig file could also contain this struct expression listing the packages? It seems like this way there will always be new requests to port features from normal Zig over to this format, duplicating a lot of work.
great!
YAML is too complicated and has unacceptable footguns.
json5 is the best.
From https://github.com/ziglang/zig/pull/14265:
- json5 exists but a lot of tooling does not expect it. and what's the extension? .json5 or .json? either one is a bit problematic.
I think it is pretty clear/decided that we will prefer to use some form of zig to define package informations (and not toml, yaml...), but there are a few way to do that, and I think this is what has to be decided now.
For the extension, I prefer just .zon
because in my mind, dual extensions is for templates. foo.html.heex
means an html template written in elixir templating language. build.zig.zon
would mean a build.zig
file will be generated when running the zon
file. But if we use zon
we have to justify the differences with normal zig. I mentioned rust object notation (ron) which is quite different from rust. But zig being simple, maybe we do not need to subset it.
The other possibility we did not discuss is to simply have package
function in the build.zig
file, which would return a datastructure like the one proposed. Of course this requires the sandboxing to be implemented. While I mention this, I am not in favor of it, as it prevents automation like zig package add <name>
which would edit the .zon
file.
I'll try to summarize the options:
package()
function in build.zig
that return the package information (approach used by elixir
mix.exs
).zon
file, which can be .zig.zon
or just .zon
, but we have to "justify" the use of a subset of thing and define it (if we have a sandbox, there is little reason to use a subset)..zig
file which would be executed on its own returning the datastructureAlso, the definitive format of this data structure has to be decided, there is #14309 and https://github.com/ziglang/zig/issues/14290#issuecomment-1381285930
Hello. I'm just a random traveller passing through, but I dare to wonder: are you trying to reinvent setup.py
? If this is indeed your plan, then please make it as restrictive, boring and to-the-ground experience as possible, because if sky is the limit during build then you can be sure some guy will reach the orbit.
@kuon this file is for declarative information that needs to be accessible without building and running Zig code. That's why it's a new format and not just a Zig file.
I can't find the comment where andrew explains why this is needed, but the summary is that we want to be able to fetch all package dependencies without requiring Zig execution, both because building and running Zig code is slow and would slow down the recursive fetch significantly, and because other tools may want to deal with package dependencies without needing a dependency on the Zig compiler.
EDIT: oh, found the comment, it was in this issue :) https://github.com/ziglang/zig/issues/14290#issuecomment-1382604540
@silversquirl Yes you are right. But I mentioned this for completeness. I am also in favor of zon
but we have to be sure that it is justified.
Now that I think we have found a good reason for introducing this new file format Zig Object Notation and not using the compiler for this, we can probably talk more about the file naming.
As for the current zig.build.zon
, I agree we should shorten it to just .zon
. Currently to me it means it's a complement to build.zig.zon
that is written in ZON, which makes sense, but I think we can improve this and make it clear what actually is in this file.
I suggest packages.zon
which just makes more sense to me as it is not the Zig Build Object Notation or such but simply ZON meaning it can be used for multiple purposes.
So packages.zon
means it lists our packages in an object notation called ZON.
I think that's better than status quo build.zig.zon
which can be ambiguous. Does it mean it's build.zig
but written in ZON? No. Is it a complement to build.zig
? Yes, but what it contains is still not clear.
I think this stack-up of file extensions is prone to confusion.
So I think packages.zon
would be the most obvious naming and I think it matches the new build system terminology proposed in #14307, too.
Also, it's one character shorter.
An advantage of build.zig.zon is that the file would always show up next to build.zig in listings sorted by file name.
package.zon for defining package and direct (top-level) dependencies, package.lock.zon for fixed transitive (non-direct) dependencies, generated and maintained by package manager only.
Let's reuse knowledge of other package managers (npm as most advanced) which already faced with all kind of subtle bugs and errors.
@likern we shouldn't blindly copy other package managers. Looking at their decisions and why they were made is a good idea, but package managers like cargo, npm, etc. were not designed for Zig and differ quite a lot from Zig's package manager.
Possibly cursed question:
If .zon has the leading .{
, should that imply that it could support a leading type annotation? Maybe in a .zton
file? (Zig Typed Object Notation). Nice implications for if you need to do polymorphic serialization (thinking about use cases totally unrelated to packaging).
@michaelbartnett I think for that you'd use a union-style .{ .foo = .{ ... } }
or .{ .bar = .{ ... } }
Extracted from #14265.
Zig Object Notation
Instead of...
It would look like this:
It also should be renamed with the
.zon
extension, and furthermore,[package]
should be renamed to[project]
to match the correct terminology.Perhaps the
.
in Zig language itself should be removed, and then it could be like this:which is pretty unoffensive if you ask me. Lots of people hate that dot anyway. I don't know if the parsing would be problematic or not though, would need to investigate.
See competing proposal:
14289