ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

@tryImport #8025

Open marler8997 opened 3 years ago

marler8997 commented 3 years ago

Currently we have @import:

@import(comptime path: []const u8) type

Adding @tryImport provides a solution to the build.zig bootstrap problem (see more details on this problem below).

@tryImport(comptime path: []const u8) ?type

@tryImport works the same way as @import except that, if the module isn't found, it returns null rather than emitting a compile error.

This can be used in a build.zig file that requires other modules/packages that may not be available the first time build.zig is compiled:

build.zig:

const coolbuildlib = @tryImport("coolbuildlib") orelse {
    @compileError("Error: coolbuildlib is not available, please specify it by providing --pkg-begin coolbuildlib PATH --pkg-end.");
};

Here's an example where build.zig detects when coolbuildlib is missing, resolves it, then causes itself to be recompiled and reinvoked:

build.zig:

pub fn build(b: *Builder) !void {
    if (@tryImport("coolbuildlib")) |coolbuildlib| {
        //
        // insert normal build logic using coolbuildlib
        //
    } else {
        // this causes build.zig to be recompiled and reinvoked
        const next_build = b.addRecompileBuild();
        next_build.addPackage(.{
            .name = "coolbuildlib",
            .path = findCoolBuildLib(),
        });
    }
}

What is the build.zig bootstrap problem?

The build.zig bootstrap problem occurs when build.zig requires one or more packages. The problem is that the first thing zig build needs to do before anything else is compile build.zig. If build.zig needs one or more packages before it can be compiled then there's no way build.zig can assist zig build in how to resolve/find these packages.

I've seen various solutions to address this issue such as

  1. supplemental files alongside build.zig
  2. non-zig configuration inside build.zig
  3. custom tools that need to run before zig build

These solutions have their own problems because it means build.zig is unable to fully facilitate the build like it's meant to. Users must defer to other tools to help get build.zig going. It also means that when a dependency is enabled/disabled by an option in build.zig, this option now needs to be implemented in multiple places and forwarded between the bootstrap solution and the build.zig file. @tryImport gives build.zig the power to handle this initial state of the build when its dependent packages have not been configured/setup yet.

A practical example

This example demonstrates why @tryImport excels where alternate solutions break down:

build.zig:

const std = @import("std");

pub const Ssl = enum { openssl, schannel };

pub fn build(b: *std.build.Builder) !void {
    //
    // Note that this build option affects what dependencies build.zig has.
    // If a non build.zig solution were required, then this option would need to be
    // duplicated and forwarded to build.zig properly.  With @tryImport this is no longer
    // a problem.
    //
    const ssl = b.option(Ssl, "ssl", "select the SSL backend") orelse
        (if (std.Target.current.os.tag == .windows) .schannel else .openssl);

    switch (ssl) {
        .openssl => {
            // NOTE: here we only import the opensslbuild package if
            //       build.zig was configured to use the openssl backend!
            if (@tryImport("opensslbuild")) |opensslbuild| {
                // we now have access to the opensslbuild package and can use
                // any function/data from it
                opensslbuild.addBuildStuff(b);

                // ...
                opensslbuild.linkTo(some_exe);
            } else {
                const next_build = b.addRecompileBuild();
                next_build.addPackage(.{
                    .name = "opensslbuild",
                    // an example of a theoretical package manager module in std
                    .path = join(std.package_manager.resolvePkg("openssl"), "build.zig"),
                });
            }
        },
        .schannel => {
            //...
        },
    }
}
mlarouche commented 3 years ago

It would be cool if this proposal were combined with a way to list files at comptime. It would help if you have code that is covered by an NDA (e.g. a game console SDK) and should only be included if the folder is actually present on disk.

BinaryWarlock commented 3 years ago

Yeah, a @listDir would be great as well as a @tryEmbedFile
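A sketch of how these hypothetical builtins could combine with @tryImport for the NDA use case above (none of @tryImport, @listDir, or @tryEmbedFile exist; the SDK paths and function names are made up for illustration):

```zig
const std = @import("std");

// Hypothetical builtins only: @tryImport and @tryEmbedFile are proposals.
pub fn build(b: *std.build.Builder) void {
    // Wire up the proprietary targets only if the NDA-covered SDK
    // folder is actually present on disk.
    if (@tryImport("console_sdk/build.zig")) |sdk| {
        sdk.addConsoleTargets(b);
    }

    // Embed a signing key only when it exists; build unsigned otherwise.
    const maybe_key: ?[]const u8 = @tryEmbedFile("keys/release.key");
    _ = maybe_key;
}
```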

ghost commented 3 years ago

Actually, it would be nice to have a systematic way of catching compile errors. That would take care of both @tryImport and @tryEmbedFile, and more besides. With metaprogramming in particular, it could be quite a lifesaver. E.g., in a recent issue I had the problem of how to implement a recursive deinit method for a generic container. The idea was to try and see if the child type has a recursive deinit method and call that, otherwise call an ordinary deinit method, or do nothing if the type has neither (or doesn't support methods at all, such as an integer). A rigorous solution to that problem would require some fairly involved compile-time introspection, but if compile errors could be recovered from, the method could be written in a very short and understandable way:

fn deinitAll(this: *@This()) void {
    for (this.items) |x|
        x.deinitAll() comptimecatch x.deinit() comptimecatch {};
    this.deinit();
}

On the other hand, this sort of thing might open an amazing can of worms...

marler8997 commented 3 years ago

@zzyxyzz I proposed this in 2019 (https://github.com/ziglang/zig/issues/3144) and realized there was already a proposal (https://github.com/ziglang/zig/issues/513). I posted my thoughts there, but it doesn't look like there's been much discussion on it.

EDIT: my thoughts as of today are that such a construct is very powerful. I'd value the opinion of someone more familiar with Zig's semantic analyzer implementation to weigh in on the cost of such a feature. Not having this feature has forced me to find alternative solutions, all of which I've been able to do with Zig's metaprogramming, that is, until I needed @tryImport.

ghost commented 3 years ago

I proposed this in 2019 (#3144) and realized there was already a proposal (#513).

Thanks, I wasn't aware of that. Though I think your @compiles proposal is not really a duplicate of #513, but much better. I mean, doc comments? Really?

You're right that there's little you literally couldn't do without such a feature, but since duck-typing and (comptime) dynamic introspection is Zig's chosen route to generic programming, it would make many things way, way easier.

SpexGuy commented 3 years ago

Looks like C++17 standardized a similar __has_include macro. I like this tryImport formulation better because it fails the compile if the file exists but isn't valid zig, so it doesn't let you arbitrarily inspect whether files exist on the host computer and recover from failure. That said, I'm not sure that we need this feature. The bootstrapping problem as formulated above isn't very convincing to me.

The problem is that the first thing zig build needs to do before anything else is compile build.zig. If build.zig needs one or more packages before it can be compiled then there's no way build.zig can assist zig build in how to resolve/find these packages.

IMO build.zig files shouldn't have many dependencies. Their standard model for doing complex things should be to invoke other programs, which they may also have built. For example, compilation and translate-c steps invoke the zig compiler as a subprocess instead of linking with it. This should be the model that other build processes use as well. If a build script needs to do something complex like download a file or generate some code, it should (build, cache, and) invoke a separate program to do that. In addition to solving the bootstrap problem, this also makes it much less likely that the top-level build.zig process will crash or be OOM killed by the OS.
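A minimal sketch of that subprocess model with the std.build API of the time; the tool paths, step wiring, and the "codegen" program are illustrative assumptions, not from the thread:

```zig
const std = @import("std");

pub fn build(b: *std.build.Builder) void {
    // Build the helper program first...
    const codegen_tool = b.addExecutable("codegen", "tools/codegen.zig");

    // ...then invoke it as a subprocess, instead of importing its logic
    // into the build.zig process itself.
    const run_codegen = codegen_tool.run();
    run_codegen.addArgs(&[_][]const u8{ "--out", "generated.zig" });

    // The application build depends on the generated output.
    const exe = b.addExecutable("app", "src/main.zig");
    exe.step.dependOn(&run_codegen.step);
    exe.install();
}
```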

marler8997 commented 3 years ago

By not solving the bootstrap problem, we are limiting build.zig to what's available in the standard library, or the code within its own project. This makes it unnecessarily difficult to create libraries meant to be used by build.zig files. Developers are left to add additional setup and tools in order to get the dependencies they need, or to copy/paste large amounts of code between projects with the burden of keeping them in sync. build.zig files already have many dependencies; they just happen to be in the standard library, but whether or not a dependency is in the standard library should have no bearing on whether or not it should be used in a build.zig file.

IMO the job of the standard library shouldn't be to support everything that a project might need. It should provide the necessary tools to enable projects to meet their own needs. @tryImport is one of those tools. I am constantly running into this bootstrap problem and I've tried many solutions. After a lot of thought I knew this was going to require an addition to the language, but I was very pleased to find such a simple solution. This is a feature that "enables" other projects to stand on their own; without it, we are forcing other projects to come up with their own "ad hoc" solutions, which is going to make them more brittle.

SpexGuy commented 3 years ago

I think iteratively recompiling build.zig is hardly a simple solution. It's extremely complicated even to the person writing the build.zig file, as they have to remember each time which path is running and which decls they can safely use, and they need to make sure that the task graph is set up correctly and fully on the final run. For one single function that only builds a task graph and does no other work, the programmer must track many possible instantiations with different surrounding state. If the graph is partially set up on the initial run, does that carry over at all? Or is it erased? Either way seems problematic. It also makes the build script much more difficult to debug, because you now have more than one process (and executable) participating in building the task graph. We would need a new quota for build.zig rebuilds, to prevent infinite loops of restarting build.zig.

Can you give some more examples of how this problem occurs in real code? Maybe that will help me understand your point of view.

marler8997 commented 3 years ago

The build.zig bootstrap problem is a problem whether or not you find this solution simple. This solution is no more complicated than any other solution, the difference is this one doesn't punt the problem to another tool/configuration/language to solve. Doing this only pushes the complexity elsewhere and cannibalizes build.zig.

Take a look at these 2 projects:

https://github.com/marler8997/ziget https://github.com/marler8997/zigup

ziget requires SSL, and the particular SSL library to use is configured by the user via build options, with a default selected based on the host OS. This configuration determines which dependencies build.zig requires. For example, if we choose the bearssl backend (currently being added here: https://github.com/marler8997/ziget/pull/8), then we can leverage code from that repository to configure the build data structures. In fact @MasterQ32 has already provided functions for this (https://github.com/MasterQ32/zig-bearssl/blob/master/src/lib.zig). As it stands today, I need to copy all this configuration into the ziget repo. There are other examples of this type of thing for the other SSL backends.

Going one level deeper, the zigup repo requires ziget as a dependency, which transitively means it depends on everything ziget depends on, including its build options. However, because we don't have @tryImport, zigup's build.zig cannot import any code from the ziget repository without requiring another tool to run before zig build. Yet zigup's build.zig needs to import code from ziget's build in order to configure its own build. The solution I'm using for now is copy/pasting ziget's build.zig into the zigup repo.

SpexGuy commented 3 years ago

Ah, I see. So you want build.zig to be able to perform an operation that creates a build configuration file (in this case, downloading it from github), and then import that file as runnable code and use it to configure the task graph. I don't like this. If I'm considering using a library, having zig build --help download code from the internet and execute it on my computer is not something I want. I don't think requiring that all code necessary to generate the task graph is already on my hard drive is unreasonable. When the package manager is created, it will need to handle this and make sure all dependencies are downloaded before building build.zig. It will also need to expose the build.zig files (or package.zig or whatever we settle on for package imports) to the root build.zig. But given that functionality, I don't see a use for tryImport here.

marler8997 commented 3 years ago

Ah, I see. So you want build.zig to be able to perform an operation that creates a build configuration file (in this case, downloading it from github), and then import that file as runnable code and use it to configure the task graph.

Nope. build.zig doesn't download anything. All it does is look for the dependency, and if it can't find it, it provides an error message telling the user it can't find the dependency. It also prints a git command the user can copy and run if they so choose, to download the exact repo/branch/sha that is known to work. And the key piece here is that if the build configuration does not require the package, then failure to import it is completely ignored.

Automatic downloading and managing of dependencies brings in a whole new can of worms that I will leave to the future package manager. Here we're only talking about the bootstrap problem, so we can save time by limiting our discussion to that problem alone and how to address it.
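That error-message flow might look roughly like this inside build.zig (hypothetical builtin; the URL and sha are placeholders, not real values from the thread):

```zig
// Sketch only: @tryImport is the proposed builtin; the clone URL and
// <sha> are illustrative placeholders.
if (@tryImport("coolbuildlib") == null) {
    std.log.err(
        \\missing dependency 'coolbuildlib'; to fetch the known-good version, run:
        \\    git clone https://example.com/coolbuildlib deps/coolbuildlib
        \\    git -C deps/coolbuildlib checkout <sha>
    , .{});
    return error.MissingDependency;
}
```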

You say the project's use case here would be solved by a package manager, but if you think it through you will find it actually isn't. This problem still exists and the rock is just getting kicked down the road. Let me explain (again, I've thought about this quite a bit, so sorry for the long story).

It should be mentioned that programming dependencies are one of the most deceptively hard problems in computer science. It takes years to learn about all the problems and solutions and the caveats to those solutions. Like I've mentioned, the build.zig bootstrap problem is still a problem whether or not there is a package manager. The package manager is not a magic wand that makes the problem go away. I've enumerated the various solutions to the problem but let's go through them to see where the issues lie.

Story Time

Let's say we decide to use a "configuration file" to declare our dependencies. Taking the "ziget" project, we know we're going to have to provide a way to make dependencies optional, based on the build configuration. The logic to declare these dependency conditions will need to be robust, and the logic will need to be duplicated exactly in the build.zig file. Having to keep the two in sync is enough for some to give up, but we know we're up for the challenge.

In the "ziget" project, one dependency may be required if, say, "openssl" or "wolfssl" is selected, and another dependency may only be required if both "bearssl" is enabled and we are building for Windows. So we implement all this logic, and we're careful to keep the logic in the configuration file in sync with the logic in build.zig. By that I mean build.zig is careful only to import the dependencies based on the exact same logic that was used to enable them in our dependency configuration file.

Then we come to find out that another project needs to base its dependency configuration on the target's CPU architecture. So we enhance our configuration file to expose what CPU architecture we are compiling for, and we remain vigilant about keeping our configuration file logic in sync with our build.zig file. I'm sure discrepancies between these 2 different files will never get past code review.

Then someone files an issue indicating they need to change their dependencies based on which Linux distribution they are on. So we go back to the drawing board and add support for representing Linux distributions in our configuration file. We design a very clever way to represent all the possible Linux distributions in the world along with every possible versioning mechanism they use, and again, we're careful to keep this logic in sync with build.zig. The PR for this new feature is only 400 lines of code, but in the PR we're met with some resistance.

Someone suggests we take a step back and evaluate the scalability of this whole configuration file. They argue that there's always going to be a case that the configuration file doesn't support, so they suggest a radical idea... what if, instead of a configuration file, we use Zig to generate our dependency declarations! We'll call it dependencies.zig. Then someone else chimes in and asks: what if dependencies.zig needs to import dependencies? Isn't it obvious? We just add a configuration file to declare the dependencies of dependencies.zig!

As you can see, we've entered a never-ending loop here, and you may say: but wait, isn't it ridiculous to think that someone would want to use a dependency in code that is meant to generate dependencies for other code? Well, by the time we need dependencies in build.zig, we're already 2 levels deep, since build.zig is the one configuring package dependencies for the code it is meant to configure. And it's not hard to imagine that if we needed to declare some optional dependencies of build.zig, the code declaring these dependencies might want to use a nice library that helps with detecting the CPU architecture, or the Linux distribution, or features of the current Windows operating system. We're 3 levels deep here, which means we are declaring dependencies on our dependency-generator code for build.zig. Kicking the rock further down the road begets more rock kicking, and we enter the realm of the ridiculous.

Welcome to the world of "bootstrapping". This is the same problem that programming languages and operating systems have to address. It's complicated, and our brains never evolved to comprehend the recursive nature of it. That being said, I want to stress that I have thought about this problem quite a bit, and @tryImport is by far the most elegant solution I've come up with. It's very simple and it allows build.zig to be the last "turtle" if and when it needs to be. Even if we decide that a package manager will always run before build.zig, the package manager configuration will inevitably need to be as powerful as build.zig, and will itself be able to make use of its own dependencies, which means it also needs some means of being bootstrapped and could make use of @tryImport as well.

andrewrk commented 3 years ago

@marler8997 thanks for tackling this problem. It's clear that you've put a lot of thought into it, and have some insight that is far from obvious at first glance.

I want to share the status quo plan that @thejoshwolfe and I came up with for the package manager and how it relates to build.zig. Just to put it on the table, and then we can inspect it, and see how it fares given the problem statement in this issue/proposal.

The idea is as follows:

Alongside build.zig there is a declarative configuration file. I don't want to bikeshed the file format at this time, but for now let's call it zig-packages.json. This file lists all possible dependencies. The logic inside build.zig still has the power to select, detect, and deal with optional packages. However, the point of this non-Turing-complete list of the set of all packages is that it makes it possible to construct a dependency graph of packages without invoking build.zig. Such a graph would be interesting to look at, for example, on a package repository index website. Additionally, it is possible for the package manager to optimistically download everything that might be needed: perhaps you want to take your computer away from the Internet for a while and get everything you might need, so you run zig pkg fetch.

Inside zig-packages.json there is a section for "build dependencies". Such packages are always fetched and made available to the corresponding build.zig script.

Additionally, there is a section for "package fetching plugins", which provide additional ways for the package manager to fetch zig packages. For example there might be one for ipfs, and that makes ipfs URLs work within the zig-packages.json file.

If a build.zig script tries to add a package that has not been declared in zig-packages.json, it fails. All packages must be declared ahead of time as possibly depended upon.

I considered the dependency tree of zigup and ziget and I don't see a problem with this approach. This simple system can model them just fine. ziget would declare both of the ssl options inside zig-packages.json and the build.zig logic would choose which one to activate based on user config or the default. Since the bearssl package has utility inside build.zig, it is additionally listed within zig-packages.json as a build dependency, and therefore always provided to the build.zig script. Where's the bootstrapping problem? We don't need any fancy logic inside the configuration file.
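As a sketch of the idea (the zig-packages.json name comes from the comment above, but this schema is entirely made up for illustration), ziget's declarative file might look like:

```json
{
  "dependencies": {
    "openssl": { "url": "...", "optional": true },
    "bearssl": { "url": "...", "optional": true },
    "zigwin32": { "url": "...", "optional": true }
  },
  "build-dependencies": {
    "zig-bearssl": { "url": "..." }
  }
}
```

Here the "dependencies" entries are the possible packages build.zig may activate, while "build-dependencies" are always fetched before build.zig is compiled.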

marler8997 commented 3 years ago

Since the bearssl package has utility inside build.zig, it is additionally listed within zig-packages.json as a build dependency, and therefore always provided, to the build.zig script.

@andrewrk I don't think this is an acceptable solution. You're saying that I need to download all possible package dependencies that build.zig imports, regardless of whether or not they are being used. So if I'm building ziget with the openssl backend, I still need to download the bearssl build library along with the 80 MB zigwin32 build library used for schannel on Windows and N other build libraries that I don't need for my particular build configuration. And you're giving me no way to get around this without adding a tool that runs as a precursor to zig build and the proposed package manager. Do you find this acceptable?

EDIT: @mikdusan came up with another good example: Qt. Imagine an application that can optionally build with a Qt backend, and someone has packaged up Qt with a nice zig build library, but the package is very large, say 5 gigabytes. Are you saying you want to require that everyone always downloads the huge 5-gigabyte Qt package even when they aren't using it? Another good example is Android. Maybe a project wants to support Android and wants to use a build library from the zig-android project which includes everything needed to build Android apps. Should zig require that everyone download the zig-android project even if they never use it, and even if it is very large, multiple gigabytes?

Also keep in mind that if we accept this behavior that all package dependencies of build.zig must be downloaded regardless of if they are needed, then this has a recursive effect. My ziget project's build.zig file depends on code from zig-bearssl. Well suppose zig-bearssl decided to add support for Android. Now suddenly my project depends on Android even though I don't even support it, and now users have to download a multi-gigabyte SDK to build my project even when they aren't even building it with bearssl!

andrewrk commented 3 years ago

You're saying that I need to download all possible package dependencies that build.zig imports, regardless of whether or not they are being used.

Yes, if you use them in the build.zig script.

So if I'm building ziget with the openssl backend, I still need to download the bearssl build library along with the 80 MB zigwin32 build library used for schannel on Windows and N other build libraries that I don't need for my particular build configuration.

In this case, it sounds like perhaps you took on a bulky dependency tree for your build.zig script. Perhaps there could be a less bulky package to accomplish the same thing, since it doesn't seem the right fit for build.zig to import.

Also, I really don't think downloading a few hundred MiB is that big of a deal, compared to language additions. I'm definitely willing to spend disk space and network access - within reason - for a simplified development environment.

And you're giving me no way to get around this without adding a tool that runs as a precursor to zig build and the proposed package manager. Do you find this acceptable?

I think we are in agreement here. I am against any such tool. The only experience I find acceptable is that you run zig build and it builds. Including fetching any dependencies.

packaged up QT with a nice zig build library

Are you trying to make a GUI build.zig configuration wizard? In this case you actually do depend on those 5 GB right? If not, why would you make QT a build dependency? Put it in the list of possible dependencies and have your build.zig logic decide whether or not it gets depended on. Possible dependencies don't necessarily get downloaded unless they are enabled in the build.zig logic.

Don't make a 5 GB package that is intended to be imported and used directly by build.zig logic.

My ziget project's build.zig file depends on code from zig-bearssl. Well suppose zig-bearssl decided to add support for Android. Now suddenly my project depends on Android even though I don't even support it, and now users have to download a multi-gigabyte SDK to build my project even when they aren't even building it with bearssl!

ziget would have a build dependency on zig-bearssl. zig-bearssl will have a possible regular dependency on an android package. zig-bearssl build.zig script will have logic that checks whether the chosen target is android, optionally enabling the dependency. When building ziget on non-android, the target given to zig-bearssl build.zig script will be non-android, and so the android package does not need to be fetched.

daurnimator commented 3 years ago

Also, I really don't think downloading a few hundred MiB is that big of a deal, compared to language additions

There are many other reasons than download size to not want to have optional dependencies around, e.g.

marler8997 commented 3 years ago

@andrewrk in the Android example it's build.zig that needs to import the Android library. So based on your package manager design, we have to add android to our list of build.zig dependencies which means we always have to download the "android build sdk", and what's worse, anyone who depends on our project will also ALWAYS need to download the android SDK, even if they don't support Android.

zig-bearssl will have a possible regular dependency on an android package. zig-bearssl build.zig script will have logic that checks whether the chosen target is android,

Making Android an optional dependency is exactly what I'm asking for but your design currently makes this impossible. Android is a dependency of build.zig (not just the application code itself), therefore by your own design it must ALWAYS be downloaded. There's no build option to disable this.

Vexu commented 3 years ago

This feature has been implemented in #8033 and alternatively in #8072, which can be used if this is accepted.

iacore commented 3 years ago

I don't think a build tool should be able to download files. See CMake. You can already run a shell script for downloading files.

Avokadoen commented 2 years ago

I would like to have this feature right now. I have a project that does some simple unit testing in a CI. I have some heavy dependencies that I do not need for tests, and the obvious but sad fix right now is to make the CI pull submodules that are not even needed, just to fix missing files on import.

InKryption commented 1 year ago

Idea: instead of having this return ?type, make it return type like normal, but take a second parameter that has to be a type definition, which would be returned if the named package wasn't found. Sort of like a fallback definition, so the compiler could still always resolve the import.

The proposed behavior is then also still possible, but does have some added friction:

const maybe_foo: ?type = blk: {
    const foo = @tryImport("foo", struct {});
    break :blk if (@hasDecl(foo, relevant_decl_name)) foo else null;
};

Could also just change @import so that it takes an optional second parameter. I think it would be a good idea to require that the second parameter be a literal struct definition, and not a value calculated at comptime, much like the import path string is restricted to being a string literal.
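Under that variant, a config package with defaults might look like this (hypothetical syntax; nothing here exists in Zig today, and the decl names are made up):

```zig
// Hypothetical two-parameter @import: the second argument is a literal
// struct definition used as a fallback when no "build_options" package
// was provided on the command line.
const config = @import("build_options", struct {
    pub const enable_tracing = false;
    pub const log_level = .warn;
});
```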

I'd also like to add: this would empower another use-case I've had in mind, which is the ability to make "config packages" less painful to use. Right now, if you @import a package, you force anyone using zig test or the like to pass --pkg-begin, and create an actual file to pass to it, which can be a real pain. As well, it would allow us to move away from the @import("root") pattern, which can be noisy and hard to document, and makes other libraries more susceptible to having name collisions.

KilianHanich commented 1 month ago

Are you trying to make a GUI build.zig configuration wizard? In this case you actually do depend on those 5 GB right? If not, why would you make QT a build dependency? Put it in the list of possible dependencies and have your build.zig logic decide whether or not it gets depended on. Possible dependencies don't necessarily get downloaded unless they are enabled in the build.zig logic.

You will likely always need Qt as a build dependency because of the way Qt works.

Qt has something called the "Meta Object Compiler" (a code generator which generates code based on your source code) and a few other tools which work at build time. And you will need to teach the build system how to generate code with these tools.

For example, let's say you have a window class called MyWindow (or any QObject-derived type, which means everything that is supposed to use Qt's signal and slot mechanism, and a few more things). So you have the file MyWindow.hpp. You define your functions in MyWindow.cpp. The MOC will take MyWindow.hpp and generate its output based upon that (likely named moc_MyWindow.cpp), which then needs to be given to a C++ compiler. The reason the MOC needs to run is that, for QObject-derived classes to work correctly, the first thing in the class definition must be the Q_OBJECT macro, which declares a few member functions, mostly for runtime reflection, but also for the Signal and Slot mechanism (the Signal and Slot mechanism kinda depends on reflection, and the event loop depends on the Signal and Slot mechanism).

Sure, Qt is a special case here, but a lot of GUI toolkits have some code generator which creates code based upon some interface specification file (often XML; yes, Qt also has such a tool, whose output then needs to be given to the MOC).

The only way to make this work would be to make these kinds of tools separate packages, but if you want them to be built from source, well, they often depend on the framework they are there for (in a way that isn't strictly necessary for the tools themselves to work, but you still end up with a cyclic reference this way).