Proposal: Forbid inferred error return set on `pub fn`

ikskuh commented 3 years ago

I was thinking about how to make a nicely maintainable library and one thing was pretty obvious:

Function signatures must be stable between minor releases, including error sets. This means that using a inferred error set in a public facing API is a code smell if not a design error.

Thus, i propose to forbid inferred error sets on pub fn declarations.

This code would create the following error:

fn foo() !void {
    return error.Nope;
}

pub fn bar() !void {
    try foo();
}

pub fn baz() error{Nope}!void {
    try foo();
}

file.zig:5:14: error: public facing function must have a non-inferred error set
pub fn bar() !void {
             ^

This would make it way easier to write a maintainable library in Zig as you don't have to read the whole source to figure out what errors are in a error set.

It also makes it harder to add new errors to a public facing error set.

Another benefit of this is that code analysis tools can compute a diff for the public facing API, as not a full semantic analysis of the source has to be done to figure out what an error set is.

daurnimator commented 3 years ago

While in some cases this will be annoying (as the errors thrown in a complex function can often be dependant on comptime inputs; so developers will have to write SomeError(input) functions, see e.g. std.json.ParseError); I think it would be worth it.

Luukdegram commented 3 years ago

I'm all for this change. To me, this has benefits to multiple layers of users within a codebase.

The author gets a nice compile error for adding an error that is not yet part of the error set, making it very clear they're breaking the public API.
Library users can tell what errors can be expected without going through layers of nested functions to find out all possible errors. *
- It's also much easier for the library user to tell the API was changed, as the library either received a minor/major version update (if the author is a good person) or if the error set changed.
Project members that are calling into internal 'packages' that aren't necessarily part of a library public-facing API, also get the benefit as to tell what errors are to be expected from an internal API.

However, I do expect to see people use a single error set for their entire API. Though I'm a strong believer this is something where we should educate the programmer to create separate error sets where it makes sense.

batiati commented 3 years ago

For libs, I agree it's a good design. But for application code, it is a lot easier to just use inferred errors. See for example how popular the Anyhow crate is in the Rust ecosystem.

I think it could be implemented in some lint tool.

InKryption commented 3 years ago

I am a big fan of this; the only point I could think of against this would be that you can already find out what errors are possibly returned by a function by catching the error, putting it in a switch statement without any cases, and letting the compiler tell you what you missed. But that's arguably far more tedious if you're trying to wrap another library, or some other undertaking.

g-w1 commented 3 years ago

I have an amendment to this proposal that might address @batiati's concerns:

Only implement this for sub-packages of the compilation. Do not implement this for the application code.

This would allow functions in application code that are in different files not have to go by this, but any packages that are used must have a full error set.

As std is a package, this would apply to any file in std, just not in the code that is being built-exed.

InKryption commented 3 years ago

@g-w1 Given that, would we also want some way to ask the compiler to tell us if a package will compile as root vs as a package to root? Or just leave it to pull requests to amend non-compliant package code?

leecannon commented 3 years ago

How would that apply to something like const lib_test = b.addTest("src/lib.zig");, does it count as a sub-package or not?

I'm not sure about the distinction between application and non-application code, seems rather arbitrary.

ikskuh commented 3 years ago

My idea to solve the package internal things would've been a protected fn thingy or something like that, but i'm not sure if that's worth the complexity.

But I don't like that pub fn foo() !void would be illegal to call from outside the package, but legal from the inside. i don't think this helps maintainability and is just an ugly hack.

Imho it's totally okay to enforce this on any function

rohlem commented 3 years ago

For me, this proposal lies exactly in the sweet spot of "would be nice if all code read like this", but also "would be annoying while writing code" (see the many discussions on "soft errors" and "sloppy mode" #3320).

However, pub is used for access across files, not packages. While it would apply to public interfaces, it could even more so lead to longer files, even if they could sensibly be split up.

The build system has a quite standardized understanding of packages: The specified root file's struct (and exported C functions, no error sets there). I assume the package manager will use the same definition. With a way to distinguish inferred error sets from @typeInfo, we could implement those rules there today, without a need for a language change.

Support from tooling would alleviate the main concerns though. F.e., add a mode to zig fmt ~~(or, I daren't say, the compiler?)~~ that infers and fills in error sets of pub fn. (Printing the actual error set in correct syntax would already be pretty nice, if you have a halfway-decent terminal that allows copying. ~~Boohoo me old cmd.~~)

haze commented 3 years ago

@batiati I despise the anyhow crate in favor of thiserror from the same author (which provides a more concrete error interface).

WRT this proposal, 👍🏽

ghost commented 3 years ago

I question the assumption that this would improve average code quality. It's true that libraries should avoid inferred error sets in their public interfaces. But elsewhere, inferred error sets are part of what makes error handling so low-friction in Zig. To get some numbers, I did a quick grep of the Zig repo and found that there are about two uses of inferred error sets for every explicit one, and this holds true both in private and public functions. This is not an unpopular or rarely used feature.

So my gut feeling is that this proposal would at best annoy people and at worst discourage proper error handling. I would much prefer if it remained a best-practice recommendation, rather than being enforced on all internal code.

Jarred-Sumner commented 3 years ago

This sounds like a problem better solved through tooling than compiler errors.

Tooling like zls should make it easy to find out what the inferred error return set resolves to, so that developers know what errors to handle without having to read all the source code.

Forbidding inferred error return sets on pub fn makes errors harder to use – which will make developers not want to use Zig errors as much (and overall probably reduce satisfaction a tiny amount)

leopoldek commented 3 years ago

Isn't this already possible with comptime code? You can use reflection to iterate over your API and check if it has the correct errors. This is what I'm doing for the library I'm writing. I'm even going a step farther and making sure it conforms to other constraints I've placed on the API (like naming convention, etc...)

BanchouBoo commented 1 year ago

I feel like this will just encourage people to use anyerror in a lot of situations.

kuon commented 1 year ago

I think it would be too much a limitation if pub would have this meaning.

I am more in favor of leaving this to tooling and documentation.

In some situations, errors are more like error messages, in other situations they are more specific error in a set (like errno). I think both uses are valid and when used like error messages, it makes little sense to force enumerating them.

ziglang / zig

Proposal: Forbid inferred error return set on `pub fn` #9752