ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
34.75k stars 2.54k forks source link

In/OutStream design flaw introduced with new error set changes. #764

Closed Hejsil closed 1 year ago

Hejsil commented 6 years ago

So, before error sets, we had In/OutStreams that could be implemented on a memory buffer, files or whatever would fit this interface, and could function to take In/OutStreams, and it would work on all of these.

Now, with error sets, I can see that In/Outstream was redesigned to be generic, taking an error set as the parameter. This ruins the useability of the In/OutStream:

error: expected type '&OutStream(error)', found '&OutStream(error{SystemResources,OperationAborted,IoPending,BrokenPipe,Unexpected,WouldBlock,FileClosed,DestinationAddressRequired,DiskQuota,FileTooBig,InputOutput,NoSpaceLeft,AccessDenied,})'

Now, you would have to pass the error type to all functions taking an In/OutStream, for it to work again:

fn foo(comptime Error: type, stream: &io.OutStream(Error)) !void { ... }

If this is the way they should now be used, then can I be sure that foo doesn't explode into a million instances of foo, as different OutStreams are used? Users of my function now also have to call it like this:

const file_stream = FileOutStream.init(&file);
foo(@typeOf(file_stream).Error, &file_stream.stream);

Which, idk, seems not so ergonomic for the users of my library. It relies on the fact, that the implementer of the OutStream adaptor was kind enough to actually provide this Error constant field. If the implementer did not do this, the user is more screwed, and would have to write this code:

const file_stream = FileOutStream.init(&file);
foo(@typeOf(file_stream.stream.writeFn).ReturnType.ErrorSet, &file_stream.stream);

Ooh god, plz no.

andrewrk commented 6 years ago

Thanks for the early adoption. Let's get this sorted out.

doesn't explode into a million instances of foo, as different OutStreams are used?

This aspect should be fine. This isn't c++ where we duplicate all the work in every file and then rely on the linker to toss out redundant definitions. LLVM actually has an optimization pass that merges compatible function definitions together. We could potentially even notice when the difference between instantiations differs only in error sets and save time and space even in debug mode.

The API ergonomics and semantics on the other hand, let's talk about that.

Here are our options:

Sorry about all the breaking changes in a small amount of time. I've been trying to do the breaking changes all together and sooner to avoid even more pain later.

Hejsil commented 6 years ago

Actually, the code explosion is more a compile speed concern. foo could be big, and we might want to switch between loggers or something. EDIT: Ops, I guess I didn't read the last part about no difference in error sets optimization. That probably solves this.

Anyways, I think the ergonomics aspect is the bigger concern. I don't know the exact solution to this either, which is why I didn't propose anything. Something similar to var is probably what we want, where we can keep all our error code intact. Sadly, var throws away a lot of readability from the function signature. I don't want to propose some complicated "Concepts" feature, just so I can say foo takes any OutStream.

Maybe #470 is what would solve all of this? fn foo(stream: &OutStream(var)) !void, though I have no idea as to how this would work, as OutStream technically is a function being invoked at comptime to figure out the type.

tiehuis commented 6 years ago

Came across this a few weeks ago when doing a deflate implementation. https://github.com/tiehuis/zig-deflate/blob/master/deflate.zig#L99

This particular case was a bit more troublesome than simply taking a var however since I needed to also generate a wrapped type and store in the new case. Couldn't think of a way at the time to get the comptime Error from the type from a specific instance.

tgschultz commented 5 years ago

I was all set to begin overhauling std with comptime interfaces (#1829) when it occurred to me that, of all the interfaces I know about in std, only InStream and OutStream have any real reason to be comptime, by virtue of being the only ones that return dynamic error sets.

The ergonomics of both creating and using interfaces could be a lot cleaner if we didn't have to deal with that, so I'd like to try and gather some more suggestions and opinions on how we could handle errors with InStream and OutStream.

One thing that was not suggested was to have functions that actually care about the error set information take it as a parameter along with the stream interface: pub fn func(stream: *InStream, comptime InError: type) InError!void

andrewrk commented 5 years ago

One thing that was not suggested was to have functions that actually care about the error set information take it as a parameter along with the stream interface:

I think that's worth trying. Wouldn't it look like this?

pub fn func(comptime InError: type, stream: *InStream(InError)) InError!void

I got bit by this problem recently as well:

https://github.com/ziglang/zig/blob/0afb86868423b454708608f9faa9bf27ce3233f3/std/debug/index.zig#L1092-L1093

You can see I gave up and used anyerror. But I think passing a comptime Error type is a good thing to try.

Then there's still code bloat to solve - this is a really gnarly problem. I certainly agree with the title of this issue that this is a design flaw.

tgschultz commented 5 years ago
I think that's worth trying. Wouldn't it look like this?

pub fn func(comptime InError: type, stream: *InStream(InError)) InError!void

That's a way of doing it. I was thinking that func would @errSetCast from the anyerror returned by InStream's function pointers. One reason for it to be done like that is that a runtime-dispatched function would have to take *InStream(anyerror) in your example, but anyone calling that function would probably have to @ptrCast to get their InStream(SomeErrorSet) in there.

I suppose I'll experiment and see which is more of a headache to rewrite std with.

andrewrk commented 5 years ago

I just realized that @Hejsil already wrote all this up in the original description back in February.

tgschultz commented 5 years ago

So something I've been asking myself is, why do we even care about the ErrorSet in the context of interfaces?

The function taking an interface for generic purposes can't possibly cover all possible error cases, so would always end up punting on the ones it didn't know about when it was written anyway. Presumably someone up the call stack might care about handling all the possible errors, but they would already have the ErrorSet since they have the implementation, right? They'd just have to cast to it.

So I guess what I'm saying is, what is the real problem if all interfaces were to return anyerror? The only thing I can really think of is that the functions called on the interface may add their own errors to the implementation's set (as InStream.readByte() adds EndOfStream), and there's no way for the caller to know that. That could be solved with multiple returns and comptime fields (see: #208).

andrewrk commented 5 years ago

So I guess what I'm saying is, what is the real problem if all interfaces were to return anyerror?

That might be a worthwhile tradeoff, given the downsides of status quo. But to answer your question the real problem is that you can no longer switch on the error, and have the compiler make sure that you handled all the possible errors, even when the code is later changed so that more error codes are possible, and similarly the compiler would no longer detect dead code for handling an impossible error code. To summarize, it makes it more difficult to write code where it's clear that you've handled all the possibilities. You'll probably have to do a @errorCast which incurs safety-checked undefined behavior - a clear step down from compile-error protection.

Hejsil commented 5 years ago

Another option. Have interfaces with custom errors have a castToErrorSuperSet function, which @bitCast the struct into a new interface, whose interface is a superset of the original.

const E = error{A, B};
const stream = &OutStream(E).stream;

// Validates that the errorset passed in is a superset.
const stream2 = stream.castToErrorSuperSet(error{A,B,C});
tgschultz commented 5 years ago

I think with the interface pattern I'm trying right now this won't be a big deal. Most code will want to take the comptime interface (retrieved as my_stream.inStreamInterface()) as var which will come with full ErrorSet information and inference. If the stream needs to be taken and/or stored as an abstract (type-erased, runtime) interface (retrieved as my_stream.inStream()), then everything will be anyerror because there really isn't any alternative, and you have the option of also taking the ErrorSet as a parameter.

andrewrk commented 1 year ago

I think this can be considered solved by #17344. Any further discussion is welcome.