Closed Hejsil closed 1 year ago
Thanks for the early adoption. Let's get this sorted out.
doesn't explode into a million instances of foo, as different OutStreams are used?
This aspect should be fine. This isn't c++ where we duplicate all the work in every file and then rely on the linker to toss out redundant definitions. LLVM actually has an optimization pass that merges compatible function definitions together. We could potentially even notice when the difference between instantiations differs only in error sets and save time and space even in debug mode.
The API ergonomics and semantics on the other hand, let's talk about that.
Here are our options:
error
as the error set, the void *
equivalent of error sets. It then "poisons" other error sets, because the union of any error set and error
is error
. (This would be the equivalent to how it was before error sets merge)error{StreamingFailed}
which is the error set for writeFn
. This bubbles up to std.io.getStdOut()
and you would not be able to distinguish between, for example error.BrokenPipe
and error.NoSpaceLeft
. It wouldn't make sense to hard code these errors into the OutStreamError set because you might make for example an OutStream on a memory buffer and then the only possible error should be OutOfMemory.var
as the type of the outstream parameter in your example like this: https://github.com/zig-lang/zig/blob/2c697e50db36e4f9c0af8b1a9d357deeb8f80026/std/debug/index.zig#L143Sorry about all the breaking changes in a small amount of time. I've been trying to do the breaking changes all together and sooner to avoid even more pain later.
Actually, the code explosion is more a compile speed concern. foo
could be big, and we might want to switch between loggers or something. EDIT: Ops, I guess I didn't read the last part about no difference in error sets optimization. That probably solves this.
Anyways, I think the ergonomics aspect is the bigger concern. I don't know the exact solution to this either, which is why I didn't propose anything. Something similar to var
is probably what we want, where we can keep all our error code intact. Sadly, var
throws away a lot of readability from the function signature. I don't want to propose some complicated "Concepts" feature, just so I can say foo
takes any OutStream.
Maybe #470 is what would solve all of this? fn foo(stream: &OutStream(var)) !void
, though I have no idea as to how this would work, as OutStream technically is a function being invoked at comptime to figure out the type.
Came across this a few weeks ago when doing a deflate implementation. https://github.com/tiehuis/zig-deflate/blob/master/deflate.zig#L99
This particular case was a bit more troublesome than simply taking a var however since I needed to also generate a wrapped type and store in the new case. Couldn't think of a way at the time to get the comptime Error
from the type from a specific instance.
I was all set to begin overhauling std with comptime interfaces (#1829) when it occurred to me that, of all the interfaces I know about in std, only InStream and OutStream have any real reason to be comptime, by virtue of being the only ones that return dynamic error sets.
The ergonomics of both creating and using interfaces could be a lot cleaner if we didn't have to deal with that, so I'd like to try and gather some more suggestions and opinions on how we could handle errors with InStream and OutStream.
One thing that was not suggested was to have functions that actually care about the error set information take it as a parameter along with the stream interface:
pub fn func(stream: *InStream, comptime InError: type) InError!void
One thing that was not suggested was to have functions that actually care about the error set information take it as a parameter along with the stream interface:
I think that's worth trying. Wouldn't it look like this?
pub fn func(comptime InError: type, stream: *InStream(InError)) InError!void
I got bit by this problem recently as well:
You can see I gave up and used anyerror
. But I think passing a comptime Error type is a good thing to try.
Then there's still code bloat to solve - this is a really gnarly problem. I certainly agree with the title of this issue that this is a design flaw.
I think that's worth trying. Wouldn't it look like this?
pub fn func(comptime InError: type, stream: *InStream(InError)) InError!void
That's a way of doing it. I was thinking that func
would @errSetCast
from the anyerror
returned by InStream's function pointers. One reason for it to be done like that is that a runtime-dispatched function would have to take *InStream(anyerror)
in your example, but anyone calling that function would probably have to @ptrCast
to get their InStream(SomeErrorSet)
in there.
I suppose I'll experiment and see which is more of a headache to rewrite std with.
I just realized that @Hejsil already wrote all this up in the original description back in February.
So something I've been asking myself is, why do we even care about the ErrorSet in the context of interfaces?
The function taking an interface for generic purposes can't possibly cover all possible error cases, so would always end up punting on the ones it didn't know about when it was written anyway. Presumably someone up the call stack might care about handling all the possible errors, but they would already have the ErrorSet since they have the implementation, right? They'd just have to cast to it.
So I guess what I'm saying is, what is the real problem if all interfaces were to return anyerror
? The only thing I can really think of is that the functions called on the interface may add their own errors to the implementation's set (as InStream.readByte() adds EndOfStream), and there's no way for the caller to know that. That could be solved with multiple returns and comptime fields (see: #208).
So I guess what I'm saying is, what is the real problem if all interfaces were to return
anyerror
?
That might be a worthwhile tradeoff, given the downsides of status quo. But to answer your question the real problem is that you can no longer switch on the error, and have the compiler make sure that you handled all the possible errors, even when the code is later changed so that more error codes are possible, and similarly the compiler would no longer detect dead code for handling an impossible error code. To summarize, it makes it more difficult to write code where it's clear that you've handled all the possibilities. You'll probably have to do a @errorCast
which incurs safety-checked undefined behavior - a clear step down from compile-error protection.
Another option. Have interfaces with custom errors have a castToErrorSuperSet
function, which @bitCast
the struct into a new interface, whose interface is a superset of the original.
const E = error{A, B};
const stream = &OutStream(E).stream;
// Validates that the errorset passed in is a superset.
const stream2 = stream.castToErrorSuperSet(error{A,B,C});
I think with the interface pattern I'm trying right now this won't be a big deal. Most code will want to take the comptime interface (retrieved as my_stream.inStreamInterface()
) as var
which will come with full ErrorSet information and inference. If the stream needs to be taken and/or stored as an abstract (type-erased, runtime) interface (retrieved as my_stream.inStream()
), then everything will be anyerror
because there really isn't any alternative, and you have the option of also taking the ErrorSet as a parameter.
I think this can be considered solved by #17344. Any further discussion is welcome.
So, before error sets, we had In/OutStreams that could be implemented on a memory buffer, files or whatever would fit this interface, and could function to take In/OutStreams, and it would work on all of these.
Now, with error sets, I can see that In/Outstream was redesigned to be generic, taking an error set as the parameter. This ruins the useability of the In/OutStream:
Now, you would have to pass the error type to all functions taking an In/OutStream, for it to work again:
If this is the way they should now be used, then can I be sure that
foo
doesn't explode into a million instances offoo
, as different OutStreams are used? Users of my function now also have to call it like this:Which, idk, seems not so ergonomic for the users of my library. It relies on the fact, that the implementer of the OutStream adaptor was kind enough to actually provide this
Error
constant field. If the implementer did not do this, the user is more screwed, and would have to write this code:Ooh god, plz no.