haskell / core-libraries-committee

Changing `IO a` to `IOE e a` for type-level checked exceptions in base #108

Closed brandonchinn178 closed 1 year ago

brandonchinn178 commented 1 year ago

I've been prototyping a new approach to adding checked exceptions to IO, and I think it would be a good thing to do in base. While I recognize that the approach hasn't been tried in any production-quality codebase yet, I'm opening this issue as a focus of discussion. I'm not expecting this to be resolved any time soon (although I wouldn't say no to it getting accepted).

The proof of concept is in my checked-io library. See the README for more details, including comparisons of all other libraries/approaches to solving this problem that I could find.

TL;DR:

Pros:

Cons:

Is it perfect? No. Is it fully featured? No. But I would argue it provides a lot of really useful primitives and it's a step in the right direction that can be improved + iterated further.

Original Reddit thread: https://www.reddit.com/r/haskell/comments/z8drt4/rfc_checkedio_library_for_better_exceptions_in_io/


Edited 2022-12-05: Clarified the title and description to emphasize that it's not just "merging my library", but it's proposing a change to base with a proof-of-concept implemented in my library.

re-xyr commented 1 year ago

Personally I feel the problem with checked exceptions is that sometimes we don't want everything checked: if we're going to create a checked-base where each IO-performing function is annotated with all exceptions that they could throw, many of them will just be exceptions that are actually unrecoverable or we won't care to recover.

For that reason, I feel that the direction you're going, i.e. "annotating everything", is wrong. Exceptions are a mechanism that can be used in many scenarios, and annotating them makes sense in only some of those. Ultimately I would choose to have everything unannotated and only annotate specific functions when I know I care about these exceptions. And in this regard, lightweight checked exceptions is a better solution because it can seamlessly interface with unchecked functions (we also have impredicative types past GHC 9.2).

About addition to base: I don't think this library is a good candidate, not because of its quality, but because base has always been a small standard library that does not host advanced control structures. Case in point, none of transformers, exceptions or unliftio is in base.

brandonchinn178 commented 1 year ago

@re-xyr Thanks for your response! I'm curious what about this approach is incompatible with your desire to "keep everything unannotated". As mentioned, base would still have an unannotated IO type and associated functions; users would still be able to use all of the unannotated versions if they desire. What this proposal provides is the ability to opt-in to annotations.

Yes I could do this in a separate library (that would effectively be a fork of base) but there'd always be a risk of partiality if base happens to throw a new exception in a function. Merging this into base would give confidence that we've handled all exceptions, because everything in base would be using checked primitives.

Also, what advanced control structures are you referring to? If you're concerned about the MonadRunIOE/MonadRunAsIOE mechanism, sure, we can keep it in a helper library like unliftio, and keep base concrete to IOE. But IO already throws exceptions in base; there's nothing more advanced happening in this approach.

mitchellwrosen commented 1 year ago

How does forkIO fit into this proposal?

brandonchinn178 commented 1 year ago

I don't see why forkIO can't just be

forkIO :: IOE e () -> IOE e' ThreadId

It takes an action in which any exception can occur (which will be swallowed if not handled, just like today) and returns the ThreadId, for any e' (it doesn't seem like forkIO itself can throw any exceptions)
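To make the shape of that signature concrete, here is a minimal sketch. It assumes IOE is a newtype over IO with a phantom exception parameter; the names `UnsafeIOE`, `unsafeRunIOE`, and `forkIOE` are hypothetical stand-ins for illustration, not the checked-io API.

```haskell
import Control.Concurrent (ThreadId, forkIO)

-- Hypothetical stand-in for the proposed type: e is a phantom parameter
-- recording which exception the action may throw.
newtype IOE e a = UnsafeIOE { unsafeRunIOE :: IO a }

-- The child's exception type e is unrelated to the caller's e':
-- whatever the child throws stays in the child thread.
forkIOE :: IOE e () -> IOE e' ThreadId
forkIOE (UnsafeIOE child) = UnsafeIOE (forkIO child)
```

Because `e'` is left fully polymorphic, the caller can use the result in any exception context, which is how the type expresses "forking itself throws nothing" under this assumption.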

re-xyr commented 1 year ago

@brandonchinn178

Thanks for the response.

As mentioned, base would still have an unannotated IO type and associated functions

They are annotated with SomeSyncException. I guess this is technically "unannotated" since you propose type IO = IOE SomeSyncException, but if that's the case, I have another question: how do we combine errors in your library? Say we have

f :: MonadRunIO m => m ()
g :: MonadRunIOE AnotherException m => m ()

What is a viable concrete type of f >> g?

Merging this into base would give confidence that we've handled all exceptions, because everything in base would be using checked primitives.

I agree, but merging into base would be a very, very long term (and ambitious) goal, considering

You may want to first publish your library to Hackage for it to gain traction.

brandonchinn178 commented 1 year ago

@re-xyr Excellent questions, thanks for asking!

how do we combine errors in your library?

Up to you! Whereas base currently has no opinion on tracking exceptions at the type level or composing said exceptions on the type-level, checked-io (and with this proposal, base-2.0) implements tracking exceptions at the type level but still has no opinion on composing exceptions.

But if you look at the README, you'll notice that all of the current approaches to composing exceptions (with an asterisk by "Lightweight Exception Handling") are completely compatible with checked-io. So however you're composing exceptions today, you'll be composing them the same way still (perhaps even a little better, since there'd be an official IOE monad for you to hook into, instead of the library being forced to make their own; or if we type everything with MonadRunIOE, composing exceptions could wrap IOE in their own monad, implement MonadRunIOE for their monad, and integrate into everything for free).

Like I said before, most users will probably be using plain IO still, and would be using all the getFooIO functions, which puts everything into SomeSyncExceptions ("composing" via subtypes, same as SomeException right now). "What's the point of all this then?" Libraries. If the http-client library wanted to expose a function sendRequest :: Request -> IOE HttpException (), they'd currently have to call base functions and hope they handled all the errors appropriately. But with checked-io merged into base, http-client would have confidence that all the functions they're using from base are correctly annotated with their exceptions (and thus can handle all possible exceptions).

merging into base would be a very, very long term (and ambitious) goal

Yes, I quite agree 😅 which is why I'd like to start the conversation now. Test the waters, so to speak. If there's absolutely no chance of any of this landing in base, I'd have to figure out if it's worth the effort to flesh out checked-io knowing that it'd have to track changes in base forever. If there's interest, then I might be persuaded to continue fleshing out checked-io.

What you're proposing includes renaming all base IO functions, which is breaking in a large scale

Not immediately. Like I said, all this work could be done completely in the background. Add all the new IOE functions as getEnvIOE, leave the existing API untouched. If there's sufficient interest, we could just leave it like this, let the ecosystem start using the new IOE API. If we end up not liking it, strip it all back out, no harm no foul. If it's a success, make the switch over (rename current getEnv -> getEnvIO, rename getEnvIOE -> getEnv, modulo better deprecation cycle)

Your library is not a de facto standard yet, and there are multiple competing libraries

For sure. This is part of what I meant by "not been tried in any production-quality codebase". In a sense, it's a bit of a chicken-and-egg. The big win of this change is libraries being able to specify a stricter API vis-a-vis exceptions, but libraries aren't going to want to add checked-io as a dependency just for this. So there'd be a whole sub-ecosystem of checked-http-client libraries built on top of checked-io, and that's much more than I'd be able to do on my own.

But my hope is that the "Alternate approaches" section in the README shows why checked-io is the best intermediate step right now, and that all other "competing libraries" either have worse performance for the same ergonomics (e.g. ExceptT-like transformers) or would have the exact same ergonomics with checked-io as base today (like I said before, perhaps even better, if base types everything with MonadRunIOE, and the other library gets to integrate for free).

Newcomers may need to learn about exceptions earlier than they need to

Maybe, maybe not. Newcomers could just use the IO versions of everything, and it'd be exactly the same as right now. And Haskell already has the issue of "Newcomers need to know monads in order to write a main function"; exceptions are already familiar to most newcomers, so it's less of a barrier than the IO monad overall.

But sure, maybe it's not a trivial barrier. IMO part of what drives people to Haskell is the type system forcing you to handle things (e.g. a function returning Maybe or Either forces you to handle all the cases). Why should exceptions be any different? In one sense, exceptions are more complicated in Haskell because they're hidden; I know it was for me.

Lysxia commented 1 year ago

however you're composing exceptions today, you'll be composing them the same way still

That doesn't work out so well for the way IO is used now, where everything might throw some sync exception, but where the upside is that you have only one IO monad to worry about. It may be true that you can implement a solution for composition on top of IOE, but you also have to commit to one as soon as IOE is used with at least two exception types, as you need to somehow sequence IOE ErrorA and IOE ErrorB. With unchecked IO we just don't need to think about that.

Although IOE gives a solution to the subproblem of "what exceptions do these IO functions throw?", even a use case as simple as "a http library throws HTTPException" seems to require a proper story for composition to be able to interact with base.

Also, what annotations should base functions have? If at least two exceptions are used (say IOException vs SomeSyncException), then you run into the composition problem. Otherwise, if you choose to put everything under the same SomeSyncException, then you might as well stick to checked-io in a separate library with explicit checkIOWith calls to mediate.

Moreover MonadRunIOE has a fundep which makes it unusable for variants of IOE annotated with sums of exceptions (the examples in "Compatibility with checked-io" violate the fundep).
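The fundep objection can be illustrated with a small sketch. The class shape and the method name `liftIOE` below are assumptions made for illustration (the real MonadRunIOE may differ); the point is only that a dependency `m -> e` lets each monad name exactly one exception type.

```haskell
{-# LANGUAGE FunctionalDependencies #-}

-- Hypothetical stand-in for the proposed type.
newtype IOE e a = UnsafeIOE { unsafeRunIOE :: IO a }

-- Assumed class shape: the fundep says the monad m determines e.
class MonadRunIOE e m | m -> e where
  liftIOE :: IOE e a -> m a

-- An application monad that we would like to accept ErrorA or ErrorB.
newtype App a = App { runApp :: IO a }

data ErrorA = ErrorA
data ErrorB = ErrorB

instance MonadRunIOE ErrorA App where
  liftIOE = App . unsafeRunIOE

-- Uncommenting this second instance makes GHC reject the module with a
-- functional-dependency conflict, because App would then determine
-- two different exception types.
-- instance MonadRunIOE ErrorB App where
--   liftIOE = App . unsafeRunIOE
```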

konsumlamm commented 1 year ago

This would be a breaking change. Typeclass instances for type aliases need an extension (which is enabled with GHC2021, but not Haskell2010), so providing an IO a alias is not backwards compatible. Most libraries would also want to implement instances for IOE directly, so they'd need to change anyway.
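A small sketch of the alias problem, under the assumption that IO would become a synonym for an applied IOE. `IO'`, `Describe`, and the use of SomeException in place of SomeSyncException are hypothetical stand-ins.

```haskell
{-# LANGUAGE Haskell2010 #-}
{-# LANGUAGE FlexibleInstances #-}  -- implies TypeSynonymInstances

import Control.Exception (SomeException)

-- Hypothetical stand-in for the proposed type.
newtype IOE e a = UnsafeIOE { unsafeRunIOE :: IO a }

-- Stand-in for the proposed alias "type IO = IOE SomeSyncException".
type IO' = IOE SomeException

class Describe f where
  describe :: f a -> String

-- Under plain Haskell2010 this instance head is rejected: it goes through
-- a type synonym and applies IOE to a concrete type rather than a type
-- variable. The FlexibleInstances pragma above is what makes it compile.
instance Describe IO' where
  describe _ = "an IO' action"
```

Existing code that writes an instance directly for IO would end up in the same position as `instance Describe IO'` here.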

Turning every function that uses IO into one that uses a typeclass isn't great either, both for performance (if the function fails to specialize, you have the overhead of the typeclass dictionary) and for ergonomics (type errors would be less readable, perhaps type inference would even get worse). Only being able to specify a single exception type doesn't seem very ergonomic either. What do you do if a function can throw several exceptions? Use a typeclass? You said "however you're composing exceptions today, you'll be composing them the same way still", but I don't see how that would work.

Overall, I'm not convinced that this is the single best approach that we want in base, especially since this library is new and not battle-tested. I for one think it would be more productive to first discuss how an exception tracking system should roughly look like (and if we even want one in the first place), rather than starting with a specific implementation.

re-xyr commented 1 year ago

But if you look at the README, you'll notice that all of the current approaches to composing exceptions (with an asterisk by "Lightweight Exception Handling") are completely compatible with checked-io

I gave it a read, but these examples are implemented with IOE SomeSyncException, which is IO anyway. So the situation is:

  1. Your library provided a typeclass for checked exceptions
  2. It then provided a concrete type as an instance of that typeclass, but it does not support composing errors
  3. So if users need to compose errors (which is common), they need to disregard (2) and implement instances for their favorite libraries instead

I guess then the question is: why not just provide only the typeclasses and not the IOE type, since it's not very useful?

But my hope is that the "Alternate approaches" section in the README shows why checked-io is the best intermediate step right now

This is just my personal preference, but I don't feel like that... apart from that it has an uncomposable IOE type, it also requires either a mass refactor in base or an alternative base. Compare that with lightweight checked exceptions which requires nothing to change unless you want to track specific exceptions.

I think the problem stems from the fact that the "alternative approaches" section in the README frames "a new IO monad" as a positive, while I don't think it is. It says "Any of the approaches that don't provide a new IO monad will also allow other hidden exceptions to be thrown in IO", but that is exactly what I want by "only annotate specific functions when I know I care about these exceptions".

brandonchinn178 commented 1 year ago

@Lysxia

as you need to somehow sequence IOE ErrorA and IOE ErrorB. With unchecked IO we just don't need to think about that.

Like I've been saying, a new IOE e a type doesn't force users to use IOE. If someone doesn't want to think about it, just don't. It's completely opt-in. But what this would newly enable is someone being able to write

foo :: IOE ErrorA ()
bar :: IOE ErrorB ()

safe :: UIO ()
safe = do
  foo `catch` \e -> putStrLn $ "foo errored: " ++ show e
  bar `catch` \e -> putStrLn $ "bar errored: " ++ show e

and be completely confident that safe doesn't throw any sync exceptions. This isn't possible today. Sure, you could document what errors foo and bar throw and hope that the docs are up-to-date (or that the developer didn't forget they're using a function in base that could throw something else), but you don't have the same confidence.
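The snippet above can be fleshed out into something that compiles. This is a sketch under assumptions: IOE is modelled as a newtype over IO, UIO as `IOE Void`, and `throwE`/`catchE` are hypothetical names rather than the checked-io API. The actions return strings instead of printing so the result is easy to inspect.

```haskell
import Control.Exception (Exception, catch, throwIO)
import Data.Void (Void)

-- Hypothetical stand-in for the proposed type.
newtype IOE e a = UnsafeIOE { unsafeRunIOE :: IO a }

-- Assumed encoding of "throws nothing": Void has no values to throw.
type UIO = IOE Void

instance Functor (IOE e) where
  fmap f (UnsafeIOE io) = UnsafeIOE (fmap f io)
instance Applicative (IOE e) where
  pure = UnsafeIOE . pure
  UnsafeIOE f <*> UnsafeIOE x = UnsafeIOE (f <*> x)
instance Monad (IOE e) where
  UnsafeIOE io >>= k = UnsafeIOE (io >>= unsafeRunIOE . k)

throwE :: Exception e => e -> IOE e a
throwE = UnsafeIOE . throwIO

-- Handling e discharges it; only the handler's error type e' remains.
catchE :: Exception e => IOE e a -> (e -> IOE e' a) -> IOE e' a
catchE (UnsafeIOE io) h = UnsafeIOE (io `catch` (unsafeRunIOE . h))

data ErrorA = ErrorA deriving Show
instance Exception ErrorA
data ErrorB = ErrorB deriving Show
instance Exception ErrorB

foo :: IOE ErrorA String
foo = throwE ErrorA

bar :: IOE ErrorB String
bar = pure "bar ok"

-- Writing "foo >> bar" directly would not type-check, since ErrorA and
-- ErrorB differ. Handling each one locally brings both actions into UIO.
safe :: UIO [String]
safe = do
  a <- foo `catchE` \e -> pure ("foo errored: " ++ show e)
  b <- bar `catchE` \e -> pure ("bar errored: " ++ show e)
  pure [a, b]
```

Note that the guarantee only holds as far as the primitives are annotated honestly; the `UnsafeIOE` constructor in this sketch can still wrap any IO action.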

even a use case as simple as "a http library throws HTTPException" seems to require a proper story for composition to be able to interact with base

Again, base and other libraries would be responsible for providing an IO and an IOE version of their functions. So normal users wouldn't need to worry about composition.

But composition isn't required for some use cases. If you're writing a database client using a C library, you can write

sendQuery :: Query -> IOE DatabaseError Response
sendQuery q = do
  resp <- db_send (serialize q)
  case deserialize resp of
    Left e -> throw e
    Right r -> pure r

foreign import ccall "db_send" db_send :: ByteString -> UIO ByteString

and be confident that sendQuery doesn't throw any other exceptions.

But I would argue that the http library's HTTPException should contain the specific exceptions it expects to rethrow from base.

Moreover MonadRunIOE has a fundep which makes it unusable for variants of IOE annotated with sums of exceptions

Aha, that's the kind of comment I was looking for. You are, indeed, correct. I'll have to think more about that. I've updated the README with code snippets that actually work. Now, error composition would require a wrapper (e.g. toPlucked :: ProjectError e' e => IOE e a -> IOE e' a), but it could still reuse IOE, which is the whole point.


@konsumlamm

so providing an IO a alias is not backwards compatible

Thank you! I forgot about that. I wonder if it could be a newtype around IOE then, and it would still have the same MonadRunIOE SomeSyncException instance.

Turning every function that uses IO into one that uses a typeclass isn't great either, both for performance [...] and for ergonomics

Right again. Wouldn't adding SPECIALIZE for UIO, IOE, and IO help with the performance?
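For reference, a SPECIALIZE pragma on a class-polymorphic function looks roughly like the sketch below. It uses MonadIO from base as a stand-in for MonadRunIOE, and `logLine` is a hypothetical function.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)

-- Class-polymorphic: unless GHC specializes or inlines it, each call
-- passes a MonadIO dictionary at runtime.
logLine :: MonadIO m => String -> m String
logLine msg = liftIO $ do
  let line = "[log] " ++ msg
  putStrLn line
  pure line

-- Ask GHC to also compile a dictionary-free copy for m ~ IO.
{-# SPECIALIZE logLine :: String -> IO String #-}
```

One pragma is needed per function and per target type, so covering UIO, IOE, and IO would mean three pragmas on every such function.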

What do you do if a function can throw several exceptions?

What do you do today? Like I've been saying, users will primarily be using IO in their main function. So what you do today (throwing several exceptions = lifting into SomeException) would still be what's happening here. But libraries will be able to specify the specific error they want to throw. e.g. an http library would have an HTTPException type that would contain all exceptions it could throw including ones it can rethrow from dependent functions. I would argue this is a good thing; library authors would be empowered to show users what specifically went wrong in the context of the library instead of generic errors like "file not found".

But as a shortcut, one could also just wrap it

data MyException
  = MyException1
  | MyException2
  | OtherException SomeException
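As a sketch of how that catch-all constructor might be used with today's IO, the helper below funnels any exception into OtherException. `tryAsMine` is a hypothetical name, and the `deriving Show` and Exception instance are additions needed to make the type usable as an exception.

```haskell
import Control.Exception (Exception, SomeException, try)

data MyException
  = MyException1
  | MyException2
  | OtherException SomeException
  deriving Show

instance Exception MyException

-- Run an action and report any exception as OtherException.
-- Caveat: catching SomeException also intercepts asynchronous exceptions,
-- which real code would usually want to let through.
tryAsMine :: IO a -> IO (Either MyException a)
tryAsMine io = do
  r <- try io
  pure (either (Left . OtherException) Right r)
```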

I for one think it would be more productive to first discuss how an exception tracking system should roughly look like (and if we even want one in the first place), rather than starting with a specific implementation.

I renamed the issue + updated the description, if that helps. My goal isn't necessarily "merge my library". My intention was "I think a good exception tracking system would be one that extends the IO type with an extra exception type parameter, and I made a proof of concept here." But sure, maybe that's still too specific for you (I had thought it was uncontroversial to want an exception tracking system, but apparently I was wrong).


@re-xyr

It then provided a concrete type as an instance of that typeclass, but it does not support composing errors

That's not true, IOE supports composing errors. I've just updated the README with working code samples, take a look at e.g. the "plucky" example.

it also requires either a mass refactor in base or an alternative base. Compare that with lightweight checked exceptions which requires nothing to change unless you want to track specific exceptions.

So my (apparently incorrect) assumption was that people would generally want the option of stricter types (even if most of the time, people would use the allow-any-exceptions-in-IO type). In this world, a mass refactor of base is inherent to the proposal. You can always derive a looser API from a stricter one, the opposite direction is risky. So assuming people would want the option of stricter types at all, it would need to happen at the foundation level.

The whole point of a "stricter API" is that it captures all the exceptions. So the problem with "Lightweight Exception Handling" is that it has no mechanism for saying "I lack exceptions", or "I am guaranteed to never throw exception X".

but that is exactly what I want by "only annotate specific functions when I know I care about these exceptions".

Ok sure, that's useful data. I was under the impression that Haskell devs prefer being aware of handling all the cases. To me, that's the big thing we tell newcomers, that if you want to call a function returning Maybe, you have to either handle the Nothing or propagate the Maybe. Either way, you're forced to handle every case. Now, we have things like do-notation or combinators like fmap to make it more ergonomic, but the fact that the type indicates something you're forced to handle is a big deal when you're learning Haskell. I don't know why exceptions in IO is any different.

konsumlamm commented 1 year ago

Right again. Wouldn't adding SPECIALIZE for UIO, IOE, and IO help with the performance?

Yes, but you'd have to add such SPECIALIZE functions for every "IO function" (and not only in base, for the user-defined functions too), which isn't great either.

What do you do if a function can throw several exceptions?

What do you do today?

Today I just write a function that returns IO a. But if base were to get a new type for exception tracking, it would be pretty weird if that type only supported a single exception or SomeException. Yes, you could write some sum type (or use some Either type) that combines all possible exceptions, but that requires boilerplate and isn't really ergonomic imo (both to define and to use).

But sure, maybe that's still too specific for you (I had thought it was uncontroversial to want an exception tracking system, but apparently I was wrong).

My point is that there's more than one way to have an exception tracking system. I'm not against an exception tracking system per se, but Haskell might simply not have the tools to implement it ergonomically. Either way, we'd probably want a way to still have unchecked exceptions, for the reasons @re-xyr mentioned:

Personally I feel the problem with checked exceptions is that sometimes we don't want everything checked: if we're going to create a checked-base where each IO-performing function is annotated with all exceptions that they could throw, many of them will just be exceptions that are actually unrecoverable or we won't care to recover.

For example, putStrLn may throw an exception, but I highly doubt you (or anyone else) ever tried to catch that, or want to be forced to catch it.
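One way to see such an exception in practice is to write to a handle that has already been closed. The sketch below uses a duplicate of stdout so the real stdout stays usable; `writeToClosedHandle` is a hypothetical name. putStrLn is hPutStrLn applied to stdout, so it fails the same way if stdout itself is closed.

```haskell
import Control.Exception (IOException, try)
import GHC.IO.Handle (hDuplicate)
import System.IO (hClose, hPutStrLn, stdout)

-- Duplicate stdout, close the duplicate, then try to write to it.
-- The write fails with an IOException, which try returns as a Left.
writeToClosedHandle :: IO (Either IOException ())
writeToClosedHandle = do
  h <- hDuplicate stdout
  hClose h
  try (hPutStrLn h "hello")
```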

re-xyr commented 1 year ago

I've just updated the README with working code samples, take a look at e.g. the "plucky" example.

OK, another question: there will be functions in base that throw multiple kinds of exceptions, which needs some way to be expressed in the type level. Do we just have a "blessed" EitherE type as in your example?

Besides, it seems like you plan to refactor IO functions in base entirely so that their implementations are all written in terms of the "checked" variants of each other? If that's the case, then obviously we're going to need to compose errors. Does that mean we're going to also "bless" plucky as the approach we're using in base?

In this world, a mass refactor of base is inherent to the proposal.

I'm not completely averse to the idea of adding checked functions into base either, it's just we haven't seen much real-world uses of (not only your library, but also) checked exceptions in general for it to become a vital part in our ecosystem. It would probably also be very hard to ensure the performance characteristics of these functions haven't changed. I'm not sure though, have you tried rewriting a few modules?

Also, how are you planning to structure base after this rewrite? Do the three variants of each function reside in the same module, in which case each function has an unwieldy suffix in its name? Or are there three modules for each module related to IO, in which case the module count increases threefold?

I was under the impression that Haskell devs prefer being aware of handling all the cases.

I think you're interpreting this philosophy too literally. I agree that we should make people aware of possible exceptions that a function can throw, perhaps through better documentation, but making them handle (or explicitly declare not to handle at every call site) the exceptions is really unnecessary. As @konsumlamm mentioned, I'd rather not be responsible for an IOException that occurs from my call to putStrLn.

Yes, I can use the unannotated variant instead if I don't feel like handling it, but then the problem is I'm then put in a choice between all or nothing. Either I handle all the possible exceptions that can arise - whether I care about them or not - or I handle none of them.

Or maybe we can make meticulous decisions about what exceptions are those that users (library developers?) care about and what aren't. But that wouldn't fly either, because different users have different ideas about what exceptions they want to handle!

Bodigrim commented 1 year ago

@brandonchinn178 we strive to keep the list of open issues actionable. Happy to host free-flowing discussions, but at some point they must turn into proposals. What's your plan for this thread? Would you like to take specific steps to obtain more feedback? Is there a specific ask for CLC to approve? Or shall we close the issue as dormant and return to it in a year or two (e.g., when GHC2021 is more widespread)?

hasufell commented 1 year ago

I'm using "checked exceptions" in GHCup via Excepts.

There are many more approaches like plucky, oops and others.

I don't think any of this is sufficiently designed or explored yet to land anywhere near base. We need much more "data". I don't think base is the place to explore this. Yes, it may mean someone has to fork base and create an alternative prelude. That's ok.

brandonchinn178 commented 1 year ago

Feel free to close for now