haskell-servant / servant

Servat is a Haskell DSL for describing, serving, querying, mocking, documenting web applications and more!
https://docs.servant.dev/
1.83k stars 413 forks source link

State of Servant authentication #1484

Open tchoutri opened 2 years ago

tchoutri commented 2 years ago

I took over the maintenance after a lot of help from @alpmestan when I needed to figure out servant-auth. I share the same experience as others of wasting countless hours, but I did put a lot of those hours back so that it's a tiny bit better nowadays.

Here's a short overview of incremental changes with the most impact:

Commonly requested features:

Last but not least, the plan is to port servant-auth to servant and deprecate the old auth: https://github.com/haskell-servant/servant/issues/805

These days I'm in brutal prioritization mode of scaling my business, so I can't do much maintenance but I do want to do everything I can for someone else to step up!

Originally posted by @domenkozar in https://github.com/haskell-servant/servant-auth/issues/195#issuecomment-909959489

Siprj commented 2 years ago

Having a nicer/simpler way of writing custom authentication handlers in servant-server is really compelling to me. In my daily job we had to create custom authentication combinator because the default one wasn't able to set cookies and the one from servant-auth didn't suit us either.

I would find useful some kind of combinator that would take the request as argument and returned user-defined data with the possibility to set cookies/header to the response.

alpmestan commented 2 years ago

For a long time now, both servant contributors and users have been annoyed by the fact that auth support has been essentially spread between the "generalized auth" machinery in the core servant packages, and the servant-auth packages. During all that time, I have been thinking that we ought to merge those two things (but lacked time and opportunity to execute), in a very precise way.

What I think we would all love is the following: a single Auth (or AuthProtect or whatever) combinator that people could throw in front of one or more routes in their API types to auth-protect the said routes. We ideally should be able to specify one or more auth schemes to use to protect the routes, with (again ideally) a bunch of common ones being supported and implemented out of the box, while retaining the generalized auth's ability to define new ones, possibly using all sorts of pieces/bricks used to define the pre-implemented ones. All schemes (or even combinaison of schemes) would give users a "hook" of sorts to check credentials/tokens/etc (what we refer to as "auth check" in the current generalized auth business).

I do not have a concrete design in mind, but those requirements seem reasonable and in fact are bound to dictate a saner solution that the current ones. I'm not sure whether it's better to start from scratch, from servant-auth or from the generalized auth stuffs. I'd perhaps lean towards looking at what's missing from the generalized auth stuffs besides... a bunch of pre-implemented pieces for various auth schemes, and a way to combine them, since it's already extensible.

(It's harder to introduce extensibility after the fact to servant-auth than to tweak the generalized auth stuffs a bit, IMO.)

tchoutri commented 2 years ago

Here are the aspects to take into account when implementing this:

  1. How are fetched the auth information (token, cookie, etc)
    • This must be provided by our package, be canonical implementations, and should cover a variety of protocols.
  2. Auth check: How is the token verified.
    • This is given to us by the user.
  3. How to handle success and failure of authentication
    • Possibility of having a lenient/strict mode, where the request is denied if it fails in strict mode, and in lenient mode the handler is given a Maybe a so that authentication failure can be handled at the application level.
  4. Login endpoint for third-party auth, can be an afterthought, not a hard requirement at the moment.
  5. What should be the type representing an authenticated user, what the handler gets when a user is identified.

Currently, Generalised Authentication combines 1 & 2, and asks the user to it all by themselves. We could provide 1 and leave 2 to the user.

tchoutri commented 2 years ago

Here is a prototype design that would be nice to have

{-
## Example

type API
  =  Authenticate User (AuthCookie :> AuthHeader "X-Scrive-XToken")
  :> ToServantApi API'

type API
  = Authenticate (Maybe User) (AuthHeaders '["Token1", "Token2"])
  :> ToServantApi API'
-}

-- | These empty types can be provided by us or by external packages
data AuthCookie

data AuthHeader (name :: Symbol)

data AuthHeaders '[(name :: Symbol)]

-- | This class is provided by us and implemented by the providers of the above types (us, external packages)
class HasAuth input output | input -> output -- Implement this for the auth markers
  authenticate :: (MonadIO m) => a -> m b

instance HasAuth AuthCookie User
  authenticate = myFunThatTakesACookie

instance HasAuth (AuthHeader ["X-Token"]) User
  authenticate = myFunThatTakesTheContentOfTheseHeaders

instance HasAuth (AuthHeaders '["Token-1", "Token-2"]) (Maybe User) 
  authenticate = myFunThatTakesTwoHeaders
Vlix commented 2 years ago

* instance HasAuth (AuthHeader "X-Token") User where

I'd also like to help make the authenticating with servant better (see #1494 ), so I'll have a good think about this as well.

Vlix commented 2 years ago

Some brainstorming, just wrote it down, no idea if this would even compile.

-- | When the 'Lenient' mod is set, the handler will be passed the entire 'AuthResult val'
-- and the 'Context' needs a 'LenientAuthHandler'.
--
-- Otherwise, the 'Context' requires there to be a 'StrictAuthHandler' so that any failure of authentication results in a 'ServerError', and in case of success the handler will just receive the 'val'.
instance (
    HasServer api ctxs,
    HasContextEntry ctxs (AuthHandler (If (FoldLenient mods) Lenient Strict) auth val)
    ) => HasServer (Authenticate' mods auths val :> api) ctxs where
  type ServerT (Authenticate' mods auths val :> api) m =
    If (FoldLenient mods) (AuthResult val) val -> ServerT api m
  hoistServerWithContext _ pc nt s = hoistServerWithContext (Proxy @api) pc nt . s
  route _ context subserver = 
      route (Proxy @api) context $ subserver `addAuthCheck` authCheck
    where
      authCheck :: DelayedIO AuthResult
      authCheck :: withRequest $ \req -> liftIO $ do
        -- Somehow get to an AuthResult?
        authResult <- someHowCheckAuths
        case authHandler of
          LenientAuthHandler Nothing -> pure authResult
          LenientAuthHandler (Just check) ->
            maybe (pure authResult) delayedFailFatal $ check authResult
          StrictAuthHandler check ->
            either delayedFailFatal pure $ check authResult 

      authHandler :: AuthHandler (If (FoldLenient mods) Lenient Strict) auth val
      authHandler = getContextEntry context

data AuthHandler (mod :: k) auth val where
  LenientAuthHandler :: HasAuth auth val => Maybe (auth -> Maybe ServerError) -> AuthHandler Lenient auth val
  StrictAuthHandler :: HasAuth auth val => (auth -> Either ServerError val) -> AuthHandler Strict auth val
divarvel commented 2 years ago

Hi, speaking as a happy generalized auth user, i've never used servant-auth, and never had the need to come up with custom combinators, so here's my question:

is the generalized auth lacking expressivity in a way that makes it not suitable as a basic building block? (i've used it to build token-based auth and session auth mechanism without being limited (in servant-server).

If the generalized auth is indeed enough, would it make sense to provide high-level schemes based on it? I'd argue that the biggest issue with generalized auth right now is the lack of documentation and the Experimental module namespace.

To be completely honest, there is one limitation with the generalized auth system, in servant-client: there is no way to have effectful request signing. I've tried patching it, but that was a bit too big of a change for me to carry out.

Based on my experience (both on APIs and regular web apps), generalized auth is a solid base to build upon. DX could be improved with common schemes handled out of the box, and helpers abstracting away some common use-cases, that would make building on top of it less boilerplate-y

tchoutri commented 2 years ago

From what I am reading, it would seem that the optimal direction would be to iterate massively over generalised auth, get it out of Experimental, provide use-cases like:

On servant server

On servant client

And then we can deprecate servant-auth and subsume its functionalities, provide migration guides (very important), and provide more documentation and guides with the new generalised auth schemes.

Indeed, it feels very "servant" to provide a maximum of informations at the type-level, but sometimes they are not actionable without a typeclass that provides term-level code. It would seem more efficient and direct to take a direction where we encode what we can at the type-level, and simplify API authoring for our end-users by doing the meat of things in the term level.

alpmestan commented 2 years ago

As stated before, I agree that generalized auth is a good starting point, that we need to tweak to make some of the improvements that we've discussed so far (separate "getting auth data" from "verify auth data", ability to easily combine several schemes, allow effectful things on the client side, maybe some others), while providing more schemes out of the box, along with their building blocks so that people can reuse them.

Vlix commented 2 years ago

I might be a bit too new to servant to immediately get this, but what is this "generalized auth" you all are talking about? It sounds like it's a different library from the way people mention it?

alpmestan commented 2 years ago

https://docs.servant.dev/en/stable/tutorial/Authentication.html#generalized-authentication

It's the auth machinery that ships with servant. Whereas servant-auth defines its own, and came with a different take on the problem domain.

Vlix commented 2 years ago

Ah, it is labeled as EXPERIMENTAL everywhere, which is why I didn't pay much attention to it. :thinking:

Vlix commented 2 years ago

I've been playing around and have something that technically only needs a Request -> Handler (NewAuthResult a) in the context.

The auths aren't used at all (yet), but I think this is a step in the right direction? Now to improve the API to be able to have different ways to do the same thing and make it easy to use.

e.g. define NewAuth '[Required, Strict] '[JwtHeader "Authorization", JwtQueryParam "token"] MyType and only having to implement some form of FromJwt MyType and have it check both the header and the query parameter and just take care of the parsing etc.

Any ideas, comments, critique?

data NewAuth (mods :: [Type]) (auths :: [Type]) (a :: Type)
    deriving (Typeable)

data NewAuthResult a
    = Absent
    | Failed String
    | Success a

type CheckLenient mods a = If (FoldLenient mods) (Either String a) a
type CheckOptional mods a = If (FoldRequired mods) a (Maybe a)
type AuthReturn mods a = CheckOptional mods (CheckLenient mods a)

instance
    ( HasServer api ctxs
    , HasContextEntry ctxs (AuthHandler Request (NewAuthResult a))
    , SBoolI (FoldRequired mods)
    , SBoolI (FoldLenient mods)
    ) => HasServer (NewAuth mods auths a :> api) ctxs where
    type ServerT (NewAuth mods auths a :> api) m =
        AuthReturn mods a -> ServerT api m
    hoistServerWithContext _ pc nt s =
        hoistServerWithContext (Proxy @api) pc nt . s
    route _ context subserver =
        route (Proxy @api) context $
            addAuthCheck subserver authCheck
      where
        authCheck :: DelayedIO (AuthReturn mods a)
        authCheck = withRequest $ \req -> do
            eRes <- liftIO . runHandler $ unAuthHandler authHandler req
            either delayedFail (fromAuthResult (Proxy @mods)) eRes
        authHandler :: AuthHandler Request (NewAuthResult a)
        authHandler = getContextEntry context

fromAuthResult :: forall mods a.
    (SBoolI (FoldRequired mods), SBoolI (FoldLenient mods)) =>
    Proxy mods ->
    NewAuthResult a ->
    DelayedIO (AuthReturn mods a)
fromAuthResult _ result =
    case (result, sbool :: SBool (FoldRequired mods)) of
        (Absent, SFalse) -> pure Nothing
        (Failed s, SFalse) -> Just <$> checkLenient (Left s)
        (Success a, SFalse) -> Just <$> checkLenient (Right a)
        (Absent, STrue) -> delayedFail err401
        (Failed s, STrue) -> checkLenient (Left s)
        (Success a, STrue) -> checkLenient (Right a)
  where
    checkLenient :: Either String a -> DelayedIO (CheckLenient mods a)
    checkLenient esa =
        case (esa, sbool :: SBool (FoldLenient mods)) of
            (Left _, SFalse) -> delayedFail err401
            (Right a, SFalse) -> pure a
            (_, STrue) -> pure esa
tchoutri commented 2 years ago

Update: I have been working (with the help of @divarvel) on a term-level workflow for endpoints that allow optional authentication and endpoints that require authenticated callers.

@Vlix The solution I'm headed towards will not use type-level indicators of leniency, but rather term-level natural transformations. There are several advantages to this approach, especially in terms of understanding the underlying mechanisms, little-to-no typeclasses involved and reducing the entry barrier for this kind of feature set.

Starting mid-january I'll put my focus on providing user support for all this, and help with improving the existing cookbooks.

Edit: Regarding your snippets, while it makes use of some nice type families, they are atrocious to compile and I fear that promoting them on yet-another-usecase (authentication) may lead to an explosion of resources needed at compile-time. It's less a critique of your code itself and more an observation of the drawbacks of the mechanisms that you use.

Vlix commented 2 years ago

I'd like to see what you come up with. 😃

I've taken this approach mainly because that's how servant seems to do things (i.e. ReqBody' mods ctypes a).

If you could explain the "explosion of resources", though. I'd be much obliged. I don't see this being any more expensive than, for example, ReqBody' or QueryParams'. 🤷

Next step would probably be a:

class X auth a where
  handler :: AuthHandler Request (NewAuthResult a)

which users could use to be able to make their own auth checking for if necessary, and some could be provided (e.g. JWTHeader etc.)

(I did notice type RequestArgument already exists... so AuthReturn is redundant)

tchoutri commented 2 years ago

I've taken this approach mainly because that's how servant seems to do things (i.e. ReqBody' mods ctypes a).

You are absolutely right, but then it has two negative effects:

Next step would probably be a [class] which users could use to be able to make their own auth checking for if necessary, and some could be provided (e.g. JWTHeader etc.)

Again, no need to provide typeclass interfaces for this (yet?). The best we can do right now is to provide functions that take care of the logic of retrieving information, requiring data needed for authentication and validating everything.

For example, here is how it works in Flora:

Auth.hs ```haskell module FloraWeb.Server.Auth where type instance AuthServerData (AuthProtect "cookie-auth") = Session -- | Datatypes used for every route that doesn't *need* an authenticated user type FloraPageM = ReaderT Session Handler data Session = Session { sessionId :: PersistentSessionId , mUser :: Maybe User , floraEnv :: FloraEnv } deriving stock (Show, Generic) -- | Datatypes used for routes that *need* an authenticated user type FloraAdminM = ReaderT ProtectedSession Handler data ProtectedSession = ProtectedSession { sessionId :: PersistentSessionId , user :: User , floraEnv :: FloraEnv } deriving stock (Generic) authHandler :: FloraEnv -> AuthHandler Request Session authHandler floraEnv = mkAuthHandler handler where pool = floraEnv ^. #pool handler :: Request -> Handler Session handler req = do let cookies = getCookies req mPersistentSessionId <- getSessionId cookies mPersistentSession <- fetchSession pool mPersistentSessionId mUserInfo <- getUser pool mPersistentSession (mUser, sessionId) <- do nSessionId <- liftIO newPersistentSessionId case mUserInfo of Nothing -> pure (Nothing, nSessionId) Just (user, userSession) -> pure (Just user, userSession ^. #persistentSessionId) pure Session{..} getCookies :: Request -> Cookies getCookies req = maybe [] parseCookies (List.lookup hCookie headers) where headers = requestHeaders req getSessionId :: Cookies -> Handler (Maybe PersistentSessionId) getSessionId cookies = case List.lookup "flora_server_session" cookies of Nothing -> pure Nothing Just i -> case PersistentSessionId <$> UUID.fromASCIIBytes i of Nothing -> pure Nothing Just sessionId -> pure $ Just sessionId fetchSession :: Pool Connection -> Maybe PersistentSessionId -> Handler (Maybe PersistentSession) fetchSession _pool Nothing = pure Nothing fetchSession pool (Just persistentSessionId) = do result <- runExceptT $ liftIO $ withPool pool $ getPersistentSession persistentSessionId case result of Left _ -> throwError err500 Right Nothing -> pure Nothing Right (Just userSession) -> pure $ Just userSession getUser :: Pool Connection -> Maybe PersistentSession -> Handler (Maybe (User, PersistentSession)) getUser _ Nothing = pure Nothing getUser pool (Just userSession) = do user <- lookupUser pool (userSession ^. #userId) pure $ Just (user, userSession) lookupUser :: Pool Connection -> UserId -> Handler User lookupUser pool uid = do result <- runExceptT $ liftIO $ withPool pool $ getUserById uid case result of Left _ -> throwError (err403 { errBody = "Invalid Cookie" }) Right Nothing -> throwError (err403 { errBody = "Invalid Cookie" }) Right (Just user) -> pure user ```

In a sub-tree of routes that need to ensure that our user is authenticated to access the resources:

Server/Pages.hs ```haskell type Routes = ToServantApi Routes' data Routes' mode = Routes' { home :: mode :- Get '[HTML] (Html ()) , about :: mode :- "about" :> Get '[HTML] (Html ()) , admin :: mode :- "admin" :> Get '[HTML] (Html ()) , login :: mode :- "login" :> Sessions.Routes , packages :: mode :- "packages" :> Packages.Routes } deriving stock (Generic) server :: ToServant Routes' (AsServerT FloraPageM) server = genericServerT Routes' { home = homeHandler , about = aboutHandler , admin = ensureUser adminHandler , login = Sessions.server , packages = Packages.server } ensureUser :: FloraAdminM a -> FloraPageM a ensureUser adminM = do Session{sessionId, mUser} <- ask case mUser of Nothing -> renderError forbidden403 Just user -> withReaderT (\Session{floraEnv} -> ProtectedSession{..}) adminM homeHandler :: FloraPageM (Html ()) homeHandler = do let templateEnv = defaultTemplateEnv{displayNavbarSearch = False} render templateEnv Home.show aboutHandler :: FloraPageM (Html ()) aboutHandler = do render defaultTemplateEnv Home.about adminHandler :: FloraAdminM (Html ()) adminHandler = undefined ```

And in my Server.hs module, it is used like this:

Server.hs ```haskell module FloraWeb.Server where data Routes mode = Routes { assets :: mode :- "static" :> Raw , pages :: mode :- AuthProtect "cookie-auth" :> Pages.Routes } deriving stock (Generic) runServer :: FloraEnv -> IO () runServer floraEnv = do let server = genericServeTWithContext (naturalTransform floraEnv) floraServer (genAuthServerContext floraEnv) let warpSettings = setPort (fromIntegral $ httpPort floraEnv ) $ defaultSettings runSettings warpSettings server floraServer :: Routes (AsServerT FloraM) floraServer = Routes { assets = serveDirectoryWebApp "./static" , pages = \session -> hoistServer (Proxy @Pages.Routes) (\x -> withReaderT (const session) Pages.server x) } naturalTransform :: FloraEnv -> FloraM a -> Handler a naturalTransform env app = runReaderT app env genAuthServerContext :: FloraEnv -> Context '[AuthHandler Request Session] genAuthServerContext floraEnv = authHandler floraEnv :. EmptyContext ```

All this to say: We can have reliable auth mechanisms without the customs that servant-auth introduced before we had Generalized Auth. No need to introduce new typeclasses, new interfaces. Our primary needs are:

tchoutri commented 2 years ago

If you could explain the "explosion of resources", though. I'd be much obliged. I don't see this being any more expensive than, for example, ReqBody' or QueryParams'. shrug

It is not necessarily more expensive than ReqBody and QueryParams, but it adds to the compilation cost. If we want to avoid requiring more than 8GB of RAM to compile Servant APIs, we should avoid creating a whole new legion of type families and typeclasses for authentication when we have what we need at the terms level (which is much more optimised). :)

Vlix commented 2 years ago

Hmm, maybe an idea to do both the term level and type level solutions?

Type families might speed up significantly in the (near) future

alpmestan commented 2 years ago

A lot of the compilation time cost with servant APIs isn't due to type families themselves, has more to do with how servant stuffs are represented in Core and what the optimiser does on that, as you can see in https://github.com/haskell-servant/servant/issues/986

So while Sam's (and Adam, etc) work is great, I suspect it'll take a lot more than that to speed up compilation of servant-heavy modules. This line of work is likely to help quite a bit with speeding up things like servant-flatten. A little type family or typeclass here and there in those WIP revisted auth designs are not gonna add such a huge burden I suspect.

Vlix commented 2 years ago

A lot of the compilation time cost with servant APIs isn't due to type families themselves, has more to do with how servant stuffs are represented in Core and what the optimiser does on that, as you can see in #986

I ran the same benchmark they did in #986 and with NamedRoutes, it seems it's not quadratic anymore 😀

tchoutri commented 2 years ago

@Vlix Could you maybe share the data from your benchmarks? :) Or a visual representation of the difference between NamedRoutes and the other techniques?

Vlix commented 2 years ago

@tchoutri Added it in an extra comment in the original issue. (#986) :+1:

Vlix commented 2 years ago

I've also made a PoC of a type families based Auth handling -> #1560 If anyone has any feedback, that'd be great.

EDIT: Seems I misappropriated the speed up to NamedRoutes, it was actually just the new 0.19 version of servant(-server)

jasonzoladz commented 2 years ago

Please don't deprecate/remove generalized auth. We rely on it. :)

Auth is precisely the sort of thing that users need to customize without constraints. If anything, it would be great to see generalized auth moved out of experimental status.

Other more 'blessed' solutions can coexist alongside generalized auth.

tchoutri commented 2 years ago

Generalised Authentication is not going anywhere, I can assure you of it. :)