haskell-servant / servant

Servat is a Haskell DSL for describing, serving, querying, mocking, documenting web applications and more!
https://docs.servant.dev/
1.83k stars 413 forks source link

Support multipart/form-data encoding in servant-client #884

Open ptitfred opened 6 years ago

ptitfred commented 6 years ago

An experimental support server side is already there with https://github.com/haskell-servant/servant-multipart following issue #133 . It would be cool to support this for client derivation too.

I'd be happy to help even though I'm no contributor yet, just a user.

alpmestan commented 6 years ago

Hello @ptitfred :)

I'll walk you through the essential bits of servant-client/servant-client-core that have something to do with this task.

The first thing is the HasClient class:

class RunClient m => HasClient m api where
  type Client (m :: * -> *) (api :: *) :: *
  clientWithRoute :: Proxy m -> Proxy api -> Request -> Client m api

Just like with the server side, we "compute" the type of client functions through Client, but most importantly (as far as actually building HTTP requests is concerned), clientWithRoute takes a Request argument, which is just a special case of RequestF which is a record that synthesizes everything we care about for building an HTTP request: path, headers, body, etc.

The key idea behind the implementation of servant-client is that while we traverse say type API = "foo" :> Get '[JSON] String to figure out what types the clients should have and what the client functions should do, we get to find out what we need by considering the effect of each combinator in the API type. Indeed, the HasClient instances for almost everything but the Get, Post, etc combinators are all of the form instance (..., HasClient m subapi) => HasClient m (SomeCombinator arg1 arg2 :> subapi) where .... Given that we know how to deal with the subapi API type, we know how to deal with that API prefixed with SomeCombinator. And the implementation of clientWithRoute is therefore "recursive", in that we t ake the Request argument, tweak it to add some data that we get from SomeCombinator (a path fragment, a capture, a request body etc) and then call clientWithRoute for subapi, giving it as its Request argument not the one that our function got, but the tweaked version that takes into account SomeCombinator.

A good example is the instance for static path fragments, that we introduce with type-level strings:

instance (KnownSymbol path, HasClient m api) => HasClient m (path :> api) where
  type Client m (path :> api) = Client m api

  clientWithRoute pm Proxy req =
     clientWithRoute pm (Proxy :: Proxy api)
                     (appendToPath p req)

    where p = pack $ symbolVal (Proxy :: Proxy path)

We update the Request to append the static string to the path and then proceed with the rest of the API type. The Request that we get after having triggered all the HasClient instances relevant to our API type (at the end of the chain of recursive-but-at-a-different-type calls) is what will be used to build an http-client request value that we then send to some server.

OK, with all of this in mind, what's standing in the way of multipart support? First, http-client supports it see the dedicated module so it's all about wiring all of that up to servant-client[-core]. Now... if you take another look at the Request/RequestF data types (here), you can see that our multipart data would have to go in the requestBody :: Maybe (RequestBody, MediaType) field, where we would use a suitable MediaType value for multipart/form-data (might not be all that trivial, doesn't multipart include the separator in the content type or something like that?) and then fill the request body with the content of the files and possible textual inputs encoded suitably.

So, what's the big deal then? Well, if you take a look at RequestBody, you can see that it can only represent request bodies as lazy bytestrings, which is a no-go for us, as clientWithRoute doesn't allow us to do any IO so we can't just read the contents of the files, stick it all in our Request and move on with the rest of the API type. We really need another constructor I think, one which can contain all the data needed to produce the multipart request body. And then we'd have to change the suitable places to handle this new constructor correctly when converting Requests from servant-client-core to http-client requests. And once we have that, then we could write something close to:

instance (ToMultipart a, HasClient m subapi) => HasClient m (MultipartForm backend a :> subapi) where
  type Client m (MultipartForm backend a :> subapi) = a -> Client m subapi
  clientWithRoute mproxy apiproxy req multipartData = clientWithRoute mproxy (Proxy @ subapi) (addMultipartRequestBody multipartData req)

assuming we introduce a suitable ToMultipart class and implement the key function here, addMultipartRequestBody :: ToMultipart a => a -> Request -> Request. This function is precisely the piece that we're missing now AFAICT, along with the suitable changes to RequestBody etc.

Let me know if anything's not clear or if you have any question :)

ptitfred commented 6 years ago

Thank you for this detailed walk-through. Very instructive and it demystifies the HasClient instances that are quite horrifying on haddock.

First, http-client supports it see the dedicated module so it's all about wiring all of that up to servant-client[-core].

I have already used it to solve my main use-case. So let's say I'm comfy with that part at least ^^.

Few questions:

alpmestan commented 6 years ago

Is servant-client actually using http-types and http-client?

As you can see here, servant-client does use http-types and http-client. servant-client-core on the other hand does not depend on http-client, it's the backend agnostic part of what used to be servant-client. And now "servant-client" is the http-client specific part.

So, to answer:

I'm a bit confused by Servant.Client.Core.Internal.Request, how am I supposed to reuse Network.HTTP.Client.MultipartFormData then?

if we want to solve this well, we'd ideally like to have a representation for the request body that does not tie us to http-client's types. We could add a type variable to Request that would be instantiated to http-client's multipart types in the http-client backend but otherwise left abstract in servant-client-core. Or define a few core multipart types yourself, and have the http-client backend do a mapping between those types and http-client's. Doesn't matter much to me and I'm not sure I know what the best option. It probably requires someone to sit down and try one or two approaches and see how that goes. Other ideas might work better too. But yeah, the gist of it is that: 1/ we'd like to avoid using http-client's types in the backend agnostic part 2/ we can achieve this in various ways, not sure which one is better at this point. But we definitely need to extend the RequestBody representation to include something that can be mapped one way or another to http-client's multipart types.

Am I supposed to provide one/many instances of MimeRender (following the instance of HasClient for ReqBody)?

No, and similarly servant-multipart doesn't provide a MimeUnrender instance, the multipart stuffs unfortunately isn't just a simple content type but instead its own dedicated combinator, because multipart forces us to do a little more than just encoding/decoding (write file to /tmp, etc). So it would not make sense to have it be a content type on the client side. All we need is that HasClient instance.

Each part of a multipart might come with it's own content-type and "encoding" (that MimeRender provides if I'm not mistaken). Would that be an issue? Do we have to hide this subtleties in the ToMultipart instances?

No, it'd be up to the user, in the ToMultipart instance, to specify how to map from their data types to the format that spells out all the textual inputs and files, with associated mime. None of this would appear in the API type though.

Regarding IO support for requestBody building, should we consider the approach from http-client which proposes the RequestBodyIO (IO RequestBody) constructor for its own RequestBody? (I may be confusing concepts there, total noob there.)

Maybe! Maybe not. I'm not sure just yet. You could give a shot at this approach first and see how it goes. It's either something like this or just storing the textual inputs + paths to files & associated data, so that you can do the IO later, right before you send the request.

Hopefully this will help, but feel free to ask other questions any time and consider this issue as a whiteboard or something.

nmattia commented 6 years ago

Hi @ptitfred, I hit the same issue yesterday. Here's what I came up with, feel free to use this as inspiration:

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeOperators #-}

{-# OPTIONS_GHC -fno-warn-orphans #-}

{-|

  Copyright: 2017 (C) AlphaSheets, Inc
  Description: Client support for multipart.

-}

module Servant.Client.Multipart where

import Control.Monad
import Control.Monad.IO.Class (MonadIO(..))
import Data.Monoid
import Data.Sequence as Seq
import Network.HTTP.Client.MultipartFormData (webkitBoundary)
import Network.HTTP.Media.MediaType
import Network.HTTP.Types
import Servant hiding (Header)
import Servant.Client
import Servant.Client.Core as SCC
import Servant.Multipart
import System.FilePath

import qualified Data.ByteString as BS
import qualified Data.CaseInsensitive as CI
import qualified Data.Foldable
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Network.HTTP.Types as HTTP
import qualified Network.Mime as Mime

-- | A single part of a multipart message.
-- Adapted from @Network.HTTP.Client.MultipartFormData.Part@
data Part = Part
    { partName :: T.Text -- ^ Name of the corresponding \<input\>
    , partFilename :: Maybe String -- ^ A file name, if this is an attached file
    , partContentType :: Maybe Mime.MimeType -- ^ Content type
    , partHeaders :: [HTTP.Header] -- ^ List of additional headers
    , partGetBody :: IO BS.ByteString -- ^ Action in m which returns the body
                                   -- of a message.
                                   -- XXX: This is servant' requestbody
    }

-- | A class for encoding datatypes as 'Part's.
--  See 'Part', 'partInput', 'partFile'
class ToMultipart a where
  toMultipart :: a -> [Part]

-- A somewhat meaningful instance of 'HasClient' for multipart data. Note that
-- when a file 'Part' is encountered, the whole file will be loaded up in
-- memory. As such,
--
--  !!!!THIS SHOULD BE USED WITH CAUTION!!!!
--
-- See https://github.com/haskell-servant/servant/issues/886
--
instance (ToMultipart ty, Client m sub ~ m a, MonadIO m, HasClient m sub) =>
    HasClient m
      (MultipartForm Tmp ty :> sub) where
  type
    Client m
      (MultipartForm Tmp ty :> sub)
      = ty -> Client m sub
  clientWithRoute
    :: Proxy m
    -> Proxy (MultipartForm Tmp ty :> sub)
    -> Request
    -> ty -> Client m sub
  clientWithRoute pMonad _p2 req dfi = do
      -- Adapted from @Network.HTTP.Client.MultipartFormData.formDataBody@
      boundary <- liftIO webkitBoundary
      body <- liftIO $ renderParts boundary $ toMultipart dfi
      let req' = req
            { requestHeaders =
                Seq.filter (\(x, _) -> x /= hContentType) (requestHeaders req)
            , requestBody = Just
                ( RequestBodyLBS $ undefined body
                , "multipart" // "form-data" /: ("boundary", boundary)
                )
            }
      clientWithRoute pMonad (Proxy @ sub) req'

-------------------------------------------------------------------------------
-- XXX: The functions 'renderParts' and 'renderPart' were adapted from
-- @Network.HTTP.Client.MultipartFormData@.
-------------------------------------------------------------------------------

-- | Combine the 'Part's to form multipart/form-data body
renderParts :: BS.ByteString    -- ^ Boundary between parts.
            -> [Part] -> IO BS.ByteString
renderParts boundary parts = (fin . mconcat) `liftM` mapM (renderPart boundary) parts
  where fin = (<> "--" <> boundary <> "--\r\n")

renderPart :: BS.ByteString     -- ^ Boundary between parts.
           -> Part -> IO BS.ByteString
renderPart boundary (Part name mfilename mcontenttype hdrs get) =
    fmap render get
  where render renderBody =
            "--" <> boundary <> "\r\n"
         <> "Content-Disposition: form-data; name=\""
         <> TE.encodeUtf8 name
         <> (case mfilename of
                 Just f -> "\"; filename=\""
                        <> TE.encodeUtf8 (T.pack $ takeFileName f)
                 _ -> mempty)
         <> "\""
         <> (case mcontenttype of
                Just ct -> "\r\n"
                        <> "Content-Type: "
                        <> ct
                _ -> mempty)
         <> Data.Foldable.foldMap (\(k, v) ->
               "\r\n"
            <> CI.original k
            <> ": "
            <> v) hdrs
         <> "\r\n\r\n"
         <> renderBody <> "\r\n"

-- | Create a key-value 'Part'.
partInput :: T.Text -> BS.ByteString -> Part
partInput n b = Part
  { partName = n
  , partFilename = Nothing
  , partContentType = Nothing
  , partHeaders = []
  , partGetBody = pure b
  }

-- | Create a 'Part' for a local file. This does not infer the mime type of the
-- file, although one may be provided.
partFile
  :: T.Text
    -- ^ The part name
  -> FilePath
    -- ^ The path to the local file
  -> Maybe Mime.MimeType
    -- ^ Optional mime type
  -> Part
partFile n fp mmt = Part
  { partName = n
  , partFilename = Just fp
  , partContentType = mmt
  , partHeaders = []
  , partGetBody = BS.readFile fp
  }
jkarni commented 6 years ago

Shouldn't this be moved to the servant-multipart repo?

ptitfred commented 6 years ago

@jkarni I don't know, maybe @alpmestan can answer

alpmestan commented 6 years ago

I don't care, to be honest. :-)

Profpatsch commented 4 years ago

According to the changelog for servant-multipart, client support has been added in the last version: https://hackage.haskell.org/package/servant-multipart-0.11.5/changelog