haskell-servant / servant

Main repository for the servant libraries — DSL for describing, serving, querying, mocking, documenting web applications and more!
https://docs.servant.dev/
1.8k stars 407 forks source link

Streaming response not streaming all values #1011

Open ocharles opened 5 years ago

ocharles commented 5 years ago

Given the following server, if I connect with curl I only see true, false. I expect to see true, false, true. The logging in the program does indicate that both False and a second True are sent (using yieldMany).

{-# language TypeApplications #-}

module Main where

import Servant
import Servant.Server
import RIO
import qualified Network.Wai.Handler.Warp

type API = StreamGet NewlineFraming JSON ( StreamGenerator Bool )

main :: IO ()
main = Network.Wai.Handler.Warp.run 8080 $ serve ( Proxy @API ) $ do
  chan <- newTChanIO
  traverse_ ( atomically . writeTChan chan ) [ True, False, True ]

  return $ StreamGenerator $ \yieldFirst yieldRest -> do
    atomically ( readTChan chan ) >>= yieldFirst

    forever $ do
      msg <- atomically ( readTChan chan )
      putStrLn ( " !!!!! " <> show msg )
      yieldRest msg
      putStrLn "OK"
gbaz commented 5 years ago

I suspect that what's happening is curl isn't showing all the data received with byte-buffering. Rather it is line-buffering, and it only gets the newline as the leading bit when the next piece of data is sent.

If the streamgenerator didn't hang waiting on input that wasn't forthcoming but instead closed the connection properly, then you'd get the last line, I think.

ocharles commented 5 years ago

It's not just curl, @alpmestan could reproduce this using a stock servant-client.

ocharles commented 5 years ago

Also, it's not that the very last message isn't sent, if I later unblock chan by writing more values, it will always be a message out of sync. I imagine forkIO (forever (getLine >>= atomically . writeTChan chan)) would essentially show this (I can try and update the repro tomorrow if it helps)

alpmestan commented 5 years ago

(The servant-client repro in question: https://gist.github.com/alpmestan/011ed92460e40f3b2a618e2d651ecbdb -- with the suitable magic at the top to be chmod +x'd and executed. Nix & runhaskell will do the rest, it's completely self-contained)

gbaz commented 5 years ago

Right. But you can see what's happening with netcat

$ printf "GET / HTTP/1.0\r\nHost: localhost\r\n\r\n" | nc localhost 8080 HTTP/1.0 200 OK Date: Thu, 19 Jul 2018 03:26:24 GMT Server: Warp/3.2.23 Content-Type: application/json;charset=utf-8

true false true

Note that with netcat, while the connection continues, you can actually see the cursor blinking at the end of the last true line. So the real issue is that this transport only emits newlines at the start of the next thing to get proper interleaving, and this doesn't interact with streaming infinite data to clients that use line buffering. That's because newlines are frame boundaries in this transport, not frame terminators.

Checking the ndjson spec (https://github.com/ndjson/ndjson-spec) it appears that the latter behavior would me more correct for that case -- and thus, perhaps, in general. It would be easy enough to change.

ocharles commented 5 years ago

Alright, I don't entirely understand the semantics but put it this way - the behavior in Alp's example that combines servant-client is ultimately not what I want. When things get written to the stream, they should reach the client without a connection hang up or another element needing to be written to the stream.

For motivation, my use case is to live stream a log from a long running process on the server side back to the client. I'm using servant-client as transport to a back end in a CLI app. Right now, operators don't see the very last long line, which can be very confusing - especially when the last log line mentions that something long running has begun executing. The last line they see might just be Done!

phadej commented 5 years ago

I'd suggest writing own NewlineFramingOcharles which would append \n and not intersperse. If that works - you have a solution.

I'm not keen for anyone of servant contributors to spend time investigating this until #991 is landed, as than it's way more easy. (Sorry for lingering that PR too)

gbaz commented 5 years ago

Or try netstring framing, which should be more robust?

ocharles commented 5 years ago

netstring framing didn't work at all when I tried it (I got a decoding error after about 3 JSON messages), but I'll try both suggestions and report back when I get into the office.

ocharles commented 5 years ago

Ok, a NewlineTerminating framing strategy gets me closer in that curl sees messages as I intend. However, it still doesn't work properly with servant-client. From my understanding of this function, I will only see one element of the stream per block-on-data.

To provide an example, we connect to the server and issue the request. The server begins yielding data - a JSON object and then a newline, and then blocks. When the client demands some data, go is called in the above linked function, which ultimately calls processResult <$> modifyMVar state frameLoop. This demands data from the server (building up an accumulating value), and tries to parse a single value. Once that single value is parsed, it is returned to the client. Now the server sends 2 json objects (with newlines), and our client demands another value. The bytes are fetched from the server, and we can parse a frame so we return it. Now the server is idle, and the client wants a third frame. If I try and fetch this, I'll call go, which will block on more data from the server... despite the fact that we have another frame sat ready to go from the last chunk of data given to us!

At this point, I really am stuck. I will explore making my own instance OVERLAPPABLE_ ( RunClient m, MimeUnrender ct a, ReflectMethod method, FramingUnrender framing a, BuildFromStream a (f a) ) => HasClient m (Stream method status framing ct (f a)) where by newtyping Stream tomorrow, and I'll reimplement this function to try and empty the accumulation buffer before blocking on more data.

phadej commented 5 years ago

@ocharles I see. I'm quite sue that #991 will solve the servant-client problem. It should be already usable, it would be very nice and valuable if you could try it out.

ocharles commented 5 years ago

Certainly! I'll take a look

On Thu, 19 Jul 2018, 8:44 pm Oleg Grenrus, notifications@github.com wrote:

@ocharles https://github.com/ocharles I see. I'm quite sue that #991 https://github.com/haskell-servant/servant/pull/991 will solve the servant-client problem. It should be already usable, it would be very nice and valuable if you could try it out.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/haskell-servant/servant/issues/1011#issuecomment-406391963, or mute the thread https://github.com/notifications/unsubscribe-auth/AABRjjfMlsaj8K2zz25FpB5jEGRp0oXQks5uIOGFgaJpZM4VUOaz .

ocharles commented 5 years ago

991 alone isn't a fix, but it is a fix when combined with:

data NewlineTerminated

instance FramingRender NewlineTerminated where
    framingRender _ f = mapStepT go where
        go Stop        = Stop
        go (Error err) = Error err
        go (Skip s)    = Skip (go s)
        go (Yield x s) = Yield (f x <> "\n") (go s)
        go (Effect ms) = Effect (fmap go ms)

instance FramingUnrender NewlineTerminated where
    framingUnrender _ f = transformWithAtto $ do
        bs <- A.takeWhile (/= 10)
        () <$ A.word8 10 <|> A.endOfInput
        either fail pure (f (LBS.fromStrict bs))

At this point, everything yielded in the server shows up in the client, without requiring another message.

gbaz commented 5 years ago

Great! I think that newline terminated is probably an overall correct replacement for newline framing. The primary usecase for newline framing was newline separated json, and the spec does seem like it it specifies terminated over interleaved for that.

The same issue arises for e.g. line-by-line csvs, but again, I think that terminated vs. interleaved will give overall better behavior.

phadej commented 5 years ago

I think I'll just change NewlineFraming in the next release (after #991 is landed) to what @ocharles pointed out. We can still have NewlineInterleavedFraming if someone really wants that

phadej commented 5 years ago

Note to self, test without <|> A.endOfInput as we can (should?) require terminator after each message

domenkozar commented 5 years ago

Is this fixed in 0.15?