xyncro / freya-types

Freya Web Stack - Types and Parsers
https://freya.io
Other
1 stars 8 forks source link

Allow access to raw output stream #3

Open neoeinstein opened 7 years ago

neoeinstein commented 7 years ago

Currently Representation only allows for supplying a byte array, which is then streamed out over the wire. This means that the byte array must be pre-constructed. This is not always convenient or efficient, especially for larger replies.

Proposal: Alter Representation by changing the type of the Data field from byte array to ResponseData. ResponseData would be defined as:

type ResponseData =
  | Bytes of byte array
  | DelayedStream of Stream -> Async<unit> // Job<unit> for Hopac

This will allow for streaming responses, which may mean less memory pressure in the long run. (I see an opportunity for Chiron to be able to stream its data to the wire in this way.)

This would be a breaking change, requiring the addition of the word Bytes to existing code that explicitly creates Representation records.

et1975 commented 7 years ago

Would love some way to stream large content and at the moment I can't find a way to do this correctly. I'm not sure what my impl would look like given the Stream -> Async<unit> signature though or how would it fit with the rest of freya computations... could use an example.

panesofglass commented 7 years ago

In general, I'm a big fan of this change. I wonder, though, whether the DelayedStream has the right signature. I would also like to see an example. Might a Freya computation be a better abstraction to return?

neoeinstein commented 7 years ago

The degenerate implementation of a delayed stream would take a simple byte array. When using ByteArray, the default output for the representation would need to set a Transfer-Encoding: chunked header, since the content length won't be known before sending the content.

let degenerateCase bytes (stream: System.IO.Stream) =
  async {
    do! stream.WriteAsync(bytes, 0, bytes.Length) |> Async.AwaitIAsyncResult |> Async.Ignore
  }

let represent bytes =
  { Data = DelayedStream (degenerateCase bytes)
    Description = { … } }

Substitute degenerateCase with some function that takes your DTO and sends it to a stream, and you have a delayed write.

neoeinstein commented 7 years ago

@panesofglass Not sure that it would. It could be something that also gets some of the Freya state, but that would complicate the signature when we could just rely on the caller injecting the necessary dependencies through partial application.

Once we begin writing to the stream, the status and headers get sent, so all attributes of the response are no longer editable. It doesn't make much sense to expose them here if the consumer can't really use them.

et1975 commented 7 years ago

since the content length won't be known before sending the content.

I don't know if this would be universally true, in my current scenario it is known (but large).

rely on the caller injecting the necessary dependencies through partial application.

how would this work if params injected are freya computations themselves?

neoeinstein commented 7 years ago

@et1975 On content length, we could have DelayedStream take an additional ContentLength = KnownLength of uint64 | UnknownLength parameter. If a KnownLength, it would give a Content-Length header. If not, it would emit a Transfer-Encoding: chunked header.

On injections from freya computations, you can provide them in the freya {…} computation where you create the representation.

let handleOk =
  freya {
    let! method = Freya.Optic.get Request.method_
    return
      { Data = DelayedStream (useMethodInOutputStream method)
        Description = {…} }
  }
neoeinstein commented 7 years ago

/cc @ajgajg1134 from my own team

btrepp commented 7 years ago

Is the motivation to get memory usage down (eg remove preallocation? eg byte[] seq) or to have the server send more push-like data? (Async<byte[]> seq or some sort of Rx/FRP/Event based system?.)

Usually in my mind chunked-encoding looks like type ChunkedData = byte[] seq Which would satisfy processing a chunk at a time, with a seq based serialize, plus its easy enough to fall back/have Freya handle the encoding automatically.

Also https://tools.ietf.org/html/rfc2616#section-3.6. Implies that headers MAY be written after the body

      Chunked-Body   = *chunk
                        last-chunk
                        trailer
                        CRLF

       chunk          = chunk-size [ chunk-extension ] CRLF
                        chunk-data CRLF
       chunk-size     = 1*HEX
       last-chunk     = 1*("0") [ chunk-extension ] CRLF

       chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
       chunk-ext-name = token
       chunk-ext-val  = token | quoted-string
       chunk-data     = chunk-size(OCTET)
       trailer        = *(entity-header CRLF)

This stack overflow has an example of it https://stackoverflow.com/questions/5590791/http-chunked-encoding-need-an-example-of-trailer-mentioned-in-spec

I think if it's a more event-y (eg Server Sent Events thing) maybe it should be outside of Representation, and use the static overloads to allow handleX to use it.

Eg It now takes Freya or Representation. So maybe it can take Freya or the like. That way existing code doesn't break.

et1975 commented 7 years ago

Since we are looking at breaking changes, it would be ideal if it could also resume correctly (support range headers). @neoeinstein would that still fit into Representation?