sneako / finch

Elixir HTTP client, focused on performance
MIT License
1.23k stars 114 forks

Finch.stream, again #236

Closed · hissssst closed this 1 day ago

hissssst commented 11 months ago

Given what was said in #235, it would be nice to have a function

Finch.stream(req)

Which would

  1. Return a Stream.t()
  2. And this stream would get lazily read from socket upon demand
  3. Each item of the stream is {:status, pos_integer()} | {:header, name :: binary(), value :: binary()} | {:body, binary()}
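For illustration, consuming such a hypothetical `Finch.stream/1` might look like this (the function does not exist today; the URL is a placeholder):

```elixir
# Hypothetical usage of the requested API; Finch.stream/1 does not exist.
req = Finch.build(:get, "https://example.com/large-file")

req
|> Finch.stream()
|> Enum.each(fn
  {:status, status} -> IO.puts("status: #{status}")
  {:header, name, value} -> IO.puts("#{name}: #{value}")
  {:body, chunk} -> IO.puts("#{byte_size(chunk)} body bytes")
end)
```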
josevalim commented 11 months ago

Due to connection pooling and the state in the connection, it is not trivial to support this type of streaming. If you want lazy streaming, then Finch.stream/5 is the best option for now. I think this can be closed as I am almost sure this was discussed in the past. :)
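For reference, a minimal sketch of the existing `Finch.stream/5` callback API (assuming a pool was started as `{Finch, name: MyFinch}` in a supervision tree; the URL is a placeholder):

```elixir
req = Finch.build(:get, "https://example.com/large-file")

# The callback receives {:status, _}, {:headers, _} and {:data, _} tuples
# together with the accumulator, as the response is read from the socket.
Finch.stream(req, MyFinch, [], fn
  {:status, status}, acc -> [status | acc]
  {:headers, headers}, acc -> [headers | acc]
  {:data, chunk}, acc -> [chunk | acc]
end)
```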

sneako commented 11 months ago

Thanks for chiming in @josevalim !

hissssst commented 11 months ago

@josevalim Yes, we've discussed this before, but last time the issue was closed because of a new approach to streaming. That approach is discussed in #235 and it is an antipattern. I still demand a proper streaming API, since the existing callback-based solution requires me to re-copy every part of the body to form a stream using send/receive, which is just unnecessary (and avoiding such copies is the very reason solutions like Mint and NimblePool exist in the first place).

So, even if you don't have a solution right now, or aren't planning to implement one in the future, it is definitely worth keeping this issue open to track progress and ideas in case someone has time to implement it. @sneako, please reopen.

My first suggestion

  1. The main problem here is that NimblePool.checkout! accepts a fun, so I'd suggest implementing separate manual_checkout and manual_checkin functions that a developer can call directly

  2. With manual checkout/checkin functions in NimblePool, we'll need to implement something like

    conn = NimblePool.manual_checkout(...)

    do_stream(conn)
    |> Stream.concat(
      Stream.resource(
        fn -> :ok end,
        fn acc -> {:halt, acc} end,
        fn _acc -> NimblePool.manual_checkin(conn) end
      )
    )

My second suggestion

owner = self()

holder =
  spawn_link(fn ->
    NimblePool.checkout!(..., fn conn ->
      # hand the connection to the owner, then block until released
      send(owner, conn)

      receive do
        :release -> :ok
      end
    end)
  end)

conn =
  receive do
    conn -> conn
  end

do_stream(conn)
|> Stream.concat(
  Stream.resource(
    fn -> :ok end,
    fn acc -> {:halt, acc} end,
    fn _acc -> send(holder, :release) end
  )
)

It is missing timeouts, refs, and other details; my point is just to share the idea.

josevalim commented 11 months ago

As the creator of NimblePool, I don't see how to implement solution 1 without adding tons of complexity to the project. I will be glad to be proven wrong.

And solution 2 could easily be added on top of the stream/5 API, but given that it spawns a new process, I don't see the point of having it as a first-class citizen in Finch.

One alternative is to add an option to async_request such as mode: :active | :receive. In :receive mode, you would need to explicitly call Finch.next_async/1 to allow the next message to be sent. However, I don't see a point in wrapping it in a stream either. (cc @zachallaun)
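A purely hypothetical sketch of what the proposed mode might look like; neither the `mode: :receive` option nor `Finch.next_async/1` exists in Finch today, and the URL and pool name are placeholders:

```elixir
# Hypothetical :receive mode: messages arrive one at a time, on demand.
req = Finch.build(:get, "https://example.com/large-file")
ref = Finch.async_request(req, MyFinch, mode: :receive)

receive do
  {^ref, {:status, status}} ->
    IO.puts("status: #{status}")
    # in :receive mode, explicitly ask for the next message
    Finch.next_async(ref)
end
```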

Of course, it is @sneako's project to manage, but again, I don't see a reason to keep open a feature request that has been extensively discussed before and for which we have no known acceptable solutions. I would rather focus on actionable ones. Regardless, as open source users, we are not in a position to demand anything. At best we can ask politely and/or fork. :)

zachallaun commented 11 months ago

I've been thinking about this since seeing these issues raised by @hissssst a few days ago and here are some thoughts:

hissssst commented 11 months ago

@josevalim

And solution 2 could easily be added on top of the stream/5 API but, given it spawns a new process, I don't see a point of having it as first class citizen in Finch.

No, it can't. The only possible solution based on stream/5 is to send every message to an external process, which essentially copies the whole response one extra time. My solution only copies the stream resource itself, not the whole response body, and that's the key difference here.
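For context, the send/receive workaround being discussed looks roughly like this (MyFinch and the URL are placeholders); the extra copy happens at each send/2 call, since message passing copies data between process heaps:

```elixir
owner = self()
req = Finch.build(:get, "https://example.com/large-file")

spawn_link(fn ->
  # Forward every part of the response to the owner; each send/2 across
  # the process boundary copies the data one extra time.
  Finch.stream(req, MyFinch, :ok, fn part, acc ->
    send(owner, {:part, part})
    acc
  end)

  send(owner, :done)
end)

stream =
  Stream.resource(
    fn -> :ok end,
    fn acc ->
      receive do
        {:part, part} -> {[part], acc}
        :done -> {:halt, acc}
      end
    end,
    fn _acc -> :ok end
  )
```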

josevalim commented 11 months ago

Ah, I missed that. But even that assumes the connection will be happy to be transferred to another process and that there will be no leakage in case of errors. Feel free to explore that, it may provide an alternative path we haven't considered, but please double check both NimblePool and Mint will be happy with the resource transfer under several different scenarios. Then it can even be the sketch of a PR we could all collaborate on.

hissssst commented 10 months ago

@sneako , if you reopen this, I will provide a PR for the issue in a few days or so.

josevalim commented 9 months ago

@hissssst ping? :)

hissssst commented 9 months ago

Pong. I was on vacation and switching jobs. This issue is somewhere in my TODO list, but I won't have free time for it in the next two weeks or so.

dlindenkreuz commented 2 days ago

@hissssst 🛎️ it would be great to have this some day :)