jupyterhub / jupyter-server-proxy

Jupyter notebook server extension to proxy web services.
https://jupyter-server-proxy.readthedocs.io
BSD 3-Clause "New" or "Revised" License
347 stars 148 forks source link

feat: support stream api #479

Closed ganisback closed 2 weeks ago

ganisback commented 2 months ago

currently jupyter-server-proxy does not support the text/event-stream api. when we install https://github.com/hiyouga/LLaMA-Factory in the notebook instance, and access by : http://xxx.xxx/proxy/7860/ the normal request works fine, but the stream API never response. reference code: https://github.com/ideonate/jhsingle-native-proxy/blob/3e26a4ee0e7318970b7cf6abbd7d88455a9ac621/jhsingle_native_proxy/proxyhandlers.py#L217

ganisback commented 2 months ago

@yuvipanda please help review this feature

manics commented 2 months ago

@ganisback Thanks for your contribution! To help with review please could you expand the PR description to explain what this does, and why it's needed? We'll need some tests if this is going to be merged, but those can be added later if we agree this feature should be added.

ganisback commented 2 months ago

@manics description updated

yuvipanda commented 2 months ago

Ah interesting! I think @jmunroe also ran into this recently.

If my understanding is right, the existing code doesn't stream but waits for the entire upstream request to be completed before passing it on. And with this PR, we're special casing the text/eventstream to be streaming.

Instead, what do you think about modifying the existing code (what you have as _proxy_normal) to be streaming? I don't think there's any reason to not stream all requests - I think we initially waited for requests to complete simply because that was the easier thing to do. That would also help with tests - the current tests should catch that and if they pass, we can probably merge that.

So I propose that instead of only streaming eventstreams, we stream everything.

Can't do this, see https://github.com/jupyterhub/jupyter-server-proxy/pull/479#issuecomment-2177338362

yuvipanda commented 2 months ago

Ah damn, I just realized that won't work, because of our RewriteableResponse implementation :( That implementation's signature depends on being able to buffer requests, rather than being able to stream them. Oof :(

yuvipanda commented 2 months ago

As for testing, I suggest:

  1. Create a simple eventstream emitting server in https://github.com/jupyterhub/jupyter-server-proxy/tree/main/tests/resources (similar to the websocket one perhaps)
  2. Create tests in test_proxy similar to the websocket tests (https://github.com/jupyterhub/jupyter-server-proxy/blob/234a192793a6a46bc8df18f730ea0060e6d0c3e8/tests/test_proxies.py#L346).

This can be simpler than websockets, because eventstreams are simpler.

yuvipanda commented 2 months ago

ok, I've done one round of review @ganisback! Hope this helps.

@manics I think we should definitely add this feature! Ideally, I'd have liked us to make everything streaming, but that's incompatible with RewriteableResponse :( So making at least eventstreams (which by definition need to be streaming) streaming (and hence incompatible with RewriteableResponse) seems ok to me.

ganisback commented 2 months ago

OK, will address your comments in the weekend

yuvipanda commented 2 months ago

Hey @ganisback! Just wanted to check to see if there's anything else I can do to help you move this forward :) I just ran into this again in a different project!

ganisback commented 2 months ago

Hi @yuvipanda , I am blocking in testing case, can you help add the testing case?

yuvipanda commented 2 months ago

@ganisback it was slightly tricky, but I think I have a useful test for you at https://github.com/ganisback/jupyter-server-proxy/pull/1! It fails on main but passes on this branch!

LMK what you think of it. I haven't had time to look at your changes yet unfortunately but will do this coming week!

ganisback commented 2 months ago

@yuvipanda I increased resp time and added resp data checking, now the tests passed.

ganisback commented 1 month ago

@yuvipanda any other comments?

krassowski commented 1 month ago

Excited for this! Spent to much of my Saturday debugging why SSE was connected but not sending anything.

krassowski commented 1 month ago

@yuvipanda @manics is there anything else that needs to be done here to get it merged?

krassowski commented 1 month ago

Hi folks, just circling back here, is there any way I could help with this PR?

ganisback commented 4 weeks ago

Hi folks, just circling back here, is there any way I could help with this PR?

I'm just waiting for they merge this PR

yuvipanda commented 3 weeks ago

Sorry for dropping this, travel / vacation time :( I'll try to spend an hour on this later this week.

krassowski commented 2 weeks ago

It looks like pre-commit is failing because this PR is behind main branch as of now, for example it does not include the rawsocket.py file which was added last month (https://github.com/jupyterhub/jupyter-server-proxy/blob/main/jupyter_server_proxy/rawsocket.py)

As per https://results.pre-commit.ci/run/github/71295164/1723882819.1ODn2EWuTr2D6HI5NFt0pQ

image

krassowski commented 2 weeks ago

I opened a PR against this branch (https://github.com/ganisback/jupyter-server-proxy/pull/2) to fix pre-commit checks

yuvipanda commented 2 weeks ago

I still don't like using regexes for parsing headers, but I this is a clear improvement over status quo. So I've rebased this, and will merge. Apologies for the long time this one took, @ganisback!

I've opened https://github.com/jupyterhub/jupyter-server-proxy/issues/498 to handle removing the regexes.

krassowski commented 2 weeks ago

Amazing, thanks so much! Do you plan to cut a new release with it now or only after https://github.com/jupyterhub/jupyter-server-proxy/issues/498 gets addressed?

yuvipanda commented 2 weeks ago

498 doesn't need to block a release!

yuvipanda commented 2 weeks ago

If you make a PR following https://github.com/jupyterhub/jupyter-server-proxy/blob/main/RELEASE.md i'm happy to merge that :)

yuvipanda commented 2 weeks ago

Although i'm off from monday!