jupyterhub / binderhub

Run your code in the cloud, with technology so advanced, it feels like magic!
https://binderhub.readthedocs.io
BSD 3-Clause "New" or "Revised" License
2.54k stars 388 forks source link

Do not escape query included in url launch parameter #1774

Closed manics closed 11 months ago

manics commented 11 months ago

I'm not sure if this test is correct, but I think it illustrates the bug in https://github.com/jupyterhub/nbgitpuller/issues/329 https://discourse.jupyter.org/t/nb-on-binder-with-parameters-now-resulting-in-404/21860

 FAIL  js/packages/binderhub-client/lib/index.test.js
  ● Get full redirect URL and deal with query and encoded query (with pathType=url)

    expect(received).toBe(expected) // Object.is equality

    Expected: "https://hub.test-binder.org/user/something/endpoint?a=1/2&b=3%3F%2F&token=token"
    Received: "https://hub.test-binder.org/user/something/endpoint%3Fa=1/2&b=3%3F%2F?token=token"

      179 |       )
      180 |       .toString(),
    > 181 |   ).toBe(
          |     ^
      182 |     "https://hub.test-binder.org/user/something/endpoint?a=1/2&b=3%3F%2F&token=token",
      183 |   );
      184 | });

      at Object.toBe (js/packages/binderhub-client/lib/index.test.js:181:5)
yuvipanda commented 11 months ago

@manics I opened a PR to this PR fixing this: https://github.com/manics/binderhub/pull/8. Well, almost fixing it - you'll see it fails currently because the unencoded / in path does get encoded when we use URL. I'm not sure unencoded / are allowed in queryparams, but clearly our prior implementation did allow them.

So our two options are:

  1. Determine that encoding any / in query params is ok
  2. Use another method that strictly fully preserves path

I think (1) is ok, and we can sort of validate this by looking at the two reported URLs that had problems, and perhaps also look at our analytics event streams for paths with unencoded / in them. What do you think?

manics commented 11 months ago

First of all was my test definitely correct? I wasn't completely sure whether the path (including query) argument of getFullRedirectURL was meant to be encoded or not. If / is meant to be encoded in a query then I think let's assume it is for now since browsers often auto encode/decode some things even something doesn't quite follow the spec.

The main problem is the leading ? being encoded which it looks like you've fixed!

Edit: Feel free to rebase and force push if you want, otherwise I'll do it later

manics commented 11 months ago

Rebased

consideRatio commented 11 months ago

@manics could you update the title of this PR as well to reflect its more than just test related now?

manics commented 11 months ago

Done!

yuvipanda commented 11 months ago

I added an additional test specifically using nbgitpuller, and showing that URL encoding seems to be done now just the right amount, rather than double or none.

yuvipanda commented 11 months ago

ok, I tested this locally. You can first see me try an nbgitpuller link on the main branch, and it fails. Then I switch to this branch, and it succeeds (once if hard refresh the page to get new JS).

https://github.com/jupyterhub/binderhub/assets/30430/925ff703-92e7-4f15-bd31-6a5cb565c6cc

Thank you for working on this @manics and thanks for the review, @consideRatio. I am hoping that my cleanup + automated testing efforts will allow us to build more of these things without breaking.