jupyterhub / mybinder.org-deploy

Deployment config files for mybinder.org
https://mybinder-sre.readthedocs.io/en/latest/index.html
BSD 3-Clause "New" or "Revised" License
76 stars 75 forks source link

CORS issue causing event stream error on gesis #1344

Closed minrk closed 4 years ago

minrk commented 4 years ago

Builds on gesis are failing with "Failed to connect to event stream" due to a CORS issue. The request to open the event stream is failing with:

[Error] Failed to load resource: Cross-origin redirection to https://notebooks.gesis.org/binder/build/gh/TomAugspurger/idp-results/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F denied by Cross-Origin Resource Sharing policy: Cannot use wildcard in Access-Control-Allow-Origin when credentials flag is true. (master, line 0) [Error] EventSource cannot load https://gesis.mybinder.org/build/gh/TomAugspurger/idp-results/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F due to access control checks. [Error] Failed to construct event stream – Event {isTrusted: true, type: "error", target: EventSource, …} Event {isTrusted: true, type: "error", target: EventSource, currentTarget: EventSource, eventPhase: 2, …}Event (anonymous function) (bundle.js:21152)

I'm not sure what setting is different on gesis that causes this. We should probably disable gesis until this works out, since all builds will appear to fail (the builds and launches do not appear to actually be affected, only the event stream)

bitnik commented 4 years ago

hi @minrk , can you help me here? I am not sure if I really understand the error. How can I produce it? So I can investigate what is wrong in our deployment.

minrk commented 4 years ago

I'm trying to understand the root cause myself. Binder deployments have Access-Control-Allow-Origin: * to allow anyone to connect. However, there is a conflict between this option and Access-Control-Allow-Credentials: true and/or sending requests withCredentials.

It is There should be tw

minrk commented 4 years ago

What I don't understand is what is different about gesis that is triggering this, why the credentials flag would be different.

I think the best way to test this would be to add gesis to the staging federation so that you can try builds there. It's hard to reproduce when gesis is temporarily out of rotation.

manics commented 4 years ago

@bitnik Did anything change with your authenticated binderhub that might have inadvertently affected the public one?

minrk commented 4 years ago

You can reproduce this in the js console in a browser with:

evt = new EventSource("https://mybinder.org/build/gh/TomAugspurger/idp-results/master", {withCredentials: true})
evt.close()

what's weird is that we don't pass withCredentials: true, and without specifying that, I can't reproduce the issue from the console, even if I specify the exact URL that doesn't work from the same page that doesn't work

betatim commented 4 years ago

One thing that is different for Gesis compared to our other clusters is that there is a nginx which performs a redirect (30x) on requests we make

minrk commented 4 years ago

Ah, that has a very high probability of being it

minrk commented 4 years ago

since CORS queries tend to be made with OPTIONS requests, and may not be redirected the same

minrk commented 4 years ago

what URLs get those redirects?

(edit: apologies for using GitHub comments like a chat app)

betatim commented 4 years ago

https://github.com/gesiscss/orc/blob/master/load_balancer/sites-available/gesis_mybinder is the nginx config running "at" gesis.mybinder.org

betatim commented 4 years ago

Maybe we need to add OPTIONS to https://github.com/gesiscss/orc/blob/f657f2213e5f4d90762ed5311cfb304f280208f0/load_balancer/sites-available/gesis_mybinder#L29?

minrk commented 4 years ago

Yeah, worth a try. I'm not sure if OPTIONS can follow redirects, we may need to write the OPTIONS reply ourselves (it should just set the CORS headers and that's it)

bitnik commented 4 years ago

hey, I am sorry that I couldn't really follow you in this conversation. I even still cant reproduce it myself. According to what you wrote I understand that builds are failing at GESIS when it is done through federation, but I don't experience this neither.

I am adding now OPTIONS as Tim suggested.

arnim commented 4 years ago

Hi, we have just compared @minrk s example (TomAugspurger/idp-results => GKE):

evt = new EventSource("https://staging.mybinder.org/build/gh/TomAugspurger/idp-results/master", {withCredentials: true})
EventSource {url: "https://staging.mybinder.org/build/gh/TomAugspurger/idp-results/master", withCredentials: true, readyState: 0, onopen: null, onmessage: null, …}url: "https://staging.mybinder.org/build/gh/TomAugspurger/idp-results/master"withCredentials: truereadyState: 2onopen: nullonmessage: nullonerror: null__proto__: EventSource
(index):1 Access to resource at 'https://gke.staging.mybinder.org/build/gh/TomAugspurger/idp-results/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F' (redirected from 'https://staging.mybinder.org/build/gh/TomAugspurger/idp-results/master') from origin 'https://staging.mybinder.org' has been blocked by CORS policy: The value of the 'Access-Control-Allow-Origin' header in the response must not be the wildcard '*' when the request's credentials mode is 'include'.
gke.staging.mybinder.org/build/gh/TomAugspurger/idp-results/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F:1 GET https://gke.staging.mybinder.org/build/gh/TomAugspurger/idp-results/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F net::ERR_FAILED

and bitnik/simple-binder-repo-2 => GESIS

evt = new EventSource("https://staging.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master", {withCredentials: true})
EventSource {url: "https://staging.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master", withCredentials: true, readyState: 0, onopen: null, onmessage: null, …}
(index):1 Access to resource at 'https://gesis.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F' (redirected from 'https://staging.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master') from origin 'https://staging.mybinder.org' has been blocked by CORS policy: The value of the 'Access-Control-Allow-Origin' header in the response must not be the wildcard '*' when the request's credentials mode is 'include'.
gesis.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F:1 GET https://gesis.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F net::ERR_FAILED
evt
EventSource {url: "https://staging.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master", withCredentials: true, readyState: 2, onopen: null, onmessage: null, …}

on the first look the seem to behave the same.

betatim commented 4 years ago

Not sure I understand what is going wrong yet. As I was poking around things I noticed that we have https://gke.staging.mybinder.org/build/gh/binder-examples/requirements/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F. Does someone know why the binder_launch_host query parameter gets added (and where)?

manics commented 4 years ago

Sounds like it's related to this feature added to support Gesis:

bitnik commented 4 years ago

https://github.com/jupyterhub/mybinder.org-deploy/pull/1307:

https://github.com/jupyterhub/mybinder.org-deploy/blob/9cba7bf6f282db3ed2939bd3a47770f3fc5720b5/images/federation-redirect/app.py#L195

It is used to set BINDER_LAUNCH_HOST env variable correctly. If user is redirected from federation BINDER_LAUNCH_HOST is "https://mybinder.org/" or if user is launching a repo directly through GESIS Binder, it is "https://notebooks.gesis.org/binder/".

This env variable is used for the value of "Copy Binder Link" button. And also in https://github.com/manics/jupyter-offlinenotebook extention. That's all I remember.

bitnik commented 4 years ago

Not sure I understand what is going wrong yet.

@betatim does it mean that you can reproduce the error?

betatim commented 4 years ago

What I meant is that I didn't even get close to trying to reproduce it because I ended up with other questions first. I do get the error Min reported if I set withCredentials to true.

I will try again today with a locally running federation redirector which only contains Gesis. Otherwise trying to find a repo that gets redirected to the place you want is a bit tedious :)

Another thing I am unsure about: does the result you get from typing things into the console depend on which website you opened the console for? For example "open console on mybinder.org" vs "open console on gesis.mybinder.org" vs "..."?

minrk commented 4 years ago

FWIW, I'm able to reproduce this currently at https://staging.mybinder.org/v2/gh/binderhub-ci-repos/requirements/master which is currently binned to gesis. I can see it with Chrome and Safari, but not Firefox. Safari gives informative error messages about withCredentials, while Chrome only says "Failed to construct event stream".

When I run the federation-redirector on its own locally (on localhost), there do not appear to be issues. The main difference I can think of is that it's on http not https.

minrk commented 4 years ago

It doesn't make sense to set the binder_launch_host param on all redirected URLs (mostly build URLs), so I'm going to try reverting that part of #1307 to see if it helps.

betatim commented 4 years ago

On Firefox (74.0b4) and Chrome (79.0.3945.130 (Official Build) (64-bit)) I don't see any errors for https://staging.mybinder.org/v2/gh/betatim/first-binder/master.

minrk commented 4 years ago

@betatim I didn't see errors on my very first test with Chrome 79 this morning, but after restarting to update to 80 I did!

bitnik commented 4 years ago

On firefox (68.5.0esr (64-bit)) and chrome (Version 80.0.3987.106 (Official Build) (64-bit)) I don't see any errors neither. Does error occur when only building or also when launching?

minrk commented 4 years ago

Build or launch doesn't matter. Attempting to establish the eventstream connection is failing, so even if it's an invalid ref, it fails.

minrk commented 4 years ago

Removing the url param didn't appear to change things, nor did reverting the redirect to 302.

I am going to note that gesis is the only one that is indeed outside the mybinder.org domain, so maybe that's the difference. This also means it's the only one with two redirects: mybinder.org -> gesis.mybinder.org (inside gke) -> notebooks.gesis.org/binder (happens at gesis). It is always that second redirect that fails, not the first, so I guess the issue resides in the nginx redirect on the gesis side? Or maybe it's purely in the browser policy applying to this one and only truly cross-domain request?

Maybe withCredentials is implicitly true because the initial request is not cross-origin and the default credentials mode is same-origin? I'm flailing a bit here.

betatim commented 4 years ago

Now on Chrome 80 https://staging.mybinder.org/v2/gh/betatim/first-binder/master still works for me after restarting.

OVH also has its own domain .mybinder.ovh, but I think the difference is that that only gets used when we launch the pod?

minrk commented 4 years ago

Based on the error message, this appears to be an issue with the CORS headers on the redirect handler. @bitnik could you change the header on the /build/ redirector to:

add_header "Access-Control-Allow-Origin"  "https://mybinder.org";
add_header "Access-Control-Allow-Credentials"  "false";

I don't think this should affect the real service, just the redirection. But maybe it will affect things like spaCy and thebelab...

minrk commented 4 years ago

@betatim yeah, the redirect to the Hub is not an issue. Turing Hub is also another domain. Redirect for a new page isn't a CORS request like the event stream is.

bitnik commented 4 years ago

@minrk done.

manics commented 4 years ago

There's a warning of breaking changes in Chrome 80 relating to the SameSite attribute https://www.theregister.co.uk/2020/01/30/google_chrome_80_cookies/

manics commented 4 years ago

Note at the end of the article:

Though Chrome 80 is still slated to ship on February 4, 2020, Google now says, "The SameSite-by-default and SameSite=None-requires-Secure behaviors will begin rolling out to Chrome 80 Stable for an initial limited population starting the week of February 17, 2020."

sounds like not all Chrome 80 users will get the new feature immediately, this might explain why it's working for some.

minrk commented 4 years ago

@bitnik silly me, can you make it staging.mybinder.org for now while we are testing there?

bitnik commented 4 years ago

@minrk I also just realized it and changed it

minrk commented 4 years ago

Hunch while looking at the steps: we set the host cookie during the redirect process. I wonder if that has an effect on why it's behaving like the credentials flag is true.

minrk commented 4 years ago

ok, getting new and reasonable(ish) error:

Cross-origin redirection to https://notebooks.gesis.org/binder/build/gh/binderhub-ci-repos/requirements/whatevs denied by Cross-Origin Resource Sharing policy: Credentials flag is true, but Access-Control-Allow-Credentials is not "true".

can you now toggle the Allow-Credentials to "true"?

minrk commented 4 years ago

ok, it works now. Can you finally switch back to Access-Control-Allow-Origin: *, leaving credentials true?

minrk commented 4 years ago

okay, that's back to denied by Cross-Origin Resource Sharing policy: Cannot use wildcard in Access-Control-Allow-Origin when credentials flag is true.

Try with this:

add_header Access-Control-Allow-Origin $http_origin;
add_header Vary Origin;

This is equivalent to Access-Control-Allow-Origin *, but evaluated with a single match per request. Should get the permissiveness we had before, but satisfying the policy issues per-request.

minrk commented 4 years ago

Still a mystery: why are some browsers sometimes sending requests with the credentials flag set? My guess is that it's a cross-origin redirect and the initial request is sent same-origin. Maybe it's relevant that we are setting a header in the middle? The exact circumstances are unclear to me.

minrk commented 4 years ago

It's working on staging, thanks @bitnik! Restoring gesis to rotation in #1356

minrk commented 4 years ago

Not sure about closing this issue or not: we still have a mystery of why requests are being made with credentials to some endpoints and why that's offensive to some browsers (Safari) and other browsers some of the time (Chrome) and not others (Firefox). But we've at least worked out how to keep things working.

bitnik commented 4 years ago

thanks a lot @minrk for fixing this!

betatim commented 4 years ago

A mega debugging thread! I'd suggest closing this and opening a new issue which starts with a summary like this and points here.

meeseeksmachine commented 4 years ago

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/access-control-allow-origin-header-is-missing-on-mybinder-org/3367/5

betatim commented 4 years ago

I just came across https://mybinder.org/v2/gh/geomar-tm/python-intro-201804/v2.0.0?urlpath=lab which launches on gesis (at the moment) and reliably and instantly says "can't connect to event stream" for me with Firefox 74.0b4 (64-bit). This is what I see in the console:

Screenshot 2020-02-19 at 13 32 52

A few minutes later and on Chrome it "just works"

Though it now gets stuck on this:

Screenshot 2020-02-19 at 13 31 11

Neither "clear workspace" no "keep waiting" do anything other than take me back to the error.

betatim commented 4 years ago

The weird thing is that the launch via Chrome works right now, but even 5min later on Firefox it continues to be broken.

betatim commented 4 years ago

For even more fun: https://mybinder.org/v2/gh/geomar-tm/python-intro-201804/v2.0.0 doesn't work but https://mybinder.org/v2/gh/geomar-tm/python-intro-201804/master works.

The headers look right in both cases.

meeseeksmachine commented 4 years ago

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/access-control-allow-origin-header-is-missing-on-mybinder-org/3367/9

betatim commented 4 years ago

https://mybinder.org/v2/gh/ogrisel/notebooks/master?filepath=sklearn_demos/ames_housing.ipynb is another link that consistently doesn't work for me :-/

consideRatio commented 4 years ago

I've never really digged down into CORS stuff, so I decided to read up and learn HTTP matters and CORS matters thoroughly. I'm certainly not done yet, but I figured I'd link two resources of relevance to me already!

  1. A longer thorough but readable text about HTTP in general: https://developer.mozilla.org/en-US/docs/Web/HTTP
  2. Nice video about CORS specifically: https://www.youtube.com/watch?v=Ka8vG5miErk

I've already started to become aware of a lot of complexity, such as requests from javascript sometimes goes directly to the server, but sometimes first does a preflight request to verify if the actual request is of a kind that will be considered valid, and that the web-server needs to also respond positively to that. Ah... I feel happy to set out to learn about this thoroughly instead of remaining confused whenever this comes up! :D