Closed minrk closed 4 years ago
hi @minrk , can you help me here? I am not sure if I really understand the error. How can I produce it? So I can investigate what is wrong in our deployment.
I'm trying to understand the root cause myself. Binder deployments have Access-Control-Allow-Origin: *
to allow anyone to connect. However, there is a conflict between this option and Access-Control-Allow-Credentials: true
and/or sending requests withCredentials
.
It is There should be tw
What I don't understand is what is different about gesis that is triggering this, why the credentials flag would be different.
I think the best way to test this would be to add gesis to the staging federation so that you can try builds there. It's hard to reproduce when gesis is temporarily out of rotation.
@bitnik Did anything change with your authenticated binderhub that might have inadvertently affected the public one?
You can reproduce this in the js console in a browser with:
evt = new EventSource("https://mybinder.org/build/gh/TomAugspurger/idp-results/master", {withCredentials: true})
evt.close()
what's weird is that we don't pass withCredentials: true
, and without specifying that, I can't reproduce the issue from the console, even if I specify the exact URL that doesn't work from the same page that doesn't work
One thing that is different for Gesis compared to our other clusters is that there is a nginx which performs a redirect (30x) on requests we make
Ah, that has a very high probability of being it
since CORS queries tend to be made with OPTIONS requests, and may not be redirected the same
what URLs get those redirects?
(edit: apologies for using GitHub comments like a chat app)
https://github.com/gesiscss/orc/blob/master/load_balancer/sites-available/gesis_mybinder is the nginx config running "at" gesis.mybinder.org
Maybe we need to add OPTIONS to https://github.com/gesiscss/orc/blob/f657f2213e5f4d90762ed5311cfb304f280208f0/load_balancer/sites-available/gesis_mybinder#L29?
Yeah, worth a try. I'm not sure if OPTIONS can follow redirects, we may need to write the OPTIONS reply ourselves (it should just set the CORS headers and that's it)
hey, I am sorry that I couldn't really follow you in this conversation. I even still cant reproduce it myself. According to what you wrote I understand that builds are failing at GESIS when it is done through federation, but I don't experience this neither.
I am adding now OPTIONS as Tim suggested.
Hi, we have just compared @minrk s example (TomAugspurger/idp-results => GKE):
evt = new EventSource("https://staging.mybinder.org/build/gh/TomAugspurger/idp-results/master", {withCredentials: true})
EventSource {url: "https://staging.mybinder.org/build/gh/TomAugspurger/idp-results/master", withCredentials: true, readyState: 0, onopen: null, onmessage: null, …}url: "https://staging.mybinder.org/build/gh/TomAugspurger/idp-results/master"withCredentials: truereadyState: 2onopen: nullonmessage: nullonerror: null__proto__: EventSource
(index):1 Access to resource at 'https://gke.staging.mybinder.org/build/gh/TomAugspurger/idp-results/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F' (redirected from 'https://staging.mybinder.org/build/gh/TomAugspurger/idp-results/master') from origin 'https://staging.mybinder.org' has been blocked by CORS policy: The value of the 'Access-Control-Allow-Origin' header in the response must not be the wildcard '*' when the request's credentials mode is 'include'.
gke.staging.mybinder.org/build/gh/TomAugspurger/idp-results/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F:1 GET https://gke.staging.mybinder.org/build/gh/TomAugspurger/idp-results/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F net::ERR_FAILED
and bitnik/simple-binder-repo-2 => GESIS
evt = new EventSource("https://staging.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master", {withCredentials: true})
EventSource {url: "https://staging.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master", withCredentials: true, readyState: 0, onopen: null, onmessage: null, …}
(index):1 Access to resource at 'https://gesis.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F' (redirected from 'https://staging.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master') from origin 'https://staging.mybinder.org' has been blocked by CORS policy: The value of the 'Access-Control-Allow-Origin' header in the response must not be the wildcard '*' when the request's credentials mode is 'include'.
gesis.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F:1 GET https://gesis.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F net::ERR_FAILED
evt
EventSource {url: "https://staging.mybinder.org/build/gh/bitnik/simple-binder-repo-2/master", withCredentials: true, readyState: 2, onopen: null, onmessage: null, …}
on the first look the seem to behave the same.
Not sure I understand what is going wrong yet. As I was poking around things I noticed that we have https://gke.staging.mybinder.org/build/gh/binder-examples/requirements/master?binder_launch_host=https%3A%2F%2Fmybinder.org%2F
. Does someone know why the binder_launch_host
query parameter gets added (and where)?
Sounds like it's related to this feature added to support Gesis:
https://github.com/jupyterhub/mybinder.org-deploy/pull/1307:
It is used to set BINDER_LAUNCH_HOST
env variable correctly. If user is redirected from federation BINDER_LAUNCH_HOST
is "https://mybinder.org/" or if user is launching a repo directly through GESIS Binder, it is "https://notebooks.gesis.org/binder/".
This env variable is used for the value of "Copy Binder Link" button. And also in https://github.com/manics/jupyter-offlinenotebook extention. That's all I remember.
Not sure I understand what is going wrong yet.
@betatim does it mean that you can reproduce the error?
What I meant is that I didn't even get close to trying to reproduce it because I ended up with other questions first. I do get the error Min reported if I set withCredentials
to true.
I will try again today with a locally running federation redirector which only contains Gesis. Otherwise trying to find a repo that gets redirected to the place you want is a bit tedious :)
Another thing I am unsure about: does the result you get from typing things into the console depend on which website you opened the console for? For example "open console on mybinder.org" vs "open console on gesis.mybinder.org" vs "..."?
FWIW, I'm able to reproduce this currently at https://staging.mybinder.org/v2/gh/binderhub-ci-repos/requirements/master which is currently binned to gesis. I can see it with Chrome and Safari, but not Firefox. Safari gives informative error messages about withCredentials, while Chrome only says "Failed to construct event stream".
When I run the federation-redirector on its own locally (on localhost), there do not appear to be issues. The main difference I can think of is that it's on http not https.
It doesn't make sense to set the binder_launch_host param on all redirected URLs (mostly build URLs), so I'm going to try reverting that part of #1307 to see if it helps.
On Firefox (74.0b4) and Chrome (79.0.3945.130 (Official Build) (64-bit)) I don't see any errors for https://staging.mybinder.org/v2/gh/betatim/first-binder/master.
@betatim I didn't see errors on my very first test with Chrome 79 this morning, but after restarting to update to 80 I did!
On firefox (68.5.0esr (64-bit)) and chrome (Version 80.0.3987.106 (Official Build) (64-bit)) I don't see any errors neither. Does error occur when only building or also when launching?
Build or launch doesn't matter. Attempting to establish the eventstream connection is failing, so even if it's an invalid ref, it fails.
Removing the url param didn't appear to change things, nor did reverting the redirect to 302.
I am going to note that gesis is the only one that is indeed outside the mybinder.org domain, so maybe that's the difference. This also means it's the only one with two redirects: mybinder.org -> gesis.mybinder.org (inside gke) -> notebooks.gesis.org/binder (happens at gesis). It is always that second redirect that fails, not the first, so I guess the issue resides in the nginx redirect on the gesis side? Or maybe it's purely in the browser policy applying to this one and only truly cross-domain request?
Maybe withCredentials
is implicitly true because the initial request is not cross-origin and the default credentials mode is same-origin? I'm flailing a bit here.
Now on Chrome 80 https://staging.mybinder.org/v2/gh/betatim/first-binder/master still works for me after restarting.
OVH also has its own domain .mybinder.ovh
, but I think the difference is that that only gets used when we launch the pod?
Based on the error message, this appears to be an issue with the CORS headers on the redirect handler. @bitnik could you change the header on the /build/
redirector to:
add_header "Access-Control-Allow-Origin" "https://mybinder.org";
add_header "Access-Control-Allow-Credentials" "false";
I don't think this should affect the real service, just the redirection. But maybe it will affect things like spaCy and thebelab...
@betatim yeah, the redirect to the Hub is not an issue. Turing Hub is also another domain. Redirect for a new page isn't a CORS request like the event stream is.
@minrk done.
There's a warning of breaking changes in Chrome 80 relating to the SameSite
attribute https://www.theregister.co.uk/2020/01/30/google_chrome_80_cookies/
Note at the end of the article:
Though Chrome 80 is still slated to ship on February 4, 2020, Google now says, "The SameSite-by-default and SameSite=None-requires-Secure behaviors will begin rolling out to Chrome 80 Stable for an initial limited population starting the week of February 17, 2020."
sounds like not all Chrome 80 users will get the new feature immediately, this might explain why it's working for some.
@bitnik silly me, can you make it staging.mybinder.org for now while we are testing there?
@minrk I also just realized it and changed it
Hunch while looking at the steps: we set the host
cookie during the redirect process. I wonder if that has an effect on why it's behaving like the credentials flag is true.
ok, getting new and reasonable(ish) error:
Cross-origin redirection to https://notebooks.gesis.org/binder/build/gh/binderhub-ci-repos/requirements/whatevs denied by Cross-Origin Resource Sharing policy: Credentials flag is true, but Access-Control-Allow-Credentials is not "true".
can you now toggle the Allow-Credentials to "true"?
ok, it works now. Can you finally switch back to Access-Control-Allow-Origin: *
, leaving credentials true?
okay, that's back to denied by Cross-Origin Resource Sharing policy: Cannot use wildcard in Access-Control-Allow-Origin when credentials flag is true.
Try with this:
add_header Access-Control-Allow-Origin $http_origin;
add_header Vary Origin;
This is equivalent to Access-Control-Allow-Origin *
, but evaluated with a single match per request. Should get the permissiveness we had before, but satisfying the policy issues per-request.
Still a mystery: why are some browsers sometimes sending requests with the credentials flag set? My guess is that it's a cross-origin redirect and the initial request is sent same-origin. Maybe it's relevant that we are setting a header in the middle? The exact circumstances are unclear to me.
It's working on staging, thanks @bitnik! Restoring gesis to rotation in #1356
Not sure about closing this issue or not: we still have a mystery of why requests are being made with credentials to some endpoints and why that's offensive to some browsers (Safari) and other browsers some of the time (Chrome) and not others (Firefox). But we've at least worked out how to keep things working.
thanks a lot @minrk for fixing this!
A mega debugging thread! I'd suggest closing this and opening a new issue which starts with a summary like this and points here.
This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:
https://discourse.jupyter.org/t/access-control-allow-origin-header-is-missing-on-mybinder-org/3367/5
I just came across https://mybinder.org/v2/gh/geomar-tm/python-intro-201804/v2.0.0?urlpath=lab which launches on gesis (at the moment) and reliably and instantly says "can't connect to event stream" for me with Firefox 74.0b4 (64-bit). This is what I see in the console:
A few minutes later and on Chrome it "just works"
Though it now gets stuck on this:
Neither "clear workspace" no "keep waiting" do anything other than take me back to the error.
The weird thing is that the launch via Chrome works right now, but even 5min later on Firefox it continues to be broken.
For even more fun: https://mybinder.org/v2/gh/geomar-tm/python-intro-201804/v2.0.0 doesn't work but https://mybinder.org/v2/gh/geomar-tm/python-intro-201804/master works.
The headers look right in both cases.
This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:
https://discourse.jupyter.org/t/access-control-allow-origin-header-is-missing-on-mybinder-org/3367/9
https://mybinder.org/v2/gh/ogrisel/notebooks/master?filepath=sklearn_demos/ames_housing.ipynb is another link that consistently doesn't work for me :-/
I've never really digged down into CORS stuff, so I decided to read up and learn HTTP matters and CORS matters thoroughly. I'm certainly not done yet, but I figured I'd link two resources of relevance to me already!
I've already started to become aware of a lot of complexity, such as requests from javascript sometimes goes directly to the server, but sometimes first does a preflight request to verify if the actual request is of a kind that will be considered valid, and that the web-server needs to also respond positively to that. Ah... I feel happy to set out to learn about this thoroughly instead of remaining confused whenever this comes up! :D
Builds on gesis are failing with "Failed to connect to event stream" due to a CORS issue. The request to open the event stream is failing with:
I'm not sure what setting is different on gesis that causes this. We should probably disable gesis until this works out, since all builds will appear to fail (the builds and launches do not appear to actually be affected, only the event stream)