vimeo / graph-explorer

A graphite dashboard powered by structured metrics
http://vimeo.github.io/graph-explorer/
Apache License 2.0

Access-Control-Allow-Origin Error #41

Closed Ortimus closed 10 years ago

Ortimus commented 11 years ago

Access-Control-Allow-Origin Error:

We keep getting CORS errors when graph-explorer tries to communicate with the graphite server:

XMLHttpRequest cannot load http://:9080/render/. Origin http://:8080 is not allowed by Access-Control-Allow-Origin.

So we have graph-explorer running and can see the metrics but the graphs don't render.

graph-explorer and graphite are on the same machine on different ports.

We have added the following to /etc/apache2/sites-available/graphite:

Header set Access-Control-Allow-Origin "*"
Header set Access-Control-Allow-Methods "GET, OPTIONS, POST, HEAD, PUT, DELETE"
Header set Access-Control-Allow-Headers "X-Requested-With, X-Requested-By, Origin, Authorization, Accept, Content-Type, Pragma"

We have tried putting it directly in the VirtualHost section, in <Location "/">, and in <Location "/render">, and nothing has worked. We are not very experienced in configuring Apache, so we must be missing something here and would appreciate any help.
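One way to check whether the headers actually reach the client, independent of the browser, is to send the same kind of request from a script and inspect the response. The following is a minimal sketch, not part of graph-explorer: the function names `allows_origin` and `check_render` are made up for illustration, and it assumes a plain form-encoded POST to /render/ is close enough to what the browser sends.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def allows_origin(headers, origin):
    """Return True if the response headers permit `origin` under CORS."""
    allowed = headers.get("Access-Control-Allow-Origin")
    return allowed in ("*", origin)


def check_render(base_url, origin):
    """POST to <base_url>/render/ with an Origin header and report the verdict."""
    req = Request(
        base_url.rstrip("/") + "/render/",
        data=b"format=json",         # supplying a body makes urllib send a POST
        headers={"Origin": origin},  # what a browser on `origin` would send
    )
    try:
        with urlopen(req, timeout=10) as resp:
            headers = resp.headers
    except HTTPError as err:
        headers = err.headers        # error responses carry headers too
    return allows_origin(headers, origin)
```

Calling `check_render("http://graphite.example:9080", "http://ge.example:8080")` (both hostnames are placeholders) returns False when the server that answers the request does not send a matching Access-Control-Allow-Origin header. Running it against each port the server listens on shows which virtual host is really serving /render/.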

Dieterbe commented 11 years ago

why POST, HEAD, PUT, DELETE? and that headers list seems fairly extensive too.

did you try just using the directives from the readme? (putting them in the VirtualHost section)

Ortimus commented 11 years ago

We did use the directives from the readme:

Header set Access-Control-Allow-Origin "*"
Header set Access-Control-Allow-Methods "GET, OPTIONS"
Header set Access-Control-Allow-Headers "origin, authorization, accept"

That didn't work, so we tried adding more options, because the Chrome JS console showed that the calls that were hanging are POSTs:

render/   POST   (canceled)    Pending    jquery-1.7.2.js:8240 Script

We added the other methods / headers based on some other suggestions we received.

This is our /etc/apache2/sites-available/graphite file:

WSGISocketPrefix /var/run/apache2/wsgi

<VirtualHost *:9080>
        ServerName graphite
        DocumentRoot "/opt/graphite/webapp"
        ErrorLog /var/log/apache2/graphite.error.log
        CustomLog /var/log/apache2/graphite.access.log common
        # LogLevel info

        Header set Access-Control-Allow-Origin "*"
        Header set Access-Control-Allow-Methods "GET, OPTIONS, POST, HEAD, PUT, DELETE"
        Header set Access-Control-Allow-Headers "X-Requested-With, X-Requested-By,  Origin, Authorization, Accept, Content-Type, Pragma"

        # Header set Access-Control-Allow-Origin "*"
        # Header set Access-Control-Allow-Methods "GET, OPTIONS"
        # Header set Access-Control-Allow-Headers "Origin, Authorization, Accept"

        # I've found that an equal number of processes & threads tends
        # to show the best performance for Graphite (ymmv).
        WSGIDaemonProcess graphite processes=5 threads=5 display-name='%{GROUP}' inactivity-timeout=120
        WSGIProcessGroup graphite
        WSGIApplicationGroup %{GLOBAL}
        WSGIImportScript /opt/graphite/conf/graphite.wsgi process-group=graphite application-group=%{GLOBAL}

        # XXX You will need to create this file! There is a graphite.wsgi.example
        # file in this directory that you can safely use, just copy it to graphite.wsgi
        WSGIScriptAlias / /opt/graphite/conf/graphite.wsgi

        Alias /content/ /opt/graphite/webapp/content/
        <Location "/content/">
                SetHandler None
        </Location>

        # XXX In order for the django admin site media to work you
        # must change @DJANGO_ROOT@ to be the path to your django
        # installation, which is probably something like:
        # /usr/lib/python2.6/site-packages/django
        Alias /media/ "@DJANGO_ROOT@/contrib/admin/media/"
        <Location "/media/">
                SetHandler None
        </Location>

        # The graphite.wsgi file has to be accessible by apache. It won't
        # be visible to clients because of the DocumentRoot though.
        <Directory /opt/graphite/conf/>
                Order deny,allow
                Allow from all
        </Directory>

</VirtualHost>

And our graphite.wsgi is:

import os, sys
sys.path.append('/opt/graphite/webapp')
os.environ['DJANGO_SETTINGS_MODULE'] = 'graphite.settings'

import django.core.handlers.wsgi

application = django.core.handlers.wsgi.WSGIHandler()

# READ THIS
# Initializing the search index can be very expensive, please include
# the WSGIImportScript directive pointing to this script in your vhost
# config to ensure the index is preloaded before any requests are handed
# to the process.
from graphite.logger import log
log.info("graphite.wsgi - pid %d - reloading search index" % os.getpid())
import graphite.metrics.search

Dieterbe commented 11 years ago

ah yes, GE def. does POST requests. i'll update the README. this is about as far as my knowledge goes though, sorry.

Ortimus commented 11 years ago

Np. We'll try to figure out what apache is doing in combination with Wsgi/Django - at least for POSTs.

Note that when we use a rickshaw/d3 based tool like giraffe (https://github.com/kenhub/giraffe) we don't have the CORS issue and giraffe can read data from our graphite server. But that's using GET/jsonp, which suggests that POST/json on the GE side is what triggers this. Hopefully someone can duplicate/fix the issue, as GE looks like a really promising tool.

Dieterbe commented 11 years ago

the short answer is GE (actually timeserieswidget) needs to do POST requests and jsonp doesn't work with that. the longer more technical explanation is here https://github.com/vimeo/timeserieswidget/commit/44f2aef50b9d4a8ad23eab3d9129b898068a4ad8 (also jquery is notorious for having a memory leak with jsonp). using json however does mean we have the annoying CORS stuff.

Ortimus commented 11 years ago

Thanks for the details. We were adding the cross-origin header information to /etc/apache2/graphite instead of /etc/apache2/default. One of my colleagues figured this out. Now we are up and running with graph-explorer.

Dieterbe commented 11 years ago

so could you share what the actual problem was?

zehome commented 11 years ago

What do you think about adding a reverse proxy inside GE ? that's very easy, very small, more secure and easier for the user (plug & play).

I've done something like that using httplib2 in django for my fork; see https://gist.github.com/zehome/6213872

Ortimus commented 11 years ago

Our graphite server configuration file in apache (/etc/apache2/graphite) has

<VirtualHost *:9080>  
...

Initially, we configured graph-explorer's config.py to point to http://:9080 because that's our graphite web url. But it looked like the POSTs were going directly to the Python server, bypassing Apache's cross-origin headers.

Then we added the cross origin headers in /etc/apache2/default. Note that /etc/apache2/default defines

<VirtualHost *:80> 
...

So we had to re-configure graph-explorer's config.py to point to http://:80

With this setup GE is working (so are other tools like cubism that use POST/json).

Dieterbe commented 11 years ago

> What do you think about adding a reverse proxy inside GE ? that's very easy, very small, more secure and easier for the user (plug & play).

does this guarantee that it will work, even if GE is installed on a different machine than graphite? the servers just need to be able to communicate? it seems CORS is mostly a clientside/javascript thing, so that sounds like a pretty good option. I guess the network overhead should be negligible

zehome commented 11 years ago

Yes, the server will do the connection to graphite. Basically graphite can be listening on a private network (like 127.0.0.1 if hosted on the same machine), not accessible, but GE can be wide open.

It would avoid the need for setting up those headers, which can be a pain in the butt when you use something like gunicorn to serve graphite webservices. (Like hacking/writing a custom middleware inside graphite)

I think it's also prettier to connect only to one server client side.
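For the gunicorn case mentioned above, where there is no Apache in front to add headers, a custom middleware could look roughly like the sketch below. It is only an illustration of the idea, not code from graphite or GE: the class name `CORSMiddleware` is hypothetical, and the method and header lists are assumptions based on the README directives plus POST.

```python
class CORSMiddleware(object):
    """Wrap a WSGI app and add CORS headers to every response it produces."""

    def __init__(self, app, allow_origin="*"):
        self.app = app
        # Assumed values: the README's directives, with POST added for GE.
        self.cors_headers = [
            ("Access-Control-Allow-Origin", allow_origin),
            ("Access-Control-Allow-Methods", "GET, POST, OPTIONS"),
            ("Access-Control-Allow-Headers", "origin, authorization, accept"),
        ]
        self.cors_names = {name.lower() for name, _ in self.cors_headers}

    def __call__(self, environ, start_response):
        def cors_start_response(status, headers, exc_info=None):
            # Drop any CORS headers the wrapped app set, then append ours,
            # so each header appears exactly once.
            kept = [(k, v) for k, v in headers if k.lower() not in self.cors_names]
            return start_response(status, kept + self.cors_headers, exc_info)

        return self.app(environ, cors_start_response)
```

With a graphite.wsgi like the one posted earlier in this thread, the wiring would presumably be a single extra line after `application` is created, `application = CORSMiddleware(application)`. That placement is an assumption and has not been tested against graphite.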

Dieterbe commented 11 years ago

> I think it's also prettier to connect only to one server client side.

agreed. that said I want to avoid the added latency. I would consider it if we can get it working so the proxy streams the connection in realtime instead of buffering it first. see also https://github.com/obfuscurity/descartes/pull/107

zehome commented 11 years ago

that's pretty easy to do, but I don't think it will improve anything:

Or maybe I've forgotten something?

zehome commented 11 years ago

Example code which does this:

import sys
import urllib2  # Python 2 module

url = """http://proof.ovh.net/files/100Mio.dat"""

# Read the response in fixed-size blocks instead of buffering it whole.
BLOCKSIZE = 1024 * 32
f = urllib2.urlopen(url)
while True:
    data = f.read(BLOCKSIZE)
    if not data:
        break
    # Print one dot per block received.
    sys.stdout.write(".")
    sys.stdout.flush()
print ""
print "End of transfer."

Dieterbe commented 11 years ago

sure @zehome tswidget needs the entire response before it starts processing, but if the proxy needs to buffer the entire request, then this is still a slowdown (latency hit), actually a proxy is probably a latency hit even when it just streams the request through. but if it buffers it gets even worse. the example you gave streams through write? i'll have to test it and see

zehome commented 11 years ago

read streaming. You can write streaming, just replace sys.stdout.write(".") with sys.stdout.write(data) but the real problem is probably bottle. Many frameworks are not capable of serving streaming requests.
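To make the streaming idea concrete, here is a rough Python 3 sketch of a proxy endpoint written as a plain WSGI app that relays the upstream response block by block. It is not the proxy that ended up in GE and not the gist linked above: `stream_blocks` and `make_proxy` are hypothetical names, error handling is left out, and whether the blocks really go out unbuffered depends on the WSGI server in front of it.

```python
from urllib.request import Request, urlopen

BLOCKSIZE = 32 * 1024


def stream_blocks(fileobj, blocksize=BLOCKSIZE):
    """Yield the contents of `fileobj` block by block, closing it at the end."""
    try:
        while True:
            data = fileobj.read(blocksize)
            if not data:
                break
            yield data
    finally:
        fileobj.close()


def make_proxy(graphite_url):
    """Return a WSGI app that relays each request to `graphite_url`."""
    def proxy(environ, start_response):
        # Rebuild the upstream URL from the incoming path and query string.
        target = graphite_url.rstrip("/") + environ.get("PATH_INFO", "")
        if environ.get("QUERY_STRING"):
            target += "?" + environ["QUERY_STRING"]
        # Forward the request body, if any; a body makes urllib send a POST.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length) if length else None
        headers = {}
        if environ.get("CONTENT_TYPE"):
            headers["Content-Type"] = environ["CONTENT_TYPE"]
        upstream = urlopen(Request(target, data=body, headers=headers))
        content_type = upstream.headers.get("Content-Type", "application/octet-stream")
        start_response("200 OK", [("Content-Type", content_type)])
        # A generator lets a non-buffering WSGI server send blocks as they arrive.
        return stream_blocks(upstream)
    return proxy
```

Because the browser then talks only to the server hosting the proxy, no CORS headers are needed on graphite itself. The trade-off raised in this thread still applies: every response makes an extra hop, streamed or not.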

Dieterbe commented 10 years ago

I recently added a proxy endpoint a8f46a743bccd075caddcef824f73060b5f78a32, but I just pushed a fix that makes it really clear if there's any CORS errors, they will show up nicely in the UI. see ba4d88da23f68c74737d1eaf4013f155c791d85b

since setting up cors is fairly easy, i prefer that option, a proxy would add latency. so i'm gonna keep it like this for now