Open rwoodpecker opened 8 years ago
I use an ssh tunnel for this:
autossh -f -C -L 127.0.0.1:29000:127.0.0.1:29000 user@hostname -N
Remember to use GRAB_SITE_INTERFACE=127.0.0.1 as well when running gs-server to avoid leaking the dashboard to everyone.
If an SSH tunnel does not work for your use case, try looking at the nginx error log (and perhaps increasing verbosity?). Since grab-site 0.11, HTTP/1.0 does not work and you might need to configure nginx to use HTTP/1.1 for talking to the backend webserver (grab-site).
(2017 edit: I now use WireGuard instead of SSH tunnels and I recommend it.)
Hmm, actually, an nginx setup might be tricky because you would also have to reverse-proxy the WebSocket endpoint: http://nginx.org/en/docs/http/websocket.html - which I have not tested at all.
It shouldn't be too difficult to add SSL support to gs-server. It looks like the create_server call just needs to be passed an SSL context: https://docs.python.org/3/library/asyncio-eventloop.html#creating-listening-connections
This would not provide as much security as an SSH tunnel, though, because the SSL security model is broken without restrictive CA or certificate pinning.
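A minimal sketch of the idea above, assuming the listener is created through asyncio's loop.create_server. The helper names make_server_ssl_context and start_listener are hypothetical and are not part of grab-site:

```python
import asyncio
import ssl


def make_server_ssl_context(certfile, keyfile=None):
    """Build a server-side TLS context from PEM files (hypothetical helper)."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Raises if the certificate or key file is missing or invalid.
    context.load_cert_chain(certfile, keyfile)
    return context


async def start_listener(protocol_factory, host, port, ssl_context=None):
    """Listen on host:port; TLS is enabled only when ssl_context is given."""
    loop = asyncio.get_running_loop()
    return await loop.create_server(protocol_factory, host, port, ssl=ssl_context)
```

With ssl_context left as None this behaves like the existing plain TCP listener, so the change would be opt-in.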
I'm not really too concerned about the connection getting MITMed or anything like that with my own certificate (and I'm sure/hope most users wanting to use SSL would be aware of this limitation).
Direct support in gs-server would be lovely! I'll definitely look into the SSH tunnel though, thanks.
I've been looking at adding TLS support to grab-site. I believe that it would require 3 things to work:
1. gs-server serving over TLS after it is passed the appropriate certificate file.
2. A setting in wpull_hooks.py (probably an environment variable) so that the method connect_to_server uses an SSLContext on the actual connection, if detected.
3. The dashboard connecting over ws: or wss: depending on whether location.protocol is http: or https:, respectively.
That sounds about right.
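The client-side part of that plan could look roughly like the sketch below. GRAB_SITE_WS_TLS is a hypothetical variable name chosen for illustration, and client_tls_settings is not an existing grab-site function:

```python
import os
import ssl


def client_tls_settings(environ=None):
    """Return (scheme, ssl_context) for the connection to gs-server.

    GRAB_SITE_WS_TLS is a hypothetical variable name used only in this sketch.
    """
    environ = os.environ if environ is None else environ
    if environ.get("GRAB_SITE_WS_TLS") == "1":
        # The default context verifies the server certificate and hostname.
        return "wss", ssl.create_default_context()
    # No TLS requested: keep the existing plain WebSocket behaviour.
    return "ws", None
```

A self-signed certificate would additionally need to be loaded into the context (for example with load_verify_locations), since the default context only trusts the system CA store.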
Users should be able to have gs-server listen on both TCP and SSL since they might want to avoid doing SSL between grab-site instances and gs-server, or because they have old TCP-only crawls running.
The dashboard's ?host= might need to be renamed to ?ws= and take either a ws:// or wss:// URI.
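To illustrate what accepting a ws:// or wss:// URI would involve, here is a small sketch of the validation. The function name parse_ws_uri is hypothetical, and the real dashboard would do this in JavaScript rather than Python:

```python
from urllib.parse import urlsplit


def parse_ws_uri(value):
    """Split a ws:// or wss:// URI into (hostname, port, use_tls)."""
    parts = urlsplit(value)
    if parts.scheme not in ("ws", "wss") or not parts.hostname:
        raise ValueError("expected a ws:// or wss:// URI, got %r" % (value,))
    use_tls = parts.scheme == "wss"
    # Default ports for ws and wss are 80 and 443 (RFC 6455).
    port = parts.port or (443 if use_tls else 80)
    return parts.hostname, port, use_tls
```

For example, parse_ws_uri("wss://example.com:29001/") gives ("example.com", 29001, True), while a plain host name without a scheme is rejected.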
Python is not low enough level to listen for both SSL and non-SSL on the same port. Additionally, you cannot listen on both 0.0.0.0 and 127.0.0.1 on the same port at the same time.
Solutions seem to be 1.) implementing https://github.com/ludios/grab-site/issues/86 in conjunction with manually managing the Ethernet interface address or using unix sockets, or 2.) using Python 3.5, which should, according to the docs (I haven't actually tested this), allow 2 programs to listen on the same port when set up properly. Even then it is a gamble as to which program the kernel connects you to, and you would still need to --upstream to the other gs-server.
I never meant that the TCP and SSL listener have to be on the same port, just that it should be possible to have both of them running on different ports.
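A rough sketch of running both listeners on different ports, assuming asyncio's loop.create_server is used for each. The function name start_both_listeners is hypothetical, and the TLS context is expected to already have a certificate loaded:

```python
import asyncio
import ssl


async def start_both_listeners(protocol_factory, host, tcp_port, tls_port,
                               ssl_context: ssl.SSLContext):
    """Start a plain TCP listener and a TLS listener on two different ports."""
    loop = asyncio.get_running_loop()
    # Existing TCP-only crawls keep connecting to the plain port.
    plain = await loop.create_server(protocol_factory, host, tcp_port)
    # New clients can opt in to TLS on the second port.
    secure = await loop.create_server(protocol_factory, host, tls_port, ssl=ssl_context)
    return plain, secure
```

For instance, the plain listener could stay on 29000 while the TLS listener uses another port such as 29001 (an arbitrary choice here).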
This is not very hard to achieve with nginx, and it works with https. The following config makes grab-site available at "your-address/grab-site/":
location /grab-site/ {
    proxy_pass http://localhost:29000/;
    proxy_set_header Host $http_host;
    # These 3 lines are optional
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Authorization "";
    # Required for the web socket
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
Security warning: this config does not require any authorization! You should add something like this for security:
auth_basic "Login required";
auth_basic_user_file /etc/htpasswd;
You also have to patch libgrabsite/dashboard.html, because the path for the WebSocket is wrong: it will be hostname/stream. Maybe this should be upstreamed; I see no disadvantages to doing this.
Change line 1455 from
this.host = location.host
to
this.host = location.toString().replace(/^.*\/\//, "")
and then the WebSocket will connect to "your-host/grab-site/stream" and work :)
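To see what that replacement does, here is the same regular expression applied in Python (the function name strip_scheme and the example URLs are only for illustration). It removes everything up to and including the "//", so the path prefix added by the reverse proxy is kept:

```python
import re


def strip_scheme(location_href):
    """Mimic location.toString().replace(/^.*\\/\\//, "") from the patch above."""
    # Greedy match from the start of the string through the last "//".
    return re.sub(r"^.*//", "", location_href)


print(strip_scheme("https://example.com/grab-site/"))  # example.com/grab-site/
print(strip_scheme("http://example.com:29000/"))       # example.com:29000/
```

Note that any query string or fragment in the page URL would also be kept by this expression, so it behaves as intended only when the dashboard is opened without them.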
I referred to this in #192.
I believe the quickest and most flexible way to do it would be to put nginx in front of the dashboard. It would need very little change to the application; we would only need to document its usage in an example.
An upstream patch that dynamically figures out the correct host name would be best. However, I just wanted to throw another idea out there and show that it isn't strictly required for support to be achieved.
You can also use something like nginx's sub_filter, which also works with other apps, so you can take this concept and apply it to them in a similar way.
That's not to say there aren't any benefits to TLS termination within the server, but I don't think the dashboard would benefit from complicating the implementation with direct support at this time.
I know this is a bit of an indulgent question, but I'm looking to monitor the dashboard on a fairly restrictive network and am keen for the grab-site dashboard to use HTTPS so that I don't have to use a VPN just for the sake of the dashboard.
I've tried a few different reverse proxies using NGINX and Caddy, but I just seem to be served a blank page from grab-site. I figure the (python?) webserver grab-site uses might not work too well behind a traditional web server acting as a reverse proxy, but that's a bit beyond my scope. I just wanted to see if anyone had managed to get this to work? Or is it trivial to insert an SSL certificate to be used directly by the grab-site dashboard?
Thanks!