trailofbits / algo

Set up a personal VPN in the cloud
https://blog.trailofbits.com/2016/12/12/meet-algo-the-vpn-that-works/
GNU Affero General Public License v3.0
Unable to access Google or interact with GCE via the Google SDK (gcloud) #210

Closed eureki closed 7 years ago

eureki commented 7 years ago

OS / Environment

Ubuntu 16.04.1 LTS

Ansible version

2.2.0.0

Version of components from requirements.txt

Name: boto Version: 2.38.0
Name: dopy Version: 0.3.5
Name: azure Version: 2.0.0rc5
Name: apache-libcloud Version: 1.5.0
Name: six Version: 1.10.0
Name: pyOpenSSL Version: 16.2.0

Summary of the problem

I am in the process of transitioning from OpenVPN to Algo on a Google Cloud hosted instance. The Algo deployment script has been run locally on the VPN server and configured itself easily - connections can be made to Algo successfully and traffic is routable internally.

The problem is that traffic to *.google.com does not work when connected to the Algo VPN. I am also unable to use the Google SDK (gcloud) to interact with the Google Cloud Console. I can access other websites (GitHub, Bitbucket, etc.) and services fine. A traceroute to google.com and to Google's DNS server (8.8.8.8) completes without issue, but wget google.com does not work when connected to Algo. By comparison, I can access *.google.com and use the Google SDK when connected to OpenVPN.

I have tried amending the iptables routing rules as per this and this; neither has made Google accessible. All traffic within the VPN subnet has been opened through a Google Cloud firewall rule.

Does Algo or the VPN instance need to be configured differently?

Steps to reproduce the behavior

  1. Connect to Algo VPN
  2. gcloud compute ssh algo-vpn (for example), alternatively
  3. Browse to google

The way of deployment (cloud or local)

Local deployment on Google Cloud instance

Expected behavior

Successful connection to the Algo VPN server via the gcloud SDK

Actual behavior

Connection to the server times out. google.com and cloud.google.com are also inaccessible.

Full log

When connected to Algo:

On the VPN instance:

jackivanov commented 7 years ago

I cannot confirm, works well. Maybe related to #185. Try to tweak MTU/MSS

eureki commented 7 years ago

Thank you for your response and assistance.

I've tried tweaking the MTU/MSS as per this article and this one; neither has resolved the issue. Are you able to link to any resources that might solve the problem?

Also, to clarify, I am able to access Google fine from the Algo server, just not from my client (Mac), i.e., wget is successful from the server; from my Mac it isn't.

jackivanov commented 7 years ago

Try to use ping with the don't fragment flag and determine the proper MTU size
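A minimal sketch of how such a probe could be automated, assuming that every payload up to some limit gets a reply and every larger one does not. `find_max_payload` and `probe_macos` are hypothetical helper names, and the probe assumes the macOS ping flags `-D` (set Don't Fragment), `-s` (payload size), `-c` (count) and `-t` (timeout in seconds); on Linux the rough equivalent would be `ping -M do -s SIZE -c 1`.

```sh
# find_max_payload LO HI PROBE
# Binary-search the largest payload size in [LO, HI] for which
# "PROBE SIZE" succeeds. Prints LO-1 if even LO fails.
find_max_payload() {
    lo=$1
    hi=$2
    probe=$3
    best=$((lo - 1))
    while [ "$lo" -le "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$probe" "$mid"; then
            best=$mid            # mid fits: look for something larger
            lo=$((mid + 1))
        else
            hi=$((mid - 1))      # mid is too big: look lower
        fi
    done
    echo "$best"
}

# Hypothetical probe for a macOS client: one ping with the
# Don't Fragment bit set and a payload of $1 bytes.
# TARGET is an assumed variable; it defaults to google.com.
probe_macos() {
    ping -D -c 1 -t 2 -s "$1" "${TARGET:-google.com}" >/dev/null 2>&1
}
```

For example, `find_max_payload 1200 1472 probe_macos` would print the largest ICMP payload that got through; adding 28 bytes (IPv4 plus ICMP headers) gives the corresponding packet size.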

eureki commented 7 years ago

First, when not connected to the VPN:

$ ping -D -g 1250 -G 1256 google.com
PING google.com (216.58.199.46): (1250 ... 1256) data bytes
72 bytes from 216.58.199.46: icmp_seq=0 ttl=55 time=71.832 ms
wrong total length 92 instead of 1278
72 bytes from 216.58.199.46: icmp_seq=1 ttl=55 time=76.165 ms
wrong total length 92 instead of 1279
72 bytes from 216.58.199.46: icmp_seq=2 ttl=55 time=74.772 ms
wrong total length 92 instead of 1280
ping: sendto: Message too long
ping: sendto: Message too long
Request timeout for icmp_seq 3
ping: sendto: Message too long
Request timeout for icmp_seq 4
ping: sendto: Message too long
Request timeout for icmp_seq 5

From this I assume the MTU should be 1280. To compare, when connected to the VPN:

$ ping -D -g 1370 -G 1376 google.com
PING google.com (74.125.204.100): (1370 ... 1376) data bytes
72 bytes from 74.125.204.100: icmp_seq=0 ttl=52 time=330.455 ms
wrong total length 92 instead of 1398
72 bytes from 74.125.204.100: icmp_seq=1 ttl=52 time=212.901 ms
wrong total length 92 instead of 1399
72 bytes from 74.125.204.100: icmp_seq=2 ttl=52 time=213.108 ms
wrong total length 92 instead of 1400
ping: sendto: Message too long
ping: sendto: Message too long
Request timeout for icmp_seq 3
ping: sendto: Message too long
Request timeout for icmp_seq 4
ping: sendto: Message too long
Request timeout for icmp_seq 5

From this I assume the MTU should be set to 1400 for the VPN connection. The following command was executed on the Algo server:

iptables -t mangle -A FORWARD -o eth0 \
  -p tcp -m tcp --tcp-flags SYN,RST SYN \
  -m tcpmss --mss 1361:1536 \
  -j TCPMSS --set-mss 1400

The MTU was then manually set on my Mac to 1400 with networksetup -setMTU en0 1400.
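For reference, the arithmetic linking the ping payload sizes, the MTU, and the TCP MSS can be sketched as below. The helper names are hypothetical, and the header sizes are assumptions: a 20-byte IPv4 header without options, an 8-byte ICMP header, and a 20-byte TCP header without options.

```sh
# Assumed header sizes: IPv4 = 20 bytes (no options),
# ICMP = 8 bytes, TCP = 20 bytes (no options).

# Total IP packet size for an ICMP echo with a payload of $1 bytes.
mtu_from_icmp_payload() {
    echo $(( $1 + 20 + 8 ))
}

# Largest TCP segment payload that fits in an MTU of $1 bytes.
mss_from_mtu() {
    echo $(( $1 - 20 - 20 ))
}

mtu_from_icmp_payload 1252   # last reply off the VPN  -> 1280
mtu_from_icmp_payload 1372   # last reply on the VPN   -> 1400
mss_from_mtu 1400            #                         -> 1360
```

Under these assumptions, an MTU of 1400 leaves room for an MSS of at most 1360, so a rule with --set-mss 1400 would still advertise segments that do not fit in a 1400-byte packet.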

migueldemoura commented 7 years ago

Had the same issue on Ubuntu 16.10. Fixed it with:

iptables -t mangle -A FORWARD -o eth0 -p tcp -m tcp --tcp-flags SYN,RST SYN -m tcpmss --mss 1361:1536 -j TCPMSS --set-mss 1360

Edit: Having the same problem again.

mutemule commented 7 years ago

@migueldemoura: Can you try dropping the MSS even lower, to see if that works? Try 1200 to start.

You'll probably want to delete existing mangle rules before testing (or modify config.cfg to create a new instance with an appropriate value for max_mss.)
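One possible way to clear earlier TCPMSS rules before retesting, sketched as a dry run. `tcpmss_delete_cmds` is a hypothetical helper; it assumes its input is in the `-A CHAIN ...` format printed by `iptables -t mangle -S FORWARD`, and it only prints the matching delete commands rather than running them.

```sh
# Read rule specs in "iptables -S" format on stdin and print, for each
# rule that jumps to TCPMSS, the command that would delete it.
tcpmss_delete_cmds() {
    grep -e '-j TCPMSS' | sed 's/^-A /iptables -t mangle -D /'
}

# Intended use on the server (as root), reviewing the output first:
#   iptables -t mangle -S FORWARD | tcpmss_delete_cmds
```

Printing the commands instead of executing them leaves a chance to check that only the intended mangle rules are affected.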

jackivanov commented 7 years ago

MTU problems will be described in 1.1 #216

jhunken commented 7 years ago

I'm having the same issue. First time Algo user. Deployed Algo to Google Cloud. Able to connect to the Algo VPN and browse all sites except *.google.com sites from my Mac client.

Installed with https://github.com/trailofbits/algo/commit/27e0fd073b3e56b64854a24b52a09840ea0c5763

jhunken commented 7 years ago

I just deployed to AWS EC2 and everything works fine. The issue seems isolated to Google Cloud deploys.

andrewhowdencom commented 7 years ago

I am experiencing a similar issue, though not quite the same as #210 -- more #310 from @joshwardell. Can connect to the GCE instance, but no sites will respond. Interestingly, tcpdumping the interface on the VPN shows the traffic, but it appears that the traffic doesn't get routed back to my machine. (I think that's correct, I'm somewhat a networking newb)

This was verified by doing a DNS query for a domain that doesn't exist (something like foo.bar.baz), and seeing that head out across the interface -- but not receiving the response on my machine.

To further add to the weirdness, I can establish an SSH connection to the machine while not on the VPN and continue to operate it after I connect to the VPN, but cannot establish a "fresh" connection once connected.

Happy to continue debugging this any further. I am still looking at this as I have time, but figured I'd add my notes here.

dguido commented 7 years ago

It would help a lot if you all could start writing up the troubleshooting solutions that work on #216. That will eventually result in documentation we can publish.

mrebersv commented 7 years ago

Was linked here through a series of closed tickets, though I question that 210 is actually the same as 310/345/whatever else.

Just to add another data point, I'm experiencing the same GCE issues as others, even with my MTU set artificially low. It was never above 1400, and I tried several steps down to 1280. I cannot get ICMP responses or DNS responses. I haven't tried anything else since DNS won't work.

devries commented 7 years ago

I have run into this issue connecting to certain sites, specifically google sites, through Algo VPN when I run it on GCE. I am using a Mac with OS 10.11.6 as the client, and did an install on an existing google compute engine instance running ubuntu 16.04 LTS. I tried a couple of things. First, with ICMP connections open to the server, when activating the VPN using the mobileconfig file I find my ipsec0 network device has an MTU of 1400. Running some ping sweeping I found that the maximum MTU accepted for a ping through ipsec0 to google.com is 1372.

I set the ipsec0 mtu to 1372 and everything works fine. I then tried blocking ICMP to the algo server, to see how it would default without any ICMP information. I found that ipsec0 set the MTU to 1500 in that case, which of course is worse. I fiddled with trying:

sudo iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS  --clamp-mss-to-pmtu

but that did not fix it. I tried setting up Algo on a new DigitalOcean server, using the remote ansible commands, and everything works fine through the DigitalOcean server, including Google sites.

tl;dr - My report generally confirms the issue others are seeing with GCE specific installation.

migueldemoura commented 7 years ago

@mutemule, still no luck. Moved to DO in the meanwhile and have had no issues so far.

elvizlai commented 7 years ago

Setting the client's ipsec0 MTU to 1360 solved this problem.

ifconfig ipsec0 mtu 1360

0x1ee7 commented 7 years ago

I'm running strongSwan in a Docker container on GCE, and @elvizlai's suggestion worked for me. Now how can I do the same for iOS? Is it possible to set the MTU on tunnel setup from the server?

mptorz commented 5 years ago

I solved the problem on all my computers, but I cannot change the MTU setting on mobile devices (like iOS). Is there any workaround, or should I just switch to Amazon or DigitalOcean?

Btw. I deployed with a default #max_mss setting which was 1316. I don't know if tweaking that would make any difference.