Open nbriozzo opened 10 years ago
On Mon, 15 Sep 2014, nbriozzo wrote:
So the issue is the following: I have several Ubuntu servers running in an AWS environment and I have installed OpenSwan on one of them so other users can log in to them via VPN (using ssh from their laptops, most of which are MacBook Airs)
One issue I have noticed is that while they are logged in via VPN and run a command that produces a lot of output (like a find /), the screen will eventually show this message and the ssh connection breaks:
Corrupted MAC on input. Disconnecting: Packet corrupt
This is not totally preventing them from doing their jobs, but it is fairly annoying. As an additional data point, I have logged in to the servers through an external/elastic IP instead of the VPN and I'm able to see all the output of all commands (also, the connection/response seems faster this way), so it would seem that the VPN is somehow responsible.
Most likely a fragmentation/MTU issue. Try adding iptables rules for TCPMSS clamping. See
https://libreswan.org/wiki/FAQ#My_ssh_sessions_hang_or_connectivity_is_very_slow
Paul
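For readers following along, the MSS clamping suggested above is typically done with the netfilter TCPMSS target on the VPN gateway. The sketch below only prints candidate rules rather than applying them. The helper name `mss_for_mtu` and the MTU value are illustrative assumptions, not something taken from the FAQ, and the arithmetic assumes plain IPv4 with no IP or TCP options.

```shell
#!/bin/sh
# Sketch only: print (do not apply) MSS-clamping rules for a VPN gateway.
# Assumption: IPv4 without options, so MSS = MTU - 20 (IP) - 20 (TCP).
mss_for_mtu() {
    echo $(( $1 - 40 ))
}

TUNNEL_MTU=1280   # hypothetical value; substitute the MTU of your tunnel interface

# Option 1: let netfilter derive the MSS from the discovered path MTU.
echo "iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu"

# Option 2: pin the MSS to an explicit value computed from the tunnel MTU.
echo "iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss $(mss_for_mtu "$TUNNEL_MTU")"
```

Printing the rules first makes it easy to review them before running them as root on the gateway.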
That didn't help. I've tried both commands and lowering the MTU values, and restarted the OpenSwan and networking services, and I still get the same error.
As an additional data point, I've tried logging in via VPN from Windows and the error doesn't happen; I'm still having the issue with Mac OS.
I'm not sure what the problem is then. You can try the openswan rhel package, or the libreswan package, and see if the problem goes away? I haven't run upstream openswan in a few years, and it is known to have done some incomplete backports from libreswan.
@nbriozzo You are not alone with this one. At work, three of my colleagues and I have had this issue for many months; it started after upgrading to OS X Mavericks on our MacBook Pros (Retina). One other member of our team decided not to upgrade because of the issues we were having, and he can still use the VPNs fine on Mountain Lion.
We have a lot of servers in AWS VPCs which can only be reached over SSH via their private IPs, which requires us to first connect to an Openswan VPN. We have both Openswan and, more recently, Libreswan VPNs on both CentOS and Amazon Linux (RHEL based). Anything over around 140k is too large for us to scp off the boxes, and commands that produce a large amount of output often dump us out with the corrupt MAC issue.
We have had to resort to pushing files onto S3 and then retrieving them from there, or even (shudder) using a Windows VM in VMware Fusion, which seems to work fine. At first we suspected Apple had broken the NIC driver, but we've tested using both wifi and Thunderbolt, and the fact that the Windows VM still worked confused us. One member of the team who is a bit more 'networky' tried lowering the MTU and setting half-duplex on the Apple NIC and got some improvement, but it was still unusable.
After becoming frustrated with this again recently I discovered this issue and decided to poke around at this again. During my latest test I connected to a Libreswan VPN and noticed that the MTU of ppp0 on my Macbook was set to 1280, I then noticed that the MTU on the actual VPN server was different, full output of both below:
Macbook
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
options=3<RXCSUM,TXCSUM>
inet6 ::1 prefixlen 128
inet 127.0.0.1 netmask 0xff000000
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
nd6 options=1<PERFORMNUD>
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether 20:c9:d0:7d:05:f1
inet6 fe80::22c9:d0ff:fe7d:5f1%en0 prefixlen 64 scopeid 0x5
inet 192.168.1.7 netmask 0xffffff00 broadcast 192.168.1.255
nd6 options=1<PERFORMNUD>
media: autoselect
status: active
en3: flags=8963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
options=60<TSO4,TSO6>
ether 32:00:12:5c:a5:a0
media: autoselect <full-duplex>
status: inactive
en4: flags=8963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
options=60<TSO4,TSO6>
ether 32:00:12:5c:a5:a1
media: autoselect <full-duplex>
status: inactive
bridge0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=63<RXCSUM,TXCSUM,TSO4,TSO6>
ether aa:20:66:83:da:00
Configuration:
id 0:0:0:0:0:0 priority 0 hellotime 0 fwddelay 0
maxage 0 holdcnt 0 proto stp maxaddr 100 timeout 1200
root id 0:0:0:0:0:0 priority 0 ifcost 0 port 0
ipfilter disabled flags 0x2
member: en3 flags=3<LEARNING,DISCOVER>
ifmaxaddr 0 port 6 priority 0 path cost 0
member: en4 flags=3<LEARNING,DISCOVER>
ifmaxaddr 0 port 7 priority 0 path cost 0
nd6 options=1<PERFORMNUD>
media: <unknown type>
status: inactive
p2p0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 2304
ether 02:c9:d0:7d:05:f1
media: autoselect
status: inactive
vboxnet0: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether 0a:00:27:00:00:00
vboxnet1: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether 0a:00:27:00:00:01
vboxnet2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
ether 0a:00:27:00:00:02
inet 192.168.59.3 netmask 0xffffff00 broadcast 192.168.59.255
ppp0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1280
inet 192.168.100.2 --> 192.168.100.1 netmask 0xffffff00
Libreswan Server (AWS / Amazon Linux)
eth0 Link encap:Ethernet HWaddr 06:57:AF:B3:F3:40
inet addr:10.181.198.75 Bcast:10.181.198.127 Mask:255.255.255.128
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:210169 errors:0 dropped:0 overruns:0 frame:0
TX packets:112192 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:216465160 (206.4 MiB) TX bytes:39148411 (37.3 MiB)
Interrupt:28
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:424 errors:0 dropped:0 overruns:0 frame:0
TX packets:424 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:63788 (62.2 KiB) TX bytes:63788 (62.2 KiB)
ppp0 Link encap:Point-to-Point Protocol
inet addr:192.168.100.1 P-t-P:192.168.100.2 Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1410 Metric:1
RX packets:66 errors:0 dropped:0 overruns:0 frame:0
TX packets:44 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:3
RX bytes:6145 (6.0 KiB) TX bytes:6336 (6.1 KiB)
I'm not sure where you change the MTU on ppp0 on the Macbook so I changed the Libreswan server to match by changing /etc/ppp/options.xl2tpd to:
ipcp-accept-local
ipcp-accept-remote
ms-dns $VPN_DNSHOST
noccp
auth
crtscts
idle 1800
mtu 1280
mru 1280
nodefaultroute
debug
lock
proxyarp
connect-delay 5000
logfile /var/log/ppp.log
After kicking the xl2tpd service I then managed to download a 100MB file from one of our VPC instances via the VPN connection. I still saw a corrupt MAC error on the first attempt, but as I mentioned before anything over 140k has been too big in the past, and the 2nd attempt worked fine, so this is a vast improvement and shows I'm on the right track (hopefully).
@letoams I have posted a gist of the script that sets up our VPNs, hopefully this shines some light on our exact setup and maybe we are doing something stupid: https://gist.github.com/rosstimson/09313902bedb6e5a7c7b
I'll test this some more when in the office tomorrow but hopefully this gives some more clues as to what the issue is. Anyone that can help fix this will be worshipped as a demi-god :smile: -- this has made our jobs very difficult and has had us pulling our hair out for around 6 months now.
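One way to check which of the two ppp0 MTU values reported in this comment (1280 on the Macbook, 1410 on the server) actually fits the path is to send pings that are not allowed to fragment. The following is a rough sketch that prints the commands for both platforms; the helper name `ping_payload_for_mtu` is made up for illustration, and the peer address is simply the VPN side of ppp0 from the output above.

```shell
#!/bin/sh
# ICMP payload size that exactly fills a given IPv4 MTU:
# MTU - 20 (IP header) - 8 (ICMP header).
ping_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

PEER=192.168.100.1   # VPN side of ppp0 in the output above; adjust as needed

for mtu in 1280 1410; do
    size=$(ping_payload_for_mtu "$mtu")
    # macOS ping uses -D to set the Don't Fragment bit.
    echo "macOS: ping -D -c 3 -s $size $PEER"
    # Linux (iputils) ping uses -M do to prohibit fragmentation.
    echo "Linux: ping -M do -c 3 -s $size $PEER"
done
```

If the larger size fails while the smaller one succeeds, that would suggest the effective path MTU lies somewhere between the two values.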
The mtu in options.xl2tpd is passed to the Mac via pppd. It would be odd if the two don't match; that would mean the Mac is overriding it.
Note, for mac you should really migrate away from l2tp to native ipsec with XAUTH: https://libreswan.org/wiki/VPN_server_for_remote_clients_using_IKEv1_XAUTH
It would remove two layers of encapsulation and allow you to have normal mtu's
Paul
On Oct 2, 2014, at 17:56, Ross Timson notifications@github.com wrote:
[...]
Thank you very much Paul, removing two layers of encapsulation sounds promising so I've set up a little test environment to try this out. Using your wiki page https://libreswan.org/wiki/VPN_server_for_remote_clients_using_IKEv1_XAUTH and the Red Hat docs has got me to a stage where I can successfully connect with my Macbook using the Cisco IPSec VPN type.
Unfortunately as soon as I connect to the VPN I can no longer access anything.
My setup is as follows:
# /etc/ipsec.conf
config setup
    protostack=netkey
    nat_traversal=yes
    virtual_private=%v4:10.0.0.0/8,%v4:192.168.0.0/16,%v4:172.16.0.0/12,%v4:25.0.0.0/8,%v4:100.64.0.0/10,%v6:fd00::/8,%v6:fe80::/10,%v4:!10.231.247.0/24

conn xauth-psk
    authby=secret
    pfs=no
    auto=add
    # Amazon does not route ESP/AH packets, so these must be encapsulated in UDP
    forceencaps=yes
    rekey=no
    left=10.0.0.22 # PRIVATE_IP
    # set our ID to your (static) elastic IP
    leftid=$PUBLIC_IP
    # set the desired source IP to the Elastic IP. Libreswan will create interface address and route.
    # Configure the elastic IP on loopback, eg: ip addr add $PUBLIC_IP/32 dev lo
    leftsourceip=$PUBLIC_IP
    rightaddresspool=10.231.247.1-10.231.247.254
    right=%any
    modecfgdns1=10.0.0.2
    modecfgdns2=8.8.8.8
    leftxauthserver=yes
    rightxauthclient=yes
    leftmodecfgserver=yes
    rightmodecfgclient=yes
    modecfgpull=yes
    xauthby=file
    ike-frag=yes
    dpddelay=30
    dpdtimeout=120
    dpdaction=clear

include /etc/ipsec.d/*.conf
# /etc/ipsec.d/passwd
myuser:somecryptedpassword:xauth-psk
# /etc/ipsec.d/xauth-psk.secrets
$PUBLIC_IP %any : PSK "somelongpsk"
[root@ip-10-0-0-22 etc]# sysctl -p
net.ipv4.ip_forward = 1
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.default.log_martians = 0
net.ipv4.conf.all.log_martians = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.neigh.default.gc_thresh1 = 1024
net.ipv4.neigh.default.gc_thresh2 = 2048
net.ipv4.neigh.default.gc_thresh3 = 4096
On my current (buggy) L2TP over IPSec VPN when I connect I see the following routes (extract):
Internet:
Destination Gateway Flags Refs Use Netif Expire
default 192.168.1.1 UGSc 24 0 en0
default 192.168.100.1 UGScI 0 0 ppp0
10.181.128/17 192.168.100.1 UGSc 1 0 ppp0
10.183.128/17 192.168.100.1 UGSc 0 0 ppp0
Those 10.181.128/17 and 10.183.128/17 networks are private VPCs in AWS that I can only connect to via the VPN.
These are added with:
#!/bin/sh
# /etc/ppp/ip-up
# pppd passes the remote (peer) IP address as the fifth argument.
if [ "${5:-}" = "192.168.100.1" ]
then
    /sbin/route add 10.181.128.0/17 "$5"
    /sbin/route add 10.183.128.0/17 "$5"
fi
Where 192.168.100.1 is the VPN side and 192.168.100.2 is my laptop side.
ppp0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1280
inet 192.168.100.2 --> 192.168.100.1 netmask 0xffffff00
At the moment I'm not sure how I achieve the same sort of thing now that I've removed the L2TP stuff. I've tried manually adding routes etc but nothing seems to work. When connected to the new VPN I have:
utun0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1280
inet 10.231.247.1 --> 10.231.247.1 netmask 0xffffffff
And my route table (extract):
Internet:
Destination Gateway Flags Refs Use Netif Expire
default utun0 UCS 4 0 utun0
default 192.168.1.1 UGScI 0 0 en0
8.8.8.8 utun0 UHW3I 0 245 utun0 9
10.0.0.2 utun0 UHW3I 0 213 utun0 10
10.231.247.1 10.231.247.1 UH 1 51 utun0
17.172.238.40 utun0 UHWIi 1 11 utun0
54.171.70.111 192.168.1.1 UGHS 4 35 en0
127 127.0.0.1 UCS 0 0 lo0
127.0.0.1 127.0.0.1 UH 1 259500 lo0
169.254 link#5 UCS 0 0 en0
173.194.112.18 utun0 UHWIi 1 11 utun0
% route -n get google.com
route: bad address: google.com
% route get 10.0.1.97 <------ Another EC2 instance in VPC private subnet.
route to: 10.0.1.97
destination: default
mask: default
interface: utun0
flags: <UP,DONE,CLONING,STATIC>
recvpipe sendpipe ssthresh rtt,msec rttvar hopcount mtu expire
0 0 0 0 0 0 1280 0
Any ideas on how I route traffic through the new VPN? Also, on the VPN server I've tried many different iptables rules including these from the wiki:
iptables -t nat -I POSTROUTING -s 10.231.247.0/24 -d 10.231.246.0/24 -j RETURN
iptables -t nat -A POSTROUTING -s 10.231.247.0/24 -d 0.0.0.0/8 -j MASQUERADE
Am I possibly missing iptables rules that are necessary to make this work?
In case it makes a difference, everything I'm doing is in AWS. What I'm trying to achieve is a Libreswan VPN in a public subnet and many instances in a private subnet that are only accessible by their private IPs whilst connected to the VPN. I currently have this working but only with L2TP over IPSec using xl2tpd in front of Libreswan. This is the current setup that has major trouble with the corrupt MAC on input issue and no amount of tweaking MTU etc fixes this. Removing the extra service/layer sounds like it might do the trick if I could work out how to do it.
@letoams - Apologies for the long post and for the n00b like questions. Much of the documentation and many of the tutorials I can find are outdated or only deal with Openswan, most seem to be for permanent/static VPN tunnels rather than road warrior setups, and I couldn't find much at all about getting this all working on AWS. Virtually every link I found had you helping people; you are a credit to the open source community and any pointers/advice would be very gratefully received.
@nbriozzo I hope if I manage to get this working that it will solve your issue too, it sounds like our setups are very similar.
On Sat, 4 Oct 2014, Ross Timson wrote:
Thank you very much Paul, removing two layers of encapsulation sounds promising so I've set up a little test environment to try this out. Using your wiki page https://libreswan.org/wiki/VPN_server_for_remote_clients_using_IKEv1_XAUTH and the Red Hat docs has got me to a stage where I can successfully connect with my Macbook using the Cisco IPSec VPN type.
If you let me know what was missing on the libreswan wiki page, I'll work on improving it.
Unfortunately as soon as I connect to the VPN I can no longer access anything.
(it might make sense to move this discussion to swan@lists.libreswan.org)
conn xauth-psk
    authby=secret
    pfs=no
    auto=add
    # Amazon does not route ESP/AH packets, so these must be encapsulated in UDP
    forceencaps=yes
    rekey=no
    left=10.0.0.22 # PRIVATE_IP
    # set our ID to your (static) elastic IP
    leftid=$PUBLIC_IP
    # set the desired source IP to the Elastic IP. Libreswan will create interface address and route.
    # Configure the elastic IP on loopback, eg: ip addr add $PUBLIC_IP/32 dev lo
    leftsourceip=$PUBLIC_IP
Did you configure $PUBLIC_IP as an alias on your eth0 or lo interface?
On my current (buggy) L2TP over IPSec VPN when I connect I see the following routes (extract):
Internet:
Destination Gateway Flags Refs Use Netif Expire
default 192.168.1.1 UGSc 24 0 en0
default 192.168.100.1 UGScI 0 0 ppp0
10.181.128/17 192.168.100.1 UGSc 1 0 ppp0
10.183.128/17 192.168.100.1 UGSc 0 0 ppp0
Those 10.181.128/17 are private VPCs in AWS that I can only connect to via the VPN.
These are added with:
#!/bin/sh
# /etc/ppp/ip-up
if [ "${5:-}" = "192.168.100.1" ]
then
    /sbin/route add 10.181.128.0/17 $5
    /sbin/route add 10.183.128.0/17 $5
fi
Where 192.168.100.1 is the VPN side and 192.168.100.2 is my laptop side.
That should not be needed if the VPN server NAT's the IP's from your IP pool and/or can route those IPs to those VPCs.
Paul
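As a rough illustration of the NAT approach mentioned in the reply above, the sketch below prints a masquerade rule for traffic coming from the XAUTH address pool. The pool is the rightaddresspool range from the config in this thread; the interface name `eth0` and the helper `nat_rule` are assumptions for illustration only, and nothing here has been verified against this particular AWS setup.

```shell
#!/bin/sh
# Sketch only: print (do not apply) a NAT rule for the VPN address pool.
POOL=10.231.247.0/24   # rightaddresspool range from the config in this thread
OUT_IF=eth0            # assumption: the gateway's VPC-facing interface

nat_rule() {
    # $1 = source pool in CIDR form, $2 = outgoing interface
    echo "iptables -t nat -A POSTROUTING -s $1 -o $2 -j MASQUERADE"
}

nat_rule "$POOL" "$OUT_IF"
```

Note that the MASQUERADE rule quoted earlier in the thread uses `-d 0.0.0.0/8`, which matches only destinations from 0.0.0.0 to 0.255.255.255. If the intent was to match every destination, `-d 0.0.0.0/0` (or leaving out `-d` altogether) would express that.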
It seems we are having the same issue. I have tried to replicate the VPN connection using libreswan but I can no longer access anything via ssh/telnet. I'll check if there's some misconfiguration.
Hi @nbriozzo - Just a quick update on this as a colleague of mine (@sage-andrew-taylor) managed to fix our corrupt MAC issue simply by changing the EC2 instance type (AMI) over to HVM, we had previously been running on PV instances. This immediately fixed our issue with no config changes at all.
It would therefore seem like this is not in any way a bug with Libreswan or our configs, if you are using PV instances please give HVM a go and report back so that @letoams can close this issue.
Hope this works for you.
@rosstimson Hi, how do you fix this?
Unfortunately as soon as I connect to the VPN I can no longer access anything.
I had the same issue, with a similar route table. I can only access the VPN: no internet, and I'm not able to reach other servers via their private IPs no matter what kind of NAT rules I add with iptables...
@kureikain Sorry for the late response. We never got the native ipsec working on AWS. If I remember rightly I think I assumed this was because of the networking restrictions in AWS. We reverted back to using xl2tpd as the corrupt MAC issue doesn't exist once the instance type has been changed to HVM.
So the issue is the following: I have several Ubuntu servers running in an AWS environment and I've installed OpenSwan on one of them so other users can log in to them via VPN (using ssh from their laptops, most of which are MacBook Airs)
What is happening is that while they are logged in via VPN and run a command that produces a lot of output (like a find /), the screen will eventually show this message and the ssh connection breaks:
Corrupted MAC on input. Disconnecting: Packet corrupt
This is not totally preventing them from doing their jobs, but it is fairly annoying. As an additional data point, I have logged in to the servers through an external/elastic IP instead of the VPN and I don't have this issue (also, the connection/response seems faster), so it would seem that the VPN is somehow responsible.
Please let me know how I should address this problem and whether you need further information to debug it.
Regards