Ernillew / wl500g

Automatically exported from code.google.com/p/wl500g

IPSec Packet has no Non-ESP marker #358

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Download http://ftp.openswan.org/openswan/openswan-2.6.38.tar.gz and port it to the WL-500GPv2.
2. Compile the kernel with NETKEY and crypto support selected.
3. Run pluto with the following steps:
   a)pluto --nat_traversal
   b)whack --name vpn_ipsec0_ --ipv4 --host 192.168.3.184  --client      172.16.0.0/24 --srcip 192.168.3.184  --updown /usr/sbin/updown --to --host xxx.xxx.xxx.xxx --client 10.10.10.0/24 --srcip xxx.xxx.xxx.xxx --updown /usr/sbin/updown --dpddelay 30 --dpdtimeout 120 --psk --tunnel --encrypt --pfs --pfsgroup modp1024 --ikelifetime 28800 --ipseclifetime 3600 --esp 3des-md5 --ike aes128-sha1-modp1024
  (xxx.xxx.xxx.xxx is another IPsec gateway, based on CentOS 6.3, which works normally)

   c)whack --listen
   d)whack --initiate --name vpn_ipsec0_ --asynchronous

What is the expected output? What do you see instead?

1. Without NAT, the VPN tunnel is established successfully and works properly (pings succeed from both VPN subnets).

2. When the router is behind NAT, the VPN tunnel comes up and both phase 1 and phase 2 complete, but when I ping I get the following error:
    Oct  8 04:53:33 pluto[710]: packet from xxx.xxx.xxx.xxx:4500: recvfrom xxx.xxx.xxx.xxx:4500 has no Non-ESP marker
    Oct  8 04:53:38 pluto[710]: packet from xxx.xxx.xxx.xxx:4500: recvfrom xxx.xxx.xxx.xxx:4500 has no Non-ESP marker
    Oct  8 04:53:43 pluto[710]: packet from xxx.xxx.xxx.xxx:4500: recvfrom xxx.xxx.xxx.xxx:4500 has no Non-ESP marker
    Oct  8 04:53:48 pluto[710]: packet from xxx.xxx.xxx.xxx:4500: recvfrom xxx.xxx.xxx.xxx:4500 has no Non-ESP marker
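
For context on the log message above: RFC 3948 multiplexes three kinds of traffic on UDP port 4500. IKE packets are prefixed with four zero bytes (the Non-ESP marker), ESP-in-UDP packets begin with a non-zero SPI, and NAT keepalives are a single 0xFF byte. pluto prints "has no Non-ESP marker" when a packet reaching its socket on port 4500 does not start with those four zero bytes, which usually suggests that an ESP-in-UDP packet was handed to userspace instead of being decapsulated by the kernel. A minimal Python sketch of that demultiplexing rule follows; the function name and return labels are illustrative and are not Openswan code.

```python
# Sketch of the RFC 3948 demultiplexing rule for UDP port 4500.
# The names below are illustrative; this is not Openswan's implementation.

NON_ESP_MARKER = b"\x00\x00\x00\x00"  # four zero bytes prefixed to IKE packets


def classify_udp4500_payload(payload: bytes) -> str:
    """Classify a UDP/4500 payload the way an RFC 3948 receiver would."""
    if payload == b"\xff":
        # NAT-keepalive: exactly one octet with value 0xFF
        return "nat-keepalive"
    if len(payload) >= 4 and payload[:4] == NON_ESP_MARKER:
        # IKE message: the ISAKMP header follows the marker
        return "ike"
    if len(payload) >= 4:
        # First four bytes are a non-zero SPI: ESP encapsulated in UDP
        return "esp"
    return "invalid"
```

Under this rule, a payload that starts with an SPI such as 0x63742067 is ESP and is normally expected to be consumed by the kernel rather than by the IKE daemon.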

What version of the product are you using?
1. WL-500GPv2
2. 1.9.2.7-rtn-r4258.trx

Please provide any additional information below.

Here is some output that may be useful:

# ip xfrm monitor
Async event  (0x10)  replay update 
        src 192.168.3.184 dst xxx.xxx.xxx.xxx  reqid 0x4001 protocol ipv6-crypt  SPI 0xaa36066f
Async event  (0x20)  timer expired 
        src 192.168.3.184 dst xxx.xxx.xxx.xxx  reqid 0x4001 protocol ipv6-crypt  SPI 0xaa36066f
Async event  (0x20)  timer expired 
        src 192.168.3.184 dst xxx.xxx.xxx.xxx  reqid 0x4001 protocol ipv6-crypt  SPI 0xaa36066f
Async event  (0x20)  timer expired 

  # ip xfrm state
src xxx.xxx.xxx.xxx dst 192.168.3.184
        proto comp spi 0x00003ca9 reqid 16386 mode tunnel
        replay-window 0 flag 20
        comp deflate 0x
src 192.168.3.184 dst xxx.xxx.xxx.xxx
        proto comp spi 0x00000ee8 reqid 16386 mode tunnel
        replay-window 0 flag 20
        comp deflate 0x
src xxx.xxx.xxx.xxx dst 192.168.3.184
        proto esp spi 0x63742067 reqid 16385 mode transport
        replay-window 32 
        auth hmac(sha1) 0xe2e73ed95eb00f2bf5cef82a86285818c6644643
        enc cbc(aes) 0x0c08ffad11e7162e3533ae4a8746a490
        encap type espinudp sport 4500 dport 4500 addr 0.0.0.0
src 192.168.3.184 dst xxx.xxx.xxx.xxx
        proto esp spi 0xb9185004 reqid 16385 mode transport
        replay-window 32 
        auth hmac(sha1) 0x6bcf3214af38cac4c9a0e0e7f311b0e3b0b5daee
        enc cbc(aes) 0xf922e36ed98342f0e11a9c7720affb6e
        encap type espinudp sport 4500 dport 4500 addr 0.0.0.0
src xxx.xxx.xxx.xxx dst 192.168.3.184
        proto 4 spi 0x3ad7035a reqid 0 mode tunnel
        replay-window 0 flag 20
src 192.168.3.184 dst xxx.xxx.xxx.xxx
        proto 4 spi 0xc0a803b8 reqid 0 mode tunnel
        replay-window 0 flag 20

All iptables rules are ACCEPT.

#tcpdump -s0 -vvvv -ni vlan1 port 4500
tcpdump: listening on vlan1, link-type EN10MB (Ethernet), capture size 65535 bytes
05:57:15.596385 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 29)
    192.168.3.184.4500 > xxx.xxx.xxx.xxx.4500: [udp sum ok] UDP, length 1
05:57:15.596731 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 29)
    192.168.3.184.4500 > xxx.xxx.xxx.xxx.4500: [udp sum ok] UDP, length 1
05:57:22.041338 IP (tos 0x0, ttl 64, id 42363, offset 0, flags [none], proto UDP (17), length 128)
    192.168.3.184.4500 > xxx.xxx.xxx.xxx.4500: [no cksum] UDP, length 100
05:57:22.042502 IP (tos 0x0, ttl 62, id 10879, offset 0, flags [none], proto UDP (17), length 128)
    xxx.xxx.xxx.xxx.4500 > 192.168.3.184.4500: [no cksum] UDP, length 100
05:57:27.037142 IP (tos 0x0, ttl 64, id 42364, offset 0, flags [none], proto UDP (17), length 128)
    192.168.3.184.4500 > xxx.xxx.xxx.xxx.4500: [no cksum] UDP, length 100
05:57:27.038115 IP (tos 0x0, ttl 62, id 10880, offset 0, flags [none], proto UDP (17), length 128)
    xxx.xxx.xxx.xxx.4500 > 192.168.3.184.4500: [no cksum] UDP, length 100
05:57:32.044309 IP (tos 0x0, ttl 64, id 42365, offset 0, flags [none], proto UDP (17), length 128)
    192.168.3.184.4500 > xxx.xxx.xxx.xxx.4500: [no cksum] UDP, length 100
05:57:32.045289 IP (tos 0x0, ttl 62, id 10881, offset 0, flags [none], proto UDP (17), length 128)
    xxx.xxx.xxx.xxx.4500 > 192.168.3.184.4500: [no cksum] UDP, length 100
05:57:35.053499 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 29)
    192.168.3.184.4500 > xxx.xxx.xxx.xxx.4500: [udp sum ok] UDP, length 1
05:57:35.053789 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 29)
    192.168.3.184.4500 > xxx.xxx.xxx.xxx.4500: [udp sum ok] UDP, length 1
05:57:35.055293 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 124)
    192.168.3.184.4500 > xxx.xxx.xxx.xxx.4500: [udp sum ok] UDP, length 96
05:57:35.061424 IP (tos 0x0, ttl 62, id 0, offset 0, flags [DF], proto UDP (17), length 124)

# ip xfrm policy
src 10.10.10.0/24 dst 172.16.0.0/24 
        dir in priority 2344 ptype main 
        tmpl src xxx.xxx.xxx.xxx dst 192.168.3.184
                proto comp reqid 16386 mode tunnel
                level use 
        tmpl src 0.0.0.0 dst 0.0.0.0
                proto esp reqid 16385 mode transport
src 172.16.0.0/24 dst 10.10.10.0/24 
        dir out priority 2344 ptype main 
        tmpl src 192.168.3.184 dst xxx.xxx.xxx.xxx
                proto comp reqid 16386 mode tunnel
        tmpl src 0.0.0.0 dst 0.0.0.0
                proto esp reqid 16385 mode transport
src 10.10.10.0/24 dst 172.16.0.0/24 
        dir fwd priority 2344 ptype main 
        tmpl src xxx.xxx.xxx.xxx dst 192.168.3.184
                proto comp reqid 16386 mode tunnel
                level use 
        tmpl src 0.0.0.0 dst 0.0.0.0
                proto esp reqid 16385 mode transport

Original issue reported on code.google.com by cometjk3...@gmail.com on 8 Oct 2012 at 6:10

GoogleCodeExporter commented 9 years ago
First of all, thank you for your efforts. Unfortunately, none of us is familiar with IPsec and Openswan.

Please provide your kernel config diff against our trunk. Did you modify the kernel code? Can you move to the latest trunk revisions (r4667 and up)?

The "packet XXX has no Non-ESP marker" messages are issued by Openswan, so the problem might be in the NAT configuration itself. Please clarify "When router behind NAT": does it mean that NAT is running on some other host? If so, the problem might be in the netfilter rules on that host. Take a look at:
http://www.archivum.info/users@openswan.org/2006-11/00336/%28Openswan-Users%29-Packet-has-no-Non-ESP-marker.html
http://comments.gmane.org/gmane.network.openswan.user/16617

Original comment by lly.dev on 8 Oct 2012 at 10:10

GoogleCodeExporter commented 9 years ago
Thanks for your reply.

I used unmodified kernel code; its config is attached below.
As for "behind NAT", it means the network is structured like this:

    subnet1 ---- WL-500GPv2 <----->MyFirewall(NAT)----- CentOs_IPsecVPN ---- subnet2
  172.16.0.2     IP1     IP2        IP3     IP4         IP5         IP6    10.10.10.2   

IP1: 172.16.0.1
IP2: 192.168.3.184
IP3: 192.168.3.1
IP4: public IPA   
IP5: public IPB(xxx.xxx.xxx.xxx)   
IP6: 10.10.10.1

In this setup the message "packet XXX has no Non-ESP marker" appears. But if I replace the WL-500GPv2 with a Fedora 14 PC configured as an IPsec gateway, everything is OK.
By the way, the IPsec tunnel uses a pre-shared key.

The links you mentioned do describe a similar situation. One of them says: "The problem is a well known problem in kernel ipsec that is triggered when using e1000 driver and ipsec. It has been corrected in 2.6.19". However, our kernel is based on 2.6.22.

Original comment by cometjk3...@gmail.com on 9 Oct 2012 at 7:54

Attachments:

GoogleCodeExporter commented 9 years ago
About the kernel config: for what purpose did you turn on CONFIG_IP_PNP and CONFIG_BLK_DEV_NBD?

About the "packet XXX has no Non-ESP marker" message: it seems to be a well-known problem with double-NAT configurations. But it splits into two possible causes:
1) netfilter rules config (MTU problem)
2) kernel IKE frame fragmentation problem

Can you test various ping packet sizes? Best of all would be to compare the packets captured by tcpdump on the WL-500GPv2 (wrong) and on the Fedora 14 PC (good?).

Again, can you move to the latest trunk revisions (r4667 and up)?

Original comment by lly.dev on 9 Oct 2012 at 12:18

GoogleCodeExporter commented 9 years ago
CONFIG_IP_PNP and CONFIG_BLK_DEV_NBD are used for rootnfs; I forgot to turn them off. However, it doesn't work properly without them.

Based on the latest trunk revision (r4672), I tested pings with various packet sizes from 1 to 1600 bytes (MTU is always 1500), but the results are similar. I also captured the tcpdump output on the Fedora 14 PC (which works well).

The kernel config and tcpdump log are attached below. The ESP UDP packet lengths look different even though the pings are the same size: 120 on the wl500gpv2 versus 128 on Fedora 14. The difference is always a multiple of 8 bytes.

Original comment by cometjk3...@gmail.com on 10 Oct 2012 at 8:18

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks for the information. We will try to find the problem in our (Broadcom) ancient 2.6.22 kernel.

Original comment by lly.dev on 10 Oct 2012 at 5:34

GoogleCodeExporter commented 9 years ago
Try switching the bcm nat accelerator off:
nvram set misc_fastnat_x=0 && nvram commit && reboot

Original comment by themiron.ru on 11 Oct 2012 at 10:46

GoogleCodeExporter commented 9 years ago
I switched bcm_nat off as you said, but the problem remains.

Original comment by cometjk3...@gmail.com on 12 Oct 2012 at 1:22

GoogleCodeExporter commented 9 years ago
Please try the xfrm 2.6.24 backports patch http://wl500g.googlecode.com/files/xfrm.patch.gz against r4686.

Original comment by lly.dev on 21 Oct 2012 at 4:51

GoogleCodeExporter commented 9 years ago
Following your suggestion, I now get the following error in pluto's log:

/ # whack --initiate --name vpn_ipsec0_
002 "vpn_ipsec0_" #1: initiating Main Mode
104 "vpn_ipsec0_" #1: STATE_MAIN_I1: initiate
003 "vpn_ipsec0_" #1: ignoring unknown Vendor ID payload [4f4568794c64414365636661]
003 "vpn_ipsec0_" #1: received Vendor ID payload [Dead Peer Detection]
003 "vpn_ipsec0_" #1: received Vendor ID payload [RFC 3947] method set to=115
002 "vpn_ipsec0_" #1: enabling possible NAT-traversal with method RFC 3947 (NAT-Traversal)
002 "vpn_ipsec0_" #1: transition from state STATE_MAIN_I1 to state STATE_MAIN_I2
106 "vpn_ipsec0_" #1: STATE_MAIN_I2: sent MI2, expecting MR2
003 "vpn_ipsec0_" #1: NAT-Traversal: Result using draft-ietf-ipsec-nat-t-ike (MacOS X): i am NATed
002 "vpn_ipsec0_" #1: transition from state STATE_MAIN_I2 to state STATE_MAIN_I3
108 "vpn_ipsec0_" #1: STATE_MAIN_I3: sent MI3, expecting MR3
003 "vpn_ipsec0_" #1: received Vendor ID payload [CAN-IKEv2]
002 "vpn_ipsec0_" #1: Main mode peer ID is ID_IPV4_ADDR: 'xxx.xxx.xxx.xxx'
002 "vpn_ipsec0_" #1: transition from state STATE_MAIN_I3 to state STATE_MAIN_I4
004 "vpn_ipsec0_" #1: STATE_MAIN_I4: ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY cipher=aes_128 prf=oakley_sha group=modp1024}
002 "vpn_ipsec0_" #1: Dead Peer Detection (RFC 3706): enabled
002 "vpn_ipsec0_" #2: initiating Quick Mode PSK+ENCRYPT+TUNNEL+PFS+UP {using isakmp#1 msgid:2c2b09c4 proposal=3DES(3)_192-MD5(1)_128 pfsgroup=OAKLEY_GROUP_MODP1024}
117 "vpn_ipsec0_" #2: STATE_QUICK_I1: initiate
003 "vpn_ipsec0_" #2: ERROR: netlink response for Add SA esp.f829b00e@xxx.xxx.xxx.xxx included errno 22: Invalid argument
032 "vpn_ipsec0_" #2: STATE_QUICK_I1: internal error

003 "vpn_ipsec0_" #2: ERROR: netlink response for Add SA esp.f829b00e@xxx.xxx.xxx.xxx included errno 22: Invalid argument
032 "vpn_ipsec0_" #2: STATE_QUICK_I1: internal error

Original comment by cometjk3...@gmail.com on 22 Oct 2012 at 4:06

GoogleCodeExporter commented 9 years ago
Thank you for test, I have to check it myself.

Original comment by lly.dev on 22 Oct 2012 at 7:22

GoogleCodeExporter commented 9 years ago
Sorry for the late reply. The trunk code is correct; the problem is in my Openswan port.
When Openswan 2.6.38 is built with USE_NETKEY, pluto/nat_traversal.c uses eth0 as the default Ethernet device. However, the router uses vlan1.
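
One way to check for this kind of mismatch is to look up which interface actually holds the local IPsec address. The Python sketch below parses text in the one-line-per-address format printed by `ip -o -4 addr show`. The function name is hypothetical, and the sample text is made up for illustration (vlan1 and 192.168.3.184 come from this thread, the br0 name is an assumption); it was not captured from the router.

```python
# Hypothetical helper: find the interface that carries a given IPv4 address
# in text formatted like the output of `ip -o -4 addr show`.
from typing import Optional

# Illustrative sample only, not real output from the WL-500GPv2.
SAMPLE_IP_ADDR_OUTPUT = """\
1: lo    inet 127.0.0.1/8 scope host lo
4: br0    inet 172.16.0.1/24 brd 172.16.0.255 scope global br0
5: vlan1    inet 192.168.3.184/24 brd 192.168.3.255 scope global vlan1
"""


def interface_for_address(ip_addr_output: str, address: str) -> Optional[str]:
    """Return the interface name holding `address`, or None if not found."""
    for line in ip_addr_output.splitlines():
        fields = line.split()
        # Expected layout: "<index>:" "<ifname>" "inet" "<addr>/<prefix>" ...
        if len(fields) >= 4 and fields[2] == "inet":
            if fields[3].split("/")[0] == address:
                return fields[1]
    return None


print(interface_for_address(SAMPLE_IP_ADDR_OUTPUT, "192.168.3.184"))
```

If the name returned differs from the device the daemon assumes (eth0 in this case), the hard-coded default is a likely suspect.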

Original comment by cometjk3...@gmail.com on 9 Jan 2013 at 8:14

GoogleCodeExporter commented 9 years ago
Thanks. So, the problem with the Non-ESP marker is inside the vlan code.

Should we close the issue?

Original comment by lly.dev on 9 Jan 2013 at 12:25

GoogleCodeExporter commented 9 years ago
Yes, I changed eth0 to vlan1, and then it works well.
Thanks again for all your help.

Original comment by cometjk3...@gmail.com on 9 Jan 2013 at 12:48

GoogleCodeExporter commented 9 years ago

Original comment by themiron.ru on 27 Apr 2013 at 8:14