sapcc / asr1k-neutron-l3

Cisco ASR 1000 Neutron L3 driver
Apache License 2.0

FWaaS ICMP echo requests issues on long running probes #45

Open swagner-de opened 3 years ago

swagner-de commented 3 years ago

TAC

TAC Case 691180559 open. Mentioned files can be found attached to the case.

Setup

Local machine: fip-agt02-rtr01-priv01
Native IP: 10.180.5.10
NATed IP: 10.237.208.9
Gateway: qa-de-1-rt12a
VRF: 7e2a161a85fd454f8838bc4de9563333
Ingress L3 interface: BD-VIF6856
Egress L3 interface: BD-VIF7005
Remote machine/target: 10.237.208.58 (also NATed, native IP: 10.180.12.13)

We run continuous ICMP echo-request probes using the blackbox_exporter. It sends ICMP echo-requests to a target with an ID that stays fixed for the lifetime of the process and with increasing sequence numbers. The target is a VM behind a different router.

The target/remote machine runs the same sort of probes against this machine.

Problem

We use the blackbox_exporter, which sends one ICMP echo request with ID=X and sequence=Y, sleeps for some time, and then sends another one with ID=X and sequence=Y + random(). Reading RFC 792, I believe this is valid behaviour.
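To make the probe pattern concrete, here is a minimal Python sketch that builds echo requests with a fixed ID and an increasing, non-contiguous sequence number. It is not the blackbox_exporter's code: the names `build_echo_request` and `probe_stream` and the `max_step` bound are made up for illustration, and the 28-byte zero payload is only chosen so the ICMP length matches the 36 bytes seen in the captures.

```python
import random
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement sum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request (type 8, code 0) as laid out in RFC 792."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, ident, seq) + payload


def probe_stream(ident: int, first_seq: int, count: int, max_step: int = 64):
    """Yield echo requests with a fixed ID and an increasing sequence number.

    max_step is a hypothetical bound on the random increment.
    """
    seq = first_seq
    for _ in range(count):
        yield build_echo_request(ident, seq, payload=b"\x00" * 28)
        seq = (seq + random.randint(1, max_step)) & 0xFFFF  # 16-bit field wraps


packets = list(probe_stream(ident=1413, first_seq=59358, count=3))
for pkt in packets:
    _type, _code, _csum, ident, seq = struct.unpack("!BBHHH", pkt[:8])
    print(f"id={ident} seq={seq} len={len(pkt)}")
```

Every packet carries the same ID, so a stateful device that keys ICMP sessions on the ID sees one long-lived flow rather than a series of new ones.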

The firewall feature on the ASR, however, does not (re)create this session, so the returning echo-reply is dropped. Any new echo-request should create a session that allows the echo-reply to come back.
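The behaviour expected here can be written down as a toy model. This is only a sketch of the expectation stated above, not of how the ASR's firewall is implemented: the class `ToySessionTable`, the key of (source, destination, ICMP ID), and the `idle_timeout` value are all assumptions, and NAT is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class ToySessionTable:
    idle_timeout: float = 10.0  # hypothetical value, not taken from the ASR
    sessions: dict = field(default_factory=dict)  # flow key -> last-seen time

    def _expire(self, now: float) -> None:
        """Drop sessions that have been idle for longer than idle_timeout."""
        self.sessions = {k: t for k, t in self.sessions.items()
                         if now - t <= self.idle_timeout}

    def echo_request(self, src: str, dst: str, ident: int, now: float) -> None:
        """A request always creates a session or refreshes the existing one."""
        self._expire(now)
        self.sessions[(src, dst, ident)] = now

    def echo_reply(self, src: str, dst: str, ident: int, now: float) -> bool:
        """A reply passes only if a live session exists for the reverse flow."""
        self._expire(now)
        key = (dst, src, ident)
        if key in self.sessions:
            self.sessions[key] = now
            return True
        return False


table = ToySessionTable(idle_timeout=10.0)
table.echo_request("10.180.5.10", "10.237.208.58", 1413, now=0.0)
first = table.echo_reply("10.237.208.58", "10.180.5.10", 1413, now=0.1)
stale = table.echo_reply("10.237.208.58", "10.180.5.10", 1413, now=60.0)
table.echo_request("10.180.5.10", "10.237.208.58", 1413, now=70.0)
again = table.echo_reply("10.237.208.58", "10.180.5.10", 1413, now=70.1)
print(first, stale, again)  # True False True
```

In this model a request with an already-used ID re-creates the session after it has expired, which is the step that does not seem to happen on the ASR.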

ASR Config

qa-de-1-rt12a#show run vrf 7e2a161a85fd454f8838bc4de9563333
Building configuration...

Current configuration : 1894 bytes
vrf definition 7e2a161a85fd454f8838bc4de9563333
 description Router 7e2a161a-85fd-454f-8838-bc4de9563333
 rd 65148:34352
 !
 address-family ipv4
  export map exp-7e2a161a85fd454f8838bc4de9563333
 exit-address-family
!
!
interface BD-VIF6856
 description 7e2a161a-85fd-454f-8838-bc4de9563333
 mac-address fa16.3e41.fcd8
 mtu 8950
 vrf forwarding 7e2a161a85fd454f8838bc4de9563333
 ip address 10.180.5.1 255.255.255.0
 ip nat stick
 zone-member security ZN-FWAAS-7e2a161a85fd454f8838bc4de9563333-in
 ip policy route-map pbr-7e2a161a85fd454f8838bc4de9563333
!
interface BD-VIF7005
 description 7e2a161a-85fd-454f-8838-bc4de9563333
 mac-address fa16.3e4c.ab0e
 mtu 8950
 vrf forwarding 7e2a161a85fd454f8838bc4de9563333
 ip address 10.237.208.24 255.255.255.0
 ip nat outside
 ip access-group EXT-TOS out
 zone-member security ZN-FWAAS-7e2a161a85fd454f8838bc4de9563333-out
 ip policy route-map EXT-TOS
!
interface BD-VIF7154
 description 7e2a161a-85fd-454f-8838-bc4de9563333
 mac-address fa16.3ead.1e11
 mtu 8950
 vrf forwarding 7e2a161a85fd454f8838bc4de9563333
 ip address 10.180.6.1 255.255.255.0
 ip nat stick
 zone-member security ZN-FWAAS-7e2a161a85fd454f8838bc4de9563333-in
 ip policy route-map pbr-7e2a161a85fd454f8838bc4de9563333
ip nat inside source static 10.180.5.10 10.237.208.9 vrf 7e2a161a85fd454f8838bc4de9563333 redundancy 1 mapping-id 621668109 match-in-vrf
ip nat inside source static 10.180.6.5 10.237.208.35 vrf 7e2a161a85fd454f8838bc4de9563333 redundancy 1 mapping-id 56444491 match-in-vrf
ip nat inside source list NAT-7e2a161a85fd454f8838bc4de9563333 interface BD-VIF7005 vrf 7e2a161a85fd454f8838bc4de9563333 overload
!
ip route vrf 7e2a161a85fd454f8838bc4de9563333 0.0.0.0 0.0.0.0 10.237.208.1
ip route vrf 7e2a161a85fd454f8838bc4de9563333 10.180.64.5 255.255.255.255 10.180.5.11
ip route vrf 7e2a161a85fd454f8838bc4de9563333 10.180.64.6 255.255.255.255 10.180.6.9
end

Firewall Config

qa-de-1-rt12a#show zone-pair security
Zone-pair name ZP-FWAAS-7e2a161a85fd454f8838bc4de9563333-IN2OUT 1
    Source-Zone ZN-FWAAS-7e2a161a85fd454f8838bc4de9563333-in  Destination-Zone ZN-FWAAS-7e2a161a85fd454f8838bc4de9563333-out
    service-policy PM-FWAAS-7e2a161a85fd454f8838bc4de9563333-IN2OUT
Zone-pair name ZP-FWAAS-7e2a161a85fd454f8838bc4de9563333-OUT2IN 2
    Source-Zone ZN-FWAAS-7e2a161a85fd454f8838bc4de9563333-out  Destination-Zone ZN-FWAAS-7e2a161a85fd454f8838bc4de9563333-in
    service-policy PM-FWAAS-7e2a161a85fd454f8838bc4de9563333-OUT2IN

qa-de-1-rt12a#show policy-map type inspect
  Policy Map type inspect PM-FWAAS-7e2a161a85fd454f8838bc4de9563333-IN2OUT
    Class CM-FWAAS-7e2a161a85fd454f8838bc4de9563333-IN2OUT
      Inspect
    Class class-default
      Drop log

  Policy Map type inspect PM-FWAAS-7e2a161a85fd454f8838bc4de9563333-OUT2IN
    Class CM-FWAAS-7e2a161a85fd454f8838bc4de9563333-OUT2IN
      Inspect
    Class class-default
      Drop log

qa-de-1-rt12a#show class-map type inspect
 Class Map type inspect match-all CM-FWAAS-7e2a161a85fd454f8838bc4de9563333-OUT2IN (id 1)
   Match access-group name ACL-FWAAS-7e2a161a85fd454f8838bc4de9563333-OUT2IN

 Class Map type inspect match-all CM-FWAAS-7e2a161a85fd454f8838bc4de9563333-IN2OUT (id 2)
   Match access-group name ACL-FWAAS-7e2a161a85fd454f8838bc4de9563333-IN2OUT

qa-de-1-rt12a#show ip access-lists ACL-FWAAS-7e2a161a85fd454f8838bc4de9563333-IN2OUT
Extended IP access list ACL-FWAAS-7e2a161a85fd454f8838bc4de9563333-IN2OUT
    10 permit ip any any
qa-de-1-rt12a#show ip access-lists ACL-FWAAS-7e2a161a85fd454f8838bc4de9563333-OUT2IN
Extended IP access list ACL-FWAAS-7e2a161a85fd454f8838bc4de9563333-OUT2IN
    10 permit tcp any any eq www
    20 permit udp any any eq domain
    30 permit icmp any any echo

Analysis

ccloud@fip-agt02-rtr01-priv01:~$ sudo tcpdump -n -i ens192 -v icmp and host 10.237.208.58 and host 10.180.5.10
tcpdump: listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
15:30:11.751396 IP (tos 0x0, ttl 63, id 3585, offset 0, flags [DF], proto ICMP (1), length 56)
    10.180.5.10 > 10.237.208.58: ICMP echo request, id 1413, seq 59358, length 36
15:30:15.928660 IP (tos 0xa0, ttl 61, id 18451, offset 0, flags [DF], proto ICMP (1), length 56)
    10.237.208.58 > 10.180.5.10: ICMP echo request, id 22836, seq 53862, length 36
15:30:15.928675 IP (tos 0xa0, ttl 64, id 5882, offset 0, flags [none], proto ICMP (1), length 56)
    10.180.5.10 > 10.237.208.58: ICMP echo reply, id 22836, seq 53862, length 36
15:30:21.751421 IP (tos 0x0, ttl 63, id 5694, offset 0, flags [DF], proto ICMP (1), length 56)
    10.180.5.10 > 10.237.208.58: ICMP echo request, id 1413, seq 59411, length 36
15:30:25.928877 IP (tos 0xa0, ttl 61, id 20096, offset 0, flags [DF], proto ICMP (1), length 56)
    10.237.208.58 > 10.180.5.10: ICMP echo request, id 22836, seq 53915, length 36
15:30:25.928897 IP (tos 0xa0, ttl 64, id 6002, offset 0, flags [none], proto ICMP (1), length 56)
    10.180.5.10 > 10.237.208.58: ICMP echo reply, id 22836, seq 53915, length 36
15:30:31.751364 IP (tos 0x0, ttl 63, id 7488, offset 0, flags [DF], proto ICMP (1), length 56)
    10.180.5.10 > 10.237.208.58: ICMP echo request, id 1413, seq 59464, length 36
15:30:35.929047 IP (tos 0xa0, ttl 61, id 21158, offset 0, flags [DF], proto ICMP (1), length 56)
    10.237.208.58 > 10.180.5.10: ICMP echo request, id 22836, seq 53968, length 36
15:30:35.929064 IP (tos 0xa0, ttl 64, id 6741, offset 0, flags [none], proto ICMP (1), length 56)
    10.180.5.10 > 10.237.208.58: ICMP echo reply, id 22836, seq 53968, length 36
^C
9 packets captured
9 packets received by filter
0 packets dropped by kernel
ccloud@fip-agt02-rtr01-priv01:~$

As you can see, the remote machine's echo-requests (id 22836) are answered by us, yet the local machine's echo-requests (id 1413) never get a response. However, I can confirm that our echo-requests reach the target and that it replies to them. See the excerpt below from the target (remember, the target is also behind NAT):

ccloud@fip-agt03-rtr02-priv02:~$ sudo tcpdump -n -i ens192 -v icmp and host 10.180.12.13 and host 10.237.208.9
tcpdump: listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
16:02:15.930305 IP (tos 0x0, ttl 63, id 7018, offset 0, flags [DF], proto ICMP (1), length 56)
    10.180.12.13 > 10.237.208.9: ICMP echo request, id 22836, seq 64038, length 36
16:02:15.931041 IP (tos 0xa0, ttl 62, id 45855, offset 0, flags [none], proto ICMP (1), length 56)
    10.237.208.9 > 10.180.12.13: ICMP echo reply, id 22836, seq 64038, length 36
16:02:21.754066 IP (tos 0xa0, ttl 61, id 50419, offset 0, flags [DF], proto ICMP (1), length 56)
    10.237.208.9 > 10.180.12.13: ICMP echo request, id 1413, seq 4051, length 36
16:02:21.754083 IP (tos 0xa0, ttl 64, id 7417, offset 0, flags [none], proto ICMP (1), length 56)
    10.180.12.13 > 10.237.208.9: ICMP echo reply, id 1413, seq 4051, length 36
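Pairing requests with replies by eye gets tedious on longer captures. The following Python sketch does the pairing for `tcpdump -v` output of the shape shown above, using a few lines from the local machine's capture as sample input. The function name `unanswered_requests` is made up for illustration, and the regular expression only covers the IPv4 echo lines seen in this issue.

```python
import re

ECHO_RE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+): "
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def unanswered_requests(capture: str):
    """Return (src, dst, id, seq) for echo requests without a matching reply."""
    pending = {}  # dict keeps insertion order, so results follow capture order
    for m in ECHO_RE.finditer(capture):
        ident, seq = int(m["id"]), int(m["seq"])
        if m["kind"] == "request":
            pending[(m["src"], m["dst"], ident, seq)] = True
        else:
            # A reply travels in the opposite direction of its request.
            pending.pop((m["dst"], m["src"], ident, seq), None)
    return list(pending)


SAMPLE = """\
    10.180.5.10 > 10.237.208.58: ICMP echo request, id 1413, seq 59358, length 36
    10.237.208.58 > 10.180.5.10: ICMP echo request, id 22836, seq 53862, length 36
    10.180.5.10 > 10.237.208.58: ICMP echo reply, id 22836, seq 53862, length 36
    10.180.5.10 > 10.237.208.58: ICMP echo request, id 1413, seq 59411, length 36
"""

missing = unanswered_requests(SAMPLE)
for src, dst, ident, seq in missing:
    print(f"no reply: {src} -> {dst} id={ident} seq={seq}")
```

On this sample only the two requests with id 1413 remain unanswered, which matches what the capture on the local machine shows.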

On the router, I cannot see a session in the flow-db:

qa-de-1-rt12a# show platform hardware qfp active feature firewall datapath scb 10.180.5.10 any 10.237.208.58 any 1 all any detail

[s=session  i=imprecise channel c=control channel  d=data channel A/D=appfw action allow/deny]

qa-de-1-rt12a#

I can, however, see an entry in the flow-db for the request originated by the target machine, identified by id 22836:

qa-de-1-rt12a# show platform hardware qfp active feature firewall datapath scb 10.237.208.58 any 10.180.5.10 any 1 all any detail
Session ID:0x001F5F19 10.237.208.58 8 10.180.5.10 22836 proto 1 (7e2a161a85fd454f8838bc4de9563333:68:7e2a161a85fd454f8838bc4de9563333:68) (0x3:icmp)    [sc]
 pscb : 0x861e840,  key1_flags: 0x00000000
    bucket : 65735, prev 0x0, next 0x0
    fw_flags: 0x00000004 0x2041ab61,
     VRF1-rsrc-limit
     Root Protocol-ICMP NAT-applied Initiator Alert Proto-State:Established No-halfopen-list Active-cnt Session-db Max-session
    icmp_error count 0 ureachable arrived: no
    scb state: active, nxt_timeout: 1000, refcnt: 1
    ha nak cnt: 0,  rg: 1
    hostdb: 0x00000000, L7: 0x, stats: 0x54f6b800, child: 0x00000000
    l4blk0: 0x00000000 l4blk1: 0x00000054 l4blk2: 0x00000000 l4blk3: 0x00000003
    l4blk4: 0x00000000 l4blk5: 0800000054 l4blk6: 0x00000000 l4blk7: 0x00000003
    l4blk8: e14e l4blk9: 0x00000003
    root scb: 0x00000000 act_blk: 0x54f631e0
    ingress/egress intf: BD-VIF7005 (262008), BD-VIF6856 (262013)
    current time 2832693852012 create tstamp: 2830555363585 last access: 2832430393107
    nat_out_local_addr:port: 10.237.208.58:0
    nat_in_global_addr:port: 10.237.208.9:22836
    mpls table id: 0xffffffff
    syncookie fixup: 0x0,  halfopen linkage: 0x00000000 0x00000000
    cxsc_cft_fid: 0x00000000
    tw timer: 0x00000000 0x00000000 0x00000000 0x08dff171
    Packets/session: 25
    bucket 36351 flags 0x00000001 func 1 idx 14 wheel 0x5445dac0
        Timer within range
        num buckets 131072 cur 35632 mask 0x1ffff gran 160 flag 0x0 ticks 0
    SGT: 0 DGT: 0, NAT handles 0x4dfc80c0 0x00000000
    FlowDB in2out 0x00000000 alloc_epoch 0 out2in 0x00000000 alloc_epoch 0
    icmp_err_time 0, avc class stats 0x0, VPN id src 65535, dst 65535

qa-de-1-rt12a#

I also see the following log lines, which tell me that the reverse OUT2IN zone-pair policy is dropping the reply, even though the IN2OUT zone pair should have created the matching session:

Apr  7 15:40:21.753: %IOSXE-6-PLATFORM: R0/0: cpp_cp: QFP:0.0 Thread:123 TS:00000030098409284429 %FW-6-DROP_PKT: Dropping icmp pkt from BD-VIF7005 10.237.208.58:0 => 10.180.5.10:0(target:class)-(ZP-FWAAS-7e2a161a85fd454f8838bc4de9563333-OUT2IN:class-default) due to Policy drop:classify result with ip ident 26199
Apr  7 15:40:37.355: %IOSXE-6-PLATFORM: R0/0: cpp_cp: QFP:0.1 Thread:169 TS:00000030114011600959 %FW-6-LOG_SUMMARY: 6 icmp packets were dropped from BD-VIF7005 10.237.208.58:0 => 10.180.5.10:0 (target:class)-(ZP-FWAAS-7e2a161a85fd454f8838bc4de9563333-OUT2IN:class-default)
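For correlating drops across many log lines, the fields of the `%FW-6-DROP_PKT` message can be pulled apart with a regular expression. This is a sketch fitted to the one drop line quoted above; other IOS XE versions or drop reasons may format the message differently, and the group names are my own.

```python
import re

DROP_RE = re.compile(
    r"%FW-6-DROP_PKT: Dropping (?P<proto>\S+) pkt from (?P<intf>\S+) "
    r"(?P<src>[\d.]+):(?P<sport>\d+) => (?P<dst>[\d.]+):(?P<dport>\d+)\s*"
    r"\(target:class\)-\((?P<zone_pair>[^:]+):(?P<klass>[^)]+)\) "
    r"due to (?P<reason>.+?) with ip ident (?P<ip_ident>\d+)"
)

# The drop message from the log above, split only to keep source lines short.
LINE = (
    "Apr  7 15:40:21.753: %IOSXE-6-PLATFORM: R0/0: cpp_cp: QFP:0.0 Thread:123 "
    "TS:00000030098409284429 %FW-6-DROP_PKT: Dropping icmp pkt from BD-VIF7005 "
    "10.237.208.58:0 => 10.180.5.10:0(target:class)-"
    "(ZP-FWAAS-7e2a161a85fd454f8838bc4de9563333-OUT2IN:class-default) "
    "due to Policy drop:classify result with ip ident 26199"
)

match = DROP_RE.search(LINE)
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name:10} {value}")
```

The `zone_pair` and `klass` fields show that the packet fell through to `class-default` of the OUT2IN pair, i.e. it was classified as a new flow rather than matched to an existing session.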

For further debug I did a FIA trace with the following settings:

qa-de-1-rt12a#show ip access-lists  FIA-TRACE-FILTER
Extended IP access list FIA-TRACE-FILTER
    20 permit icmp host 10.180.5.10 host 10.237.208.58
    30 permit icmp host 10.237.208.9 host 10.237.208.58
    40 permit icmp host 10.237.208.58 host 10.237.208.9
    50 permit icmp host 10.237.208.58 host 10.180.5.10

Egress FIA (results in file _egressfia.txt)

qa-de-1-rt12a#debug platform condition ipv4 access-list FIA-TRACE-FILTER egress
qa-de-1-rt12a#debug platform condition start
qa-de-1-rt12a#debug platform condition start
qa-de-1-rt12a#debug platform packet-trace packet 16 fia-trace
qa-de-1-rt12a#debug platform condition stop
qa-de-1-rt12a#show platform packet-trace summary
Pkt   Input                     Output                    State  Reason
0     Port-ch1                  Port-ch1.EFP2048          FWD
1     Port-ch1                  BD-VI6856                 DROP   187 (FirewallPolicy)
2     Port-ch1                  BD-VI6856                 DROP   188 (FirewallL4)
3     Port-ch1                  Port-ch1.EFP2048          FWD
4     Port-ch1                  BD-VI6856                 DROP   187 (FirewallPolicy)
5     Port-ch1                  Port-ch1.EFP2084          FWD
6     Port-ch1                  Port-ch1.EFP2048          FWD
7     Port-ch1                  Port-ch1.EFP2048          FWD
8     Port-ch1                  BD-VI6856                 DROP   187 (FirewallPolicy)
9     Port-ch1                  Port-ch1.EFP2084          FWD
10    Port-ch1                  Port-ch1.EFP2048          FWD
11    Port-ch1                  Port-ch1.EFP2048          FWD
12    Port-ch1                  BD-VI6856                 DROP   187 (FirewallPolicy)
13    Port-ch1                  Port-ch1.EFP2084          FWD
14    Port-ch1                  Port-ch1.EFP2048          FWD
15    Port-ch1                  Port-ch1.EFP2048          FWD

Ingress FIA (results in file _ingressfia.txt)

qa-de-1-rt12a#clear platform condition all
qa-de-1-rt12a#debug platform condition ipv4 access-list FIA-TRACE-FILTER ingress
qa-de-1-rt12a#debug platform condition start
qa-de-1-rt12a#debug platform condition start
qa-de-1-rt12a#debug platform packet-trace packet 16 fia-trace
qa-de-1-rt12a#debug platform condition stop
qa-de-1-rt12a#show platform packet-trace sum
Pkt   Input                     Output                    State  Reason
0     Port-ch1                  Port-ch1.EFP2084          FWD
1     Port-ch1                  Port-ch1.EFP2048          FWD
2     Port-ch1                  Port-ch1.EFP2048          FWD
3     Port-ch1                  BD-VI6856                 DROP   187 (FirewallPolicy)
4     Port-ch1                  Port-ch1.EFP2084          FWD
5     Port-ch1                  Port-ch1.EFP2048          FWD

The odd thing about these traces is that the only packet of this flow I consistently see is the return packet, i.e. the ICMP echo-reply, which is policy-dropped. I never see the ICMP echo-request.
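To compare trace runs quickly, the `show platform packet-trace summary` tables can be tallied per output interface, state, and drop reason. The sketch below uses the ingress summary from above as sample input; the function name `tally` and the row pattern are assumptions based on the column layout shown in this issue.

```python
import re
from collections import Counter

ROW_RE = re.compile(
    r"^\s*(?P<pkt>\d+)\s+(?P<inp>\S+)\s+(?P<out>\S+)\s+(?P<state>\S+)"
    r"(?:\s+(?P<code>\d+)\s+\((?P<reason>[^)]+)\))?\s*$"
)


def tally(summary: str) -> Counter:
    """Count rows per (output interface, state, drop reason)."""
    counts = Counter()
    for line in summary.splitlines():
        m = ROW_RE.match(line)
        if m:  # the header row has no leading packet number and is skipped
            counts[(m["out"], m["state"], m["reason"])] += 1
    return counts


INGRESS_SUMMARY = """\
Pkt   Input                     Output                    State  Reason
0     Port-ch1                  Port-ch1.EFP2084          FWD
1     Port-ch1                  Port-ch1.EFP2048          FWD
2     Port-ch1                  Port-ch1.EFP2048          FWD
3     Port-ch1                  BD-VI6856                 DROP   187 (FirewallPolicy)
4     Port-ch1                  Port-ch1.EFP2084          FWD
5     Port-ch1                  Port-ch1.EFP2048          FWD
"""

counts = tally(INGRESS_SUMMARY)
for (out, state, reason), n in counts.items():
    print(f"{n:3}  {out:20} {state:5} {reason or ''}")
```

Running the same tally over the failing and the working traces makes the difference visible at a glance: only the failing runs contain rows with state DROP.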

Using a fresh ICMP probe ID

If I restart the blackbox_exporter, which triggers the use of a fresh ICMP ID, I get replies:

16:21:31.752319 IP (tos 0x0, ttl 63, id 5218, offset 0, flags [DF], proto ICMP (1), length 56)
    10.180.5.10 > 10.237.208.58: ICMP echo request, id 1413, seq 10146, length 36

            ^^^^
    last one with id 1413

16:21:35.929745 IP (tos 0xa0, ttl 61, id 20749, offset 0, flags [DF], proto ICMP (1), length 56)
    10.237.208.58 > 10.180.5.10: ICMP echo request, id 22836, seq 4650, length 36
16:21:35.929761 IP (tos 0xa0, ttl 64, id 58526, offset 0, flags [none], proto ICMP (1), length 56)
    10.180.5.10 > 10.237.208.58: ICMP echo reply, id 22836, seq 4650, length 36
16:21:41.752244 IP (tos 0x0, ttl 63, id 12263, offset 0, flags [DF], proto ICMP (1), length 56)
    10.180.5.10 > 10.237.208.58: ICMP echo request, id 62171, seq 52436, length 36
16:21:41.753327 IP (tos 0xa0, ttl 62, id 11261, offset 0, flags [none], proto ICMP (1), length 56)
    10.237.208.58 > 10.180.5.10: ICMP echo reply, id 62171, seq 52436, length 36
                ^^^^
    new one with id 62171 being responded to

And I also see the session in the flow-db:

qa-de-1-rt12a#$tive feature firewall datapath scb 10.180.5.10 any 10.237.208.58 any 1 all any detail
[s=session  i=imprecise channel c=control channel  d=data channel A/D=appfw action allow/deny]
Session ID:0x00222EFF 10.180.5.10 8 10.237.208.58 62171 proto 1 (7e2a161a85fd454f8838bc4de9563333:68:7e2a161a85fd454f8838bc4de9563333:68) (0x3:icmp)    [sc]
 pscb : 0x85d9870,  key1_flags: 0x00000000
    bucket : 6956, prev 0x0, next 0x0
    fw_flags: 0x00000004 0x2043a961,
     VRF1-rsrc-limit
     Root Protocol-ICMP NAT-applied Alert Proto-State:Established No-halfopen-list Active-cnt NAT-applied Session-db Max-session
    icmp_error count 0 ureachable arrived: no
    scb state: active, nxt_timeout: 1000, refcnt: 1
    ha nak cnt: 0,  rg: 1
    hostdb: 0x00000000, L7: 0x, stats: 0x54f6a8c0, child: 0x00000000
    l4blk0: 0x00000000 l4blk1: 0x00000038 l4blk2: 0x00000000 l4blk3: 0x00000002
    l4blk4: 0x00000000 l4blk5: 0800000038 l4blk6: 0x00000000 l4blk7: 0x00000002
    l4blk8: cddd l4blk9: 0x00000003
    root scb: 0x00000000 act_blk: 0x54f63000
    ingress/egress intf: BD-VIF6856 (262029), BD-VIF7005 (261992)
    current time 3059356041997 create tstamp: 3057985156499 last access: 3058923061307
    nat_out_local_addr:port: 10.237.208.58:0
    nat_in_global_addr:port: 10.237.208.9:0
    mpls table id: 0xffffffff
    syncookie fixup: 0x0,  halfopen linkage: 0x00000000 0x00000000
    cxsc_cft_fid: 0x00000000
    tw timer: 0x53383260 0x5377f4a0 0x00000000 0x03db9169
    Packets/session: 25
    bucket 15801 flags 0x00000001 func 1 idx 13 wheel 0x543dda90
        Timer within range
        num buckets 131072 cur 15262 mask 0x1ffff gran 160 flag 0x0 ticks 0
    SGT: 0 DGT: 0, NAT handles 0x4e12d3c0 0x00000000
    FlowDB in2out 0x00000000 alloc_epoch 0 out2in 0x00000000 alloc_epoch 0
    icmp_err_time 0, avc class stats 0x0, VPN id src 65535, dst 65535

FIA Traces of the working example below:

Ingress (_ingress_fiaworking.txt)

qa-de-1-rt12a#show platform packet-trace summary
Pkt   Input                     Output                    State  Reason
0     Port-ch1                  Port-ch1.EFP2048          FWD
1     Port-ch1                  Port-ch1.EFP2084          FWD
2     Port-ch1                  Port-ch1.EFP2084          FWD
3     Port-ch1                  Port-ch1.EFP2048          FWD
4     Port-ch1                  Port-ch1.EFP2048          FWD
5     Port-ch1                  Port-ch1.EFP2084          FWD
6     Port-ch1                  Port-ch1.EFP2084          FWD
7     Port-ch1                  Port-ch1.EFP2048          FWD

Egress (_egress_fiaworking.txt)

qa-de-1-rt12a#show pla packet-trace summary
Pkt   Input                     Output                    State  Reason
0     BD-VI6856                 Port-ch1.EFP2048          FWD
1     BD-VI7005                 Port-ch1.EFP2084          FWD
2     BD-VI7005                 Port-ch1.EFP2084          FWD
3     BD-VI6856                 Port-ch1.EFP2048          FWD
4     BD-VI6856                 Port-ch1.EFP2048          FWD
5     BD-VI7005                 Port-ch1.EFP2084          FWD
6     BD-VI7005                 Port-ch1.EFP2084          FWD
7     BD-VI6856                 Port-ch1.EFP2048          FWD

swagner-de commented 3 years ago

I am currently unable to reproduce this problem in our QA environment and hence have put it on hold.