sbezverk / gobmp

go based implementation of BGP's BMP protocol
Apache License 2.0
85 stars 28 forks source link

Problem with IPV6 route monitoring and kafka publication #173

Closed vkotronis closed 3 years ago

vkotronis commented 3 years ago

Hello, first of all thank you for this great contribution!

I am using the latest version of the BMP collector. I have connected it with a gobgp software router for route mirroring and with a Kafka broker for BMP message publishing and everything seems to work fine for IPv4. I am monitoring both AF topics gobmp.parsed.unicast_prefix_v4|6. However, I do not see anything on v6 despite the fact that routes are properly injected in the router and reach the BMP collector. I.e., I see Kafka messages of the form:

...  | % Message on gobmp.parsed.unicast_prefix_v4[0]@0:
...  | {"action":"add","router_hash":"a84c2cf187498ba4cff7b28bd6e928ea","router_ip":"10.152.183.12","base_attrs":{"base_attr_hash":"c06877b920b5e1b6485edf518a457fee","origin":"incomplete","as_path":[65001,65003],"as_path_count":2,"nexthop":"10.152.183.11","is_atomic_agg":false},"peer_hash":"ebbe4bfa2301281a9ef5c720528b1be6","peer_ip":"10.152.183.11","peer_asn":65001,"timestamp":"Jul  1 09:37:35.000000","prefix":"192.168.0.0","prefix_len":16,"is_ipv4":true,"origin_as":65003,"nexthop":"10.152.183.11","is_nexthop_ipv4":true,"is_prepolicy":false,"is_adj_rib_in":false}

when injecting v4 routes, but nothing on IPv6. Is there a way to debug what is happening? Can it be related to the following line https://github.com/sbezverk/gobmp/blob/master/pkg/message/route-monitor.go#L59 where the default topic is v4 only and it is not examined whether v6 should be set there? Any help/feedback is appreciated.

Btw here is my bmp-collector docker service configuration:

 ...
 bmp-collector:
    image: sbezverk/gobmp:latest
    container_name: bmp-collector
    restart: always
    networks:
      backend
    expose:
      - 5000
      - 56767
    entrypoint: ["/gobmp", "--source-port=5000", "--intercept=false", "--split-af=true", "--kafka-server=broker:9092"]
 ...
vkotronis commented 3 years ago

update: running in debug mode with everything dumped on console:

entrypoint: ["/gobmp", "-v=6", "--dump=console", "--source-port=5000", "--intercept=false", "--split-af=true", "--kafka-server=broker:9092"]

I see the following for a v4 prefix (in this case: 192.168.0.0/16):

bmp-collector              | I0701 10:21:54.553251       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x62, 0x00 ]
bmp-collector              | I0701 10:21:54.553464       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x62, 0x00 ]
bmp-collector              | I0701 10:21:54.553549       1 per-peer-header.go:73] BMP Per Peer Header Raw: [ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x00, 0xFD, 0xE9, 0x0A, 0x98, 0xB7, 0x0B, 0x60, 0xDD, 0x97, 0x42, 0x00, 0x00, 0x00, 0x00 ]
bmp-collector              | I0701 10:21:54.553795       1 route-monitor.go:19] BMP Route Monitor Message Raw: [ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x32, 0x02, 0x00, 0x00, 0x00, 0x18, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x40, 0x03, 0x04, 0x0A, 0x98, 0xB7, 0x0B, 0x10, 0xC0, 0xA8 ] length: 50
bmp-collector              | I0701 10:21:54.553893       1 bgp-update.go:113] BGPUpdate Raw: [ 0x00, 0x00, 0x00, 0x18, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x40, 0x03, 0x04, 0x0A, 0x98, 0xB7, 0x0B, 0x10, 0xC0, 0xA8 ]
bmp-collector              | I0701 10:21:54.553956       1 route.go:29] Routes Raw: []
bmp-collector              | I0701 10:21:54.554010       1 bgp-path-attribute.go:21] BGPPathAttributes Raw: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x40, 0x03, 0x04, 0x0A, 0x98, 0xB7, 0x0B ]
bmp-collector              | I0701 10:21:54.554095       1 bgp-base-attributes.go:50] UnmarshalBGPBaseAttributes RAW: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x40, 0x03, 0x04, 0x0A, 0x98, 0xB7, 0x0B ]
bmp-collector              | I0701 10:21:54.554211       1 route.go:29] Routes Raw: [ 0x10, 0xC0, 0xA8 ]
bmp-collector              | gobmp: 10:21:54.554443 {MsgType:74 MsgHash:a84c2cf187498ba4cff7b28bd6e928ea Msg:{"action":"add","router_hash":"a84c2cf187498ba4cff7b28bd6e928ea","router_ip":"10.152.183.12","base_attrs":{"base_attr_hash":"c06877b920b5e1b6485edf518a457fee","origin":"incomplete","as_path":[65001,65003],"as_path_count":2,"nexthop":"10.152.183.11","is_atomic_agg":false},"peer_hash":"ebbe4bfa2301281a9ef5c720528b1be6","peer_ip":"10.152.183.11","peer_asn":65001,"timestamp":"Jul  1 10:21:54.000000","prefix":"192.168.0.0","prefix_len":16,"is_ipv4":true,"origin_as":65003,"nexthop":"10.152.183.11","is_nexthop_ipv4":true,"is_prepolicy":false,"is_adj_rib_in":false}}

while I see the following for a v6 prefix (in this case: 2001::/16):


bmp-collector              | I0701 10:23:27.559323       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x73, 0x00 ]
bmp-collector              | I0701 10:23:27.559443       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x73, 0x00 ]
bmp-collector              | I0701 10:23:27.559488       1 per-peer-header.go:73] BMP Per Peer Header Raw: [ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x00, 0xFD, 0xE9, 0x0A, 0x98, 0xB7, 0x0B, 0x60, 0xDD, 0x97, 0x9F, 0x00, 0x00, 0x00, 0x00 ]
bmp-collector              | I0701 10:23:27.559528       1 route-monitor.go:19] BMP Route Monitor Message Raw: [ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x43, 0x02, 0x00, 0x00, 0x00, 0x2C, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ] length: 67
bmp-collector              | I0701 10:23:27.559595       1 bgp-update.go:113] BGPUpdate Raw: [ 0x00, 0x00, 0x00, 0x2C, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0701 10:23:27.559615       1 route.go:29] Routes Raw: []
bmp-collector              | I0701 10:23:27.559646       1 bgp-path-attribute.go:21] BGPPathAttributes Raw: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0701 10:23:27.559694       1 bgp-base-attributes.go:50] UnmarshalBGPBaseAttributes RAW: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0701 10:23:27.559745       1 route.go:29] Routes Raw: []

So the problem seems to start at route.go:29: for v4 I see the hex route representation while for v6 the list is empty. Then, in v4 I see the full message to be published to kafka while on v6 nothing. I am not sure what is going wrong here since both v4 and v6 routes reach the container; there must be some point at the code where the v6 chain to kafka beaks. Till now I have managed to spot only this potential problem here: https://github.com/sbezverk/gobmp/blob/master/pkg/message/route-monitor.go#L59 (no v4/v6 differentiation) but maybe it is sth else.

sbezverk commented 3 years ago

Hello, thank you for reporting the issue, I will look into it and let you know. Now with regards to your comment: Till now I have managed to spot only this potential problem here: https://github.com/sbezverk/gobmp/blob/master/pkg/message/route-monitor.go#L59 (no v4/v6 differentiation) but maybe it is sth else. The place in the code you pointed out, process old NLRI information which is only ipv4, IPv6 comes in as MP_BGP and processed here: https://github.com/sbezverk/gobmp/blob/master/pkg/message/process-mp-update.go#L50

BTW, thank you for providing hex dump of bgp update, I will use it to debug this issue.

vkotronis commented 3 years ago

Great, thanks for your immediate response! Makes sense about old ipv4-only NLRI. Let me know please in case you need any more debug info.

sbezverk commented 3 years ago

@vkotronis I am puzzled a bit as I used your hex dump of BGP Update and plugged it into all unit tests and it seems the processing is happening correctly. Please check the log for any errors, just do grep for E0701. Also I have build debug image please try to run and collect logs. The image is sbezverk/gobmp:issue_173

vkotronis commented 3 years ago

OK will check with the debug image and get back to the thread. Have you also verified that the Kafka publisher correctly sends the IPv6 update? Because this is what is missing from the initial processing output (working only for v4).

vkotronis commented 3 years ago

Debug image, v4 192.168.0.0/16 route addition:

bmp-collector              | I0701 13:12:15.710508       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x62, 0x00 ]
bmp-collector              | I0701 13:12:15.710569       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x62, 0x00 ]
bmp-collector              | I0701 13:12:15.710591       1 per-peer-header.go:73] BMP Per Peer Header Raw: [ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x00, 0xFD, 0xE9, 0x0A, 0x98, 0xB7, 0x0B, 0x60, 0xDD, 0xBF, 0x2F, 0x00, 0x00, 0x00, 0x00 ]
bmp-collector              | I0701 13:12:15.710626       1 route-monitor.go:19] BMP Route Monitor Message Raw: [ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x32, 0x02, 0x00, 0x00, 0x00, 0x18, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x40, 0x03, 0x04, 0x0A, 0x98, 0xB7, 0x0B, 0x10, 0xC0, 0xA8 ] length: 50
bmp-collector              | I0701 13:12:15.710648       1 bgp-update.go:113] BGPUpdate Raw: [ 0x00, 0x00, 0x00, 0x18, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x40, 0x03, 0x04, 0x0A, 0x98, 0xB7, 0x0B, 0x10, 0xC0, 0xA8 ]
bmp-collector              | I0701 13:12:15.710656       1 route.go:29] Routes Raw: []
bmp-collector              | I0701 13:12:15.710669       1 bgp-path-attribute.go:21] BGPPathAttributes Raw: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x40, 0x03, 0x04, 0x0A, 0x98, 0xB7, 0x0B ]
bmp-collector              | I0701 13:12:15.710694       1 bgp-base-attributes.go:50] UnmarshalBGPBaseAttributes RAW: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x40, 0x03, 0x04, 0x0A, 0x98, 0xB7, 0x0B ]
bmp-collector              | gobmp: 13:12:15.711244 {MsgType:74 MsgHash:a84c2cf187498ba4cff7b28bd6e928ea Msg:{"action":"add","router_hash":"a84c2cf187498ba4cff7b28bd6e928ea","router_ip":"10.152.183.12","base_attrs":{"base_attr_hash":"c06877b920b5e1b6485edf518a457fee","origin":"incomplete","as_path":[65001,65003],"as_path_count":2,"nexthop":"10.152.183.11","is_atomic_agg":false},"peer_hash":"ebbe4bfa2301281a9ef5c720528b1be6","peer_ip":"10.152.183.11","peer_asn":65001,"timestamp":"Jul  1 13:12:15.000000","prefix":"192.168.0.0","prefix_len":16,"is_ipv4":true,"origin_as":65003,"nexthop":"10.152.183.11","is_nexthop_ipv4":true,"is_prepolicy":false,"is_adj_rib_in":false}}
bmp-collector              | I0701 13:12:15.710913       1 route.go:29] Routes Raw: [ 0x10, 0xC0, 0xA8 ]

Debug image, v6 2001::/16 route addition:

bmp-collector              | I0701 13:12:37.686725       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x73, 0x00 ]
bmp-collector              | I0701 13:12:37.686797       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x73, 0x00 ]
bmp-collector              | I0701 13:12:37.686826       1 per-peer-header.go:73] BMP Per Peer Header Raw: [ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x00, 0xFD, 0xE9, 0x0A, 0x98, 0xB7, 0x0B, 0x60, 0xDD, 0xBF, 0x45, 0x00, 0x00, 0x00, 0x00 ]
bmp-collector              | I0701 13:12:37.686848       1 route-monitor.go:19] BMP Route Monitor Message Raw: [ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x43, 0x02, 0x00, 0x00, 0x00, 0x2C, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ] length: 67
bmp-collector              | I0701 13:12:37.686861       1 bgp-update.go:113] BGPUpdate Raw: [ 0x00, 0x00, 0x00, 0x2C, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0701 13:12:37.686867       1 route.go:29] Routes Raw: []
bmp-collector              | I0701 13:12:37.686878       1 bgp-path-attribute.go:21] BGPPathAttributes Raw: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0701 13:12:37.686891       1 bgp-base-attributes.go:50] UnmarshalBGPBaseAttributes RAW: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector

The only E0701 errors I see are the following:

bmp-collector              | E0701 13:11:51.167909       1 gobmpsrv.go:89] fail to read from client 192.168.144.12:45427 with error: EOF

where the 192.168.144.12 IP address corresponds to my kafka consumer (go client based on github.com/confluentinc/confluent-kafka-go). However how can only v6 prefixes be affected? I am subscribing to both needed topics both in the broker's side:

  broker:
    image: wurstmeister/kafka:1.1.0
    restart: always
    depends_on:
      - zookeeper
    networks:
      - backend
    expose:
      - 9092
    environment:
      ...
      KAFKA_CREATE_TOPICS: "gobmp.parsed.unicast_prefix_v4:1:1:compact,gobmp.parsed.unicast_prefix_v6:1:1:compact"

and on the consumer's side:

err = consumer.SubscribeTopics([]string{"gobmp.parsed.unicast_prefix_v4", "gobmp.parsed.unicast_prefix_v6"}, nil)

Would it be possible to try this on the bmp-collector's side and check if it actually publishes to kafka? It seems that the problem is somewhere in the bmp collector's post-processing --> Kafka publication chain.

sbezverk commented 3 years ago

@vkotronis the image I built has extra output confirming that it publishes message to kafka, please check the log with new image. Also you do not need to create topics, they will be created on gobmp start.

sbezverk commented 3 years ago

@vkotronis sorry I missed the output I was looking for. So the publishing func does not get called for ipv6. I will dig further.

vkotronis commented 3 years ago

Correct that seems to be the main problem. I also did not find anything related to IPv6 kafka publication in the logs (accounts only for v4 messages). The weird thing is the absence of errors, meaning that it silently fails at some point of the logical pub chain.

sbezverk commented 3 years ago

@vkotronis I brought up the same build of gobmp and I am getting ipv6 messages printed.

I0701 10:01:23.258804  103948 route-monitor.go:96] message of type: 76 json: {"action":"add","router_hash":"d8dcf945784299c002acb9c2af1e803e","router_ip":"2001:128:128:128::1","base_attrs":{"base_attr_hash":"d40dcdbdb9d2f0ad6fef1efda93b5c49","origin":"incomplete","local_pref":100,"is_atomic_agg":false},"peer_hash":"df70d98d66bc1efcde5aa818712fae2d","peer_ip":"2001:129:129:129::1","peer_asn":65000,"timestamp":"Jul  1 10:01:14.000907","prefix":"2:2::","prefix_len":48,"is_ipv4":false,"nexthop":"2001:129:129:129::1","is_nexthop_ipv4":false,"is_prepolicy":false,"is_adj_rib_in":false}

Could you try gobmp --v=5 --dump=console to see if you still cannot see ipv6 messages.

I do not have kafka handy to try, but that would be my next step if you see ipv6 messages printed to the console.

vkotronis commented 3 years ago

Yes just tried but nothing is printed on console for ipv6, only for ipv4. I checked with your prefix too 2:2::/48. The difference in our debug setup is I think that you are using also ipv6 router IPs. Can this be the issue? I have configured the router that talks to the BMP collector to also carry the v6 AF, while its IP is actually v4. Is there maybe a condition somewhere in the code that requires that the AF of the carried prefix and the router IP should be the same for processing to continue? As far as I understand, processing works fine until it is time to publish this on the collector's side.

sbezverk commented 3 years ago

Here is where the branching for publishing happens: https://github.com/sbezverk/gobmp/blob/master/pkg/message/process-mp-update.go#L50

m.IsIPv4 is set here:

https://github.com/sbezverk/gobmp/blob/master/pkg/message/mp-unicast.go#L62

As long as AF id is 2 then update will be treated as IPv6, I do not see at least so far how your router by just using ipv4 address could cause any issues.

Here is the call to check if it is IPv6 or not. https://github.com/sbezverk/gobmp/blob/173aa520c4f91629d478db5472ae9c3936d53b9c/pkg/bgp/mp-reach-nlri.go#L37

What kind of router you use? Can you share config? At this point I cannot reproduce the issue, so having these two bits of info could help to move forward.

vkotronis commented 3 years ago

I am using a gobgp software router. https://github.com/osrg/gobgp Configuration is simple:

[global.config]
  as = 65001
  router-id = "..."

[[neighbors]]
  [neighbors.config]
    neighbor-address = "..."
    peer-as = 65002
  [neighbors.ebgp-multihop.config]
    enabled = true
    multihop-ttl = 2
  [[neighbors.afi-safis]]
    [neighbors.afi-safis.config]
    afi-safi-name = "ipv4-unicast"
  [[neighbors.afi-safis]]
    [neighbors.afi-safis.config]
    afi-safi-name = "ipv6-unicast"

[[bmp-servers]]
  [bmp-servers.config]
    address = "bmp-collector"
    port=5000
vkotronis commented 3 years ago

So to replicate (of course if you have time, I understand this is getting more complex), you can connect a gobgp soft router to the gobmp collector and advertise sth to the router, which will eventually be mirrored to the collector. You can use only v4 addresses to be sure. With IPv4 no problem, IPv6 prefix announcements reach and are processed in the collector, but I cannot get the publishing step to work for some reason, despite the fact that the processing should have worked up to that point.

sbezverk commented 3 years ago

I brought up gobgp instance and gobmp correctly process messages it receives, see below:

I0701 11:31:00.374272  103948 route-monitor.go:96] message of type: 76 json: {"action":"add","router_hash":"346d9a054be6acba75af367f7e95c9ba","router_ip":"192.168.20.201","base_attrs":{"base_attr_hash":"e1d42f05a36c0c367ebea8e560962cb1","origin":"incomplete","as_path":[65000],"as_path_count":1,"is_atomic_agg":false},"peer_hash":"7bc755d93ef02aab3e1716cc26e48211","peer_ip":"192.168.20.129","peer_asn":65000,"timestamp":"Jul  1 15:31:00.000000","prefix":"2:2::","prefix_len":48,"is_ipv4":false,"origin_as":65000,"nexthop":"2001:470:b16e:20::129,fe80::5054:ff:fe17:4c85","is_nexthop_ipv4":false,"is_prepolicy":false,"is_adj_rib_in":false}

Unfortunately I cannot strip all ipv6 address in my test bed. The session above is just ipv4 session with ipv4/ipv6 unicast address families enabled This is as far as I can go. Alternatively I propose you, since you can consistently reproduce it, to add bunch of glog.Infof() to see the flow and to find out where the flow breaks.

vkotronis commented 3 years ago

OK I will check in case there is sth problematic with my local setup. Since you correctly see these messages it probably is not a bmp collector issue. I will leave the issue open for a few days if you don't mind to eradicate all possibilities and will close afterwards. Many thanks @sbezverk !

sbezverk commented 3 years ago

@vkotronis sure thing, we can keep it opened until all means to get RC are exhausted :) good luck!

vkotronis commented 3 years ago

short update by checking offline: with some glog.Infof() messages, it seems that the IPv6 route falls in the default case https://github.com/sbezverk/gobmp/blob/master/pkg/message/route-monitor.go#L56 while according to our discussion it should be processed by processMPUpdate which does not happen here. In the default case, as discussed, the default message is always v4, so discarded. Could you maybe send me the exact configuration for your gobgp with ipv4/ipv6 unicast address families enabled? I just do not understand why the route would fall to the old NLRI category, instead of being processed through the correct pipeline. Thanks in advance!

vkotronis commented 3 years ago

To follow-up, one possible explanation is that the attrType is not correctly set in the update (not a fault of the collector, but the generator). Not sure if you have a different configuration thus your gobgp instance produces a correct MP NLRI. Very strange case!

sbezverk commented 3 years ago

Here you go:

  as = 65001
  router-id = "192.168.20.201"

[[neighbors]]
  [neighbors.config]
    neighbor-address = "192.168.20.129"
    peer-as = 65000
  [neighbors.ebgp-multihop.config]
    enabled = true
    multihop-ttl = 2
  [[neighbors.afi-safis]]
    [neighbors.afi-safis.config]
    afi-safi-name = "ipv4-unicast"
  [[neighbors.afi-safis]]
    [neighbors.afi-safis.config]
    afi-safi-name = "ipv6-unicast"

[[bmp-servers]]
  [bmp-servers.config]
    address = "192.168.20.201"
    port=5000

I use latest gobgp.

The other end of bgp session is cisco xrv9k virtual router.

vkotronis commented 3 years ago

Perfect seems we are in sync there. And what test command do you use for gobgp to produce the route? I assume gobgp global rib add 2001::/48 -a ipv6 or similar? Again thanks for the timely responses, it is weird that the MP NLRI attribute does not reach the collector properly (and is thus processed by the old NLRI chain). Any ideas on the latter? Could it be sth in the message that is not formatted properly?

sbezverk commented 3 years ago

@vkotronis I do not advertise prefix from gobgp, I inject from the cisco side, it is easier. You need to capture hex dump of the update, where ipv6 prefix goes into old nlri and share it with me, I will run it by our bgp folks just to get their input. Frankly I have no clue why ipv6 would end up in nlri instead of mp reach.

vkotronis commented 3 years ago

I think I found what is going on. The hex message is the same I sen you in the original report:

BGPUpdate Raw: [ 0x00, 0x00, 0x00, 0x2C, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]

This is injected from another gobgp router into a gobgp-based RS that talks with goBMP. The message is valid, and I see that you have included it in a test in branch issue_173 https://github.com/sbezverk/gobmp/compare/issue_173#diff-63afd0cfe9f841843b1b9c1a6c3314095102bad34f9d5b8334d35a071257ea66R18

However, let's examine the path attrs of that expected message. In the test we have 3 of them:

PathAttributes: []PathAttribute{
                    {
                        AttributeTypeFlags: 64,
                        AttributeType:      1,
                        AttributeLength:    1,
                        Attribute:          []byte{2},
                    },
                    {
                        AttributeTypeFlags: 64,
                        AttributeType:      2,
                        AttributeLength:    10,
                        Attribute:          []byte{2, 2, 0, 0, 253, 233, 0, 0, 253, 235},
                    },
                    {
                        AttributeTypeFlags: 128,
                        AttributeType:      14,
                        AttributeLength:    24,
                        Attribute:          []byte{0, 2, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 10, 152, 183, 11, 0, 16, 32, 1},
                    },
                }

However, the route_monitor.go seems to make a decision on the NLRI parser to use based only on the first attribute: https://github.com/sbezverk/gobmp/blob/master/pkg/message/route-monitor.go#L39

I see in the switch statement that you compare then that type with 14 (MP_REACH_NLRI), 15 (`MP_UNREACH_NLRI) and default (old NLRI processing). Maybe the problem is that the attribute type switch statement uses only the first attribute for determining the processing chain? In this particular case the MP REACH NLRI differentiator (14) is in the third place, not the first one. What should be the correct behavior regarding processing in that case (multiple path attributes), so that this is taken into account? How can the correct path attribute be determined as execution pipeline differentiator?

vkotronis commented 3 years ago

So in that case according also to https://www.iana.org/assignments/bgp-parameters/bgp-parameters.xhtml#bgp-parameters-2 the actual update carries a list of 3 attributes: 1 - ORIGIN 2 - AS_PATH 14 - MP_REACH_NLRI I assume the differentiation then needs to take place based on the MP_REACH_NLRI one, but instead falls in the default case of 1 (so it is not further processed as an MP_REACH_NLRI message and thus it never gets published to kafka or console).

sbezverk commented 3 years ago

@vkotronis Please try this image docker.io/sbezverk/gobmp:issue_173 I added a method to bgp update to search for nlri type in the path attribute and return it if found, otherwise it will fall back to 0 which is old bgp4 nlri.

vkotronis commented 3 years ago

Great! I tried the new image with an injected pseudo 2001::/16 route. I get the following panic error

bmp-collector              | I0707 16:53:16.200425       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x73, 0x00 ]
bmp-collector              | I0707 16:53:16.200457       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x73, 0x00 ]
bmp-collector              | I0707 16:53:16.200465       1 per-peer-header.go:73] BMP Per Peer Header Raw: [ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x00, 0xFD, 0xE9, 0x0A, 0x98, 0xB7, 0x0B, 0x60, 0xE5, 0xDB, 0xFC, 0x00, 0x00, 0x00, 0x00 ]
bmp-collector              | I0707 16:53:16.200476       1 route-monitor.go:19] BMP Route Monitor Message Raw: [ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x43, 0x02, 0x00, 0x00, 0x00, 0x2C, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ] length: 67
bmp-collector              | I0707 16:53:16.200485       1 bgp-update.go:136] BGPUpdate Raw: [ 0x00, 0x00, 0x00, 0x2C, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0707 16:53:16.200489       1 route.go:29] Routes Raw: []
bmp-collector              | I0707 16:53:16.200498       1 bgp-path-attribute.go:21] BGPPathAttributes Raw: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0707 16:53:16.200508       1 bgp-base-attributes.go:50] UnmarshalBGPBaseAttributes RAW: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0707 16:53:16.200524       1 route.go:29] Routes Raw: []
bmp-collector              | I0707 16:53:16.200536       1 mp-reach-nlri.go:179] MPReachNLRI Raw: [ 0x02 ]
bmp-collector              | panic: runtime error: slice bounds out of range [:2] with capacity 1
bmp-collector              | 
bmp-collector              | goroutine 24 [running]:
bmp-collector              | github.com/sbezverk/gobmp/pkg/bgp.UnmarshalMPReachNLRI(0xc00015b040, 0x1, 0x1, 0xc000053000, 0x73, 0x6, 0x0, 0x0)
bmp-collector              |    /home/sbezverk/projects/go/workspace/gobmp/pkg/bgp/mp-reach-nlri.go:188 +0x485
bmp-collector              | github.com/sbezverk/gobmp/pkg/message.(*producer).produceRouteMonitorMessage(0xc000020d00, 0xc0000dce70, 0x80e4a0, 0xc00000e478)
bmp-collector              |    /home/sbezverk/projects/go/workspace/gobmp/pkg/message/route-monitor.go:44 +0x775
bmp-collector              | github.com/sbezverk/gobmp/pkg/message.(*producer).producingWorker(0xc000020d00, 0xc0000dce70, 0x80e4a0, 0xc00000e478)
bmp-collector              |    /home/sbezverk/projects/go/workspace/gobmp/pkg/message/producer.go:48 +0x195
bmp-collector              | created by github.com/sbezverk/gobmp/pkg/message.(*producer).Producer
bmp-collector              |    /home/sbezverk/projects/go/workspace/gobmp/pkg/message/producer.go:33 +0x74

Somehow the slicing access is not done correctly. However, we are in a good track since at least the MP Reach toolchain seems to get accessed.

sbezverk commented 3 years ago

@vkotronis my bad, sorry, please try one more time the same image tag

vkotronis commented 3 years ago

@sbezverk I tried, works like a charm :) . Will you integrate it in-code via PR? Many thanks for the help!

bmp-collector              | I0708 06:53:15.723652       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x73, 0x00 ]
bmp-collector              | I0708 06:53:15.723756       1 common-header.go:25] BMP CommonHeader Raw: [ 0x03, 0x00, 0x00, 0x00, 0x73, 0x00 ]
bmp-collector              | I0708 06:53:15.723787       1 per-peer-header.go:73] BMP Per Peer Header Raw: [ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x00, 0xFD, 0xE9, 0x0A, 0x98, 0xB7, 0x0B, 0x60, 0xE6, 0xA0, 0xDB, 0x00, 0x00, 0x00, 0x00 ]
bmp-collector              | I0708 06:53:15.723825       1 route-monitor.go:19] BMP Route Monitor Message Raw: [ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x43, 0x02, 0x00, 0x00, 0x00, 0x2C, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ] length: 67
bmp-collector              | I0708 06:53:15.723860       1 bgp-update.go:136] BGPUpdate Raw: [ 0x00, 0x00, 0x00, 0x2C, 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0708 06:53:15.723878       1 route.go:29] Routes Raw: []
bmp-collector              | I0708 06:53:15.723912       1 bgp-path-attribute.go:21] BGPPathAttributes Raw: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0708 06:53:15.723951       1 bgp-base-attributes.go:50] UnmarshalBGPBaseAttributes RAW: [ 0x40, 0x01, 0x01, 0x02, 0x40, 0x02, 0x0A, 0x02, 0x02, 0x00, 0x00, 0xFD, 0xE9, 0x00, 0x00, 0xFD, 0xEB, 0x80, 0x0E, 0x18, 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0708 06:53:15.724009       1 route.go:29] Routes Raw: []
bmp-collector              | I0708 06:53:15.724119       1 mp-reach-nlri.go:179] MPReachNLRI Raw: [ 0x00, 0x02, 0x01, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x0A, 0x98, 0xB7, 0x0B, 0x00, 0x10, 0x20, 0x01 ]
bmp-collector              | I0708 06:53:15.724182       1 unicast.go:16] MP Unicast NLRI Raw: [ 0x10, 0x20, 0x01 ]
bmp-collector              | gobmp: 06:53:15.724352 {MsgType:76 MsgHash:a84c2cf187498ba4cff7b28bd6e928ea Msg:{"action":"add","router_hash":"a84c2cf187498ba4cff7b28bd6e928ea","router_ip":"10.152.183.12","base_attrs":{"base_attr_hash":"3b87061fdf773278959113c6f010f24c","origin":"incomplete","as_path":[65001,65003],"as_path_count":2,"is_atomic_agg":false},"peer_hash":"ebbe4bfa2301281a9ef5c720528b1be6","peer_ip":"10.152.183.11","peer_asn":65001,"timestamp":"Jul  8 06:53:15.000000","prefix":"2001::","prefix_len":16,"is_ipv4":false,"origin_as":65003,"nexthop":"10.152.183.11","is_nexthop_ipv4":false,"is_prepolicy":false,"is_adj_rib_in":false}}
bmp-collector              | I0708 06:53:15.724671       1 route-monitor.go:97] message of type: 76 json: {"action":"add","router_hash":"a84c2cf187498ba4cff7b28bd6e928ea","router_ip":"10.152.183.12","base_attrs":{"base_attr_hash":"3b87061fdf773278959113c6f010f24c","origin":"incomplete","as_path":[65001,65003],"as_path_count":2,"is_atomic_agg":false},"peer_hash":"ebbe4bfa2301281a9ef5c720528b1be6","peer_ip":"10.152.183.11","peer_asn":65001,"timestamp":"Jul  8 06:53:15.000000","prefix":"2001::","prefix_len":16,"is_ipv4":false,"origin_as":65003,"nexthop":"10.152.183.11","is_nexthop_ipv4":false,"is_prepolicy":false,"is_adj_rib_in":false}