Closed: dongmx closed this issue 7 years ago
No, it's not normal; traffic should go directly to the host.
Can you confirm whether you've got the Calico node-to-node mesh running, and the status of the BGP sessions to the switch from the other nodes (not 10.10.25.8)?
# ./calicoctl-v0.19.0 status
calico-node container is running. Status: Up 12 weeks
Running felix version 1.4.0rc1
IPv4 BGP status
IP: 10.10.28.16 AS Number: 64511 (inherited)
+--------------+-------------------+-------+----------+-------------+
| Peer address | Peer type | State | Since | Info |
+--------------+-------------------+-------+----------+-------------+
| 10.10.37.254 | global | up | 08:52:58 | Established |
| 10.10.26.253 | global | up | 08:53:01 | Established |
| 10.10.25.8 | node-to-node mesh | up | 09:00:53 | Established |
| 10.10.26.4 | node-to-node mesh | up | 09:00:54 | Established |
| 10.10.27.17 | node-to-node mesh | up | 09:38:36 | Established |
| 10.10.29.1 | node-to-node mesh | up | 09:00:54 | Established |
| 10.10.29.2 | node-to-node mesh | up | 09:00:54 | Established |
| 10.10.29.7 | node-to-node mesh | up | 09:00:54 | Established |
| 10.10.29.8 | node-to-node mesh | up | 09:00:53 | Established |
| 10.10.36.4 | node-to-node mesh | up | 09:00:54 | Established |
| 10.10.37.23 | node-to-node mesh | up | 09:00:54 | Established |
| 10.10.37.4 | node-to-node mesh | up | 09:00:54 | Established |
+--------------+-------------------+-------+----------+-------------+
The BGP session from every node to the switch is up.
For example, 10.11.0.19 is running on 10.10.28.16.
# traceroute 10.11.0.19
traceroute to 10.11.0.19 (10.11.0.19), 30 hops max, 60 byte packets
1 10.10.27.254 (10.10.27.254) 0.551 ms 0.345 ms 0.423 ms #This is default gateway
2 10.10.37.254 (10.10.37.254) 5.063 ms 5.193 ms 5.366 ms # This is the global peer switch
3 10.10.25.8 (10.10.25.8) 0.145 ms 0.183 ms 0.178 ms # This is the problem
4 10.10.28.16 (10.10.28.16) 0.138 ms 0.180 ms 0.176 ms
5 10.11.0.19 (10.11.0.19) 0.258 ms 0.218 ms 0.259 ms
I think something is wrong with my switch's route selection. My switch is a Dell Force10 S55. Do you know how to configure its route selection mode? How should I debug this?
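One way to check for this kind of detour across many containers is to parse the traceroute output and list any hops that sit between the peer switch and the node hosting the container. The following is only a minimal sketch: the helper names `hop_addresses` and `detour_hops` are hypothetical, and it assumes the numeric traceroute format pasted above, with the inline annotations removed.

```python
import re

# Hop lines copied from the traceroute above, without the inline annotations.
TRACE = """\
1 10.10.27.254 (10.10.27.254) 0.551 ms 0.345 ms 0.423 ms
2 10.10.37.254 (10.10.37.254) 5.063 ms 5.193 ms 5.366 ms
3 10.10.25.8 (10.10.25.8) 0.145 ms 0.183 ms 0.178 ms
4 10.10.28.16 (10.10.28.16) 0.138 ms 0.180 ms 0.176 ms
5 10.11.0.19 (10.11.0.19) 0.258 ms 0.218 ms 0.259 ms
"""

# A hop line starts with the hop number followed by an IPv4 address.
HOP = re.compile(r"^\s*\d+\s+(\d+\.\d+\.\d+\.\d+)\s")


def hop_addresses(text):
    """Return the hop addresses in order, skipping lines that are not hops."""
    hops = []
    for line in text.splitlines():
        match = HOP.match(line)
        if match:
            hops.append(match.group(1))
    return hops


def detour_hops(hops, switch, host):
    """Return the hops that appear between the switch and the host node."""
    start = hops.index(switch)
    end = hops.index(host)
    return hops[start + 1:end]


hops = hop_addresses(TRACE)
print(detour_hops(hops, "10.10.37.254", "10.10.28.16"))  # ['10.10.25.8']
```

An empty list would mean the switch forwards straight to the hosting node; here the extra hop 10.10.25.8 shows up.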
#show ip bgp
BGP table version is 102, local router ID is 10.10.26.253
Status codes: s suppressed, S stale, d dampened, h history, * valid, > best
Path source: I - internal, a - aggregate, c - confed-external, r - redistributed
n - network, D - denied, S - stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 10.11.0.0/26 10.10.29.7 0 64511 i
* 10.10.37.23 0 64511 i
* 10.10.27.17 0 64511 i
* 10.10.29.2 0 64511 i
* 10.10.25.8 0 64511 i
* 10.10.26.4 0 64511 i
* 10.10.28.16 0 64511 i
* 10.10.37.4 0 64511 i
* 10.10.29.1 0 64511 i
* 10.10.29.8 0 64511 i
* 10.10.36.4 0 64511 i
*> 10.11.0.1/32 10.10.29.7 0 64511 i
* 10.10.37.23 0 64511 i
* 10.10.27.17 0 64511 i
* 10.10.29.2 0 64511 i
* 10.10.25.8 0 64511 i
* 10.10.26.4 0 64511 i
* 10.10.37.4 0 64511 i
* 10.10.29.1 0 64511 i
* 10.10.29.8 0 64511 i
* 10.10.36.4 0 64511 i
*> 10.11.0.11/32 10.10.29.7 0 64511 i
* 10.10.37.23 0 64511 i
* 10.10.27.17 0 64511 i
It seems the AS path is the same for every route, so the switch cannot tell which next hop is the actual host and does not route directly to it. I wonder how BIRD finds out which router is the host? @matthewdupre
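To make that observation easier to check on a larger table, the rows of `show ip bgp` can be grouped by prefix and compared. This is a rough sketch, not a general parser: it assumes the flattened row format pasted above, treats the first field after the next hop as the weight, and uses hypothetical helper names (`parse_bgp_table`, `best_next_hop`, `as_paths_identical`).

```python
import re
from collections import defaultdict

# A few rows copied from the `show ip bgp` output above.
SAMPLE = """\
*> 10.11.0.0/26 10.10.29.7 0 64511 i
* 10.10.37.23 0 64511 i
* 10.10.28.16 0 64511 i
*> 10.11.0.1/32 10.10.29.7 0 64511 i
* 10.10.37.23 0 64511 i
"""

# Row layout: status ("*" or "*>"), optional prefix, next hop, remaining fields.
ROW = re.compile(
    r"^\*(?P<best>>)?\s+"
    r"(?:(?P<prefix>\d+\.\d+\.\d+\.\d+/\d+)\s+)?"
    r"(?P<next_hop>\d+\.\d+\.\d+\.\d+)\s+"
    r"(?P<rest>.*)$"
)


def parse_bgp_table(text):
    """Group candidate paths by prefix; continuation rows inherit the last prefix."""
    routes = defaultdict(list)
    current = None
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue
        if match.group("prefix"):
            current = match.group("prefix")
        if current is None:
            continue
        # Assumed layout of the remaining fields: weight, AS path..., origin code.
        fields = match.group("rest").split()
        routes[current].append({
            "next_hop": match.group("next_hop"),
            "best": match.group("best") is not None,
            "as_path": tuple(fields[1:-1]),
        })
    return dict(routes)


def best_next_hop(paths):
    """Return the next hop of the path marked best (">"), or None."""
    for path in paths:
        if path["best"]:
            return path["next_hop"]
    return None


def as_paths_identical(paths):
    """True when every candidate path carries the same AS path."""
    return len({path["as_path"] for path in paths}) == 1


for prefix, paths in parse_bgp_table(SAMPLE).items():
    print(prefix, best_next_hop(paths), len(paths), as_paths_identical(paths))
```

When every candidate for a prefix has an identical AS path, the switch has to fall back on later tie-breakers to pick one next hop, and that choice has nothing to do with which node actually hosts the container.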
@dongmx Are you still having problems with this? As a first step, I would suggest upgrading the version of Calico you are using, as your calicoctl is quite old.
This problem was caused by my incorrect BGP configuration. I should have configured the switch and the machines in a rack as a group. I will close this issue :)
I configured many nodes as peers on my switch, which is a global peer. Every node runs several containers. But when I run traceroute $CONTAINER_IP, the switch always uses the first neighbor, 10.10.25.8, as the gateway, no matter which node the container is running on. That means all network traffic is forwarded by 10.10.25.8. Is this normal? How can I make the gateway be the physical machine that the container is actually running on? Thank you!