@bobertrublik This address is the internal address of the Kubernetes cluster. It will not be reachable from your local machine (if I understand well). Now I see the info about a pod in the same namespace running curl. Check further hints:
Check if k8spacket is running:
kubectl -n k8spacket get pods
Check if k8spacket service exists:
kubectl -n k8spacket get svc
If not, please share here logs from one of the k8spacket pods.
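For example (a hedged one-liner; kubectl logs accepts a daemonset reference and picks one of its pods):
kubectl -n k8spacket logs ds/k8spacket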
Additionally, share the screen of Node Graph API plugin configuration (https://grafana.com/grafana/plugins/hamedkarbasi93-nodegraphapi-datasource/) from your Grafana instance.
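For reference, the datasource URL in that plugin should presumably point at the k8spacket service base, since the graph endpoint (seen later in this thread) is served under it:
http://k8spacket.k8spacket.svc.cluster.local:8080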
I started a pod with curl that runs next to the k8spacket pods in the same namespace, so it should have no problems accessing k8spacket.
Pods:
NAME READY STATUS RESTARTS AGE
curl-test 1/1 Running 0 2m5s
k8spacket-5fchh 1/1 Running 0 2m10s
k8spacket-627t9 1/1 Running 0 2m10s
k8spacket-stkv6 1/1 Running 0 2m10s
Service:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
k8spacket ClusterIP 172.16.56.82 <none> 8080/TCP 2m29s
The logs until k8spacket starts capturing packets:
2022/08/11 12:51:05 Serving requests on port 8080
2022/08/11 12:51:05 Refreshing interfaces for capturing...
2022/08/11 12:51:05 Starting capture on interface "cali2e0637c950b"
2022/08/11 12:51:05 Starting capture on interface "cali1f3b59b0059"
2022/08/11 12:51:05 Starting capture on interface "calid16cb36db4b"
2022/08/11 12:51:05 Starting capture on interface "calic03726a14ca"
2022/08/11 12:51:05 Starting capture on interface "cali528d52fdb4c"
2022/08/11 12:51:05 Starting capture on interface "cali1223e6c6c31"
2022/08/11 12:51:05 Starting capture on interface "cali433d7080645"
2022/08/11 12:51:05 Starting capture on interface "calie2e9896354d"
2022/08/11 12:51:05 Starting capture on interface "cali64429976fc5"
2022/08/11 12:51:05 Starting capture on interface "cali4f2d01b045d"
2022/08/11 12:51:05 Starting capture on interface "cali00c20cd856b"
2022/08/11 12:51:05 Starting capture on interface "calic8587a62da4"
2022/08/11 12:51:05 Starting capture on interface "calibc9f0bd94b0"
2022/08/11 12:51:05 Starting capture on interface "cali52b87dc4ff7"
2022/08/11 12:51:05 Starting capture on interface "cali73a037b6acb"
2022/08/11 12:51:05 Starting capture on interface "cali85ccd12a9b9"
2022/08/11 12:51:05 Starting capture on interface "cali36b3aeb53ab"
2022/08/11 12:51:05 Starting capture on interface "tunl0"
2022/08/11 12:51:05 Starting capture on interface "calif64e29ba28d"
2022/08/11 12:51:05 Starting capture on interface "cali5c3ecdedc6f"
2022/08/11 12:51:05 Starting capture on interface "cali599a5c75980"
2022/08/11 12:51:05 Starting capture on interface "cali92216366d5a"
2022/08/11 12:51:05 Starting capture on interface "cali95a3d8a7835"
2022/08/11 12:51:05 Starting capture on interface "cali65896cb1ab0"
2022/08/11 12:51:05 Starting capture on interface "cali83581ffecf2"
OK, if I run the curl command a few times in a row (cancelling when nothing happens), it reaches the pod about 1 out of 5 times. Really strange.
[ root@curl-test:/ ]$ curl http://k8spacket.k8spacket.svc.cluster.local:8080/metrics
^C
[ root@curl-test:/ ]$ curl http://k8spacket.k8spacket.svc.cluster.local:8080/metrics
^C
[ root@curl-test:/ ]$ curl http://k8spacket.k8spacket.svc.cluster.local:8080/metrics
curl: (7) Failed to connect to k8spacket.k8spacket.svc.cluster.local port 8080: Connection timed out
[ root@curl-test:/ ]$ curl http://k8spacket.k8spacket.svc.cluster.local:8080/metrics
^C
[ root@curl-test:/ ]$ curl http://k8spacket.k8spacket.svc.cluster.local:8080/metrics
^C
[ root@curl-test:/ ]$ curl http://k8spacket.k8spacket.svc.cluster.local:8080/metrics
# HELP go_build_info Build information about the main Go module.
# TYPE go_build_info gauge
go_build_info{checksum="",path="github.com/k8spacket",version="(devel)"} 1
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
@bobertrublik I'm wondering about this interface:
2022/08/11 12:51:05 Starting capture on interface "tunl0"
and I suppose it could be the problem.
In the Helm chart's values.yaml there is a property:
command: "ip address | grep @ | sed -E 's/.* (\\w+)@.*/\\1/' | tr '\\n' ',' | sed 's/.$//'"
Change it to:
command: "ip address | grep @if | sed -E 's/.* (\\w+)@if.*/\\1/' | tr '\\n' ',' | sed 's/.$//'"
And reinstall k8spacket.
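To see why this helps: tunl0 appears in ip address output as tunl0@NONE, so it matches the original grep @ but not grep @if, which keeps only the host-side veth ends of pod interfaces. A minimal sketch (the sample lines are illustrative, not taken from a real node):
printf '4: tunl0@NONE: <NOARP,UP> mtu 1480\n7: cali2e0637c950b@if3: <BROADCAST,UP> mtu 1440\n' \
  | grep @if | sed -E 's/.* (\w+)@if.*/\1/' | tr '\n' ',' | sed 's/.$//'
# prints: cali2e0637c950b -- tunl0 has been filtered out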
Or you can do it directly in the daemonset as well:
kubectl -n k8spacket edit daemonsets.apps k8spacket
and then:
- name: K8S_PACKET_TCP_LISTENER_INTERFACES_COMMAND
value: ip address | grep @if | sed -E 's/.* (\w+)@if.*/\1/' | tr '\n' ',' | sed 's/.$//'
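Equivalently, without hand-editing (kubectl set env works on daemonsets; untested here, so treat it as a sketch):
kubectl -n k8spacket set env daemonset/k8spacket \
  K8S_PACKET_TCP_LISTENER_INTERFACES_COMMAND="ip address | grep @if | sed -E 's/.* (\w+)@if.*/\1/' | tr '\n' ',' | sed 's/.$//'"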
I changed it, but it didn't get noticeably better :(
Environment:
K8S_PACKET_NAME_LABEL_VALUE: k8spacket
K8S_PACKET_HIDE_SRC_PORT: true
K8S_PACKET_REVERSE_GEOIP2_DB_PATH: /home/k8spacket/GeoLite2-City.mmdb
K8S_PACKET_REVERSE_WHOIS_REGEXP: (?:OrgName:|org-name:)\s*(.*)
K8S_PACKET_TCP_ASSEMBLER_MAX_PAGES_PER_CONN: 50
K8S_PACKET_TCP_ASSEMBLER_MAX_PAGES_TOTAL: 50
K8S_PACKET_TCP_ASSEMBLER_FLUSHING_PERIOD: 10s
K8S_PACKET_TCP_ASSEMBLER_FLUSHING_CLOSE_OLDER_THAN: 20s
K8S_PACKET_TCP_LISTENER_INTERFACES_COMMAND: ip address | grep @if | sed -E 's/.* (\w+)@if.*/\1/' | tr '\n' ',' | sed 's/.$//'
K8S_PACKET_TCP_LISTENER_INTERFACES_REFRESH_PERIOD: 10s
Also, when testing the datasource URL in Grafana, it fails most of the time.
The logs:
2022/08/11 14:03:13 Serving requests on port 8080
2022/08/11 14:03:13 Refreshing interfaces for capturing...
2022/08/11 14:03:13 Starting capture on interface "cali7ea0577d3f1"
2022/08/11 14:03:13 Starting capture on interface "calie813d30afc5"
2022/08/11 14:03:13 Starting capture on interface "calicf49bb793df"
2022/08/11 14:03:13 Starting capture on interface "cali6e53a9a258f"
2022/08/11 14:03:13 Starting capture on interface "calif7fccfaa4cc"
2022/08/11 14:03:13 Starting capture on interface "cali434dd2a1f75"
2022/08/11 14:03:13 Starting capture on interface "cali3df33f30e41"
2022/08/11 14:03:13 Starting capture on interface "cali2543f5110bf"
2022/08/11 14:03:13 Starting capture on interface "cali2afdef92a43"
2022/08/11 14:03:13 Starting capture on interface "calif9c78f37416"
2022/08/11 14:03:13 Starting capture on interface "cali401e21c8aeb"
2022/08/11 14:03:13 Starting capture on interface "cali00f0a380d6a"
2022/08/11 14:03:13 Starting capture on interface "cali4026fc5579a"
2022/08/11 14:03:13 Starting capture on interface "caliae18b86a1f9"
2022/08/11 14:03:13 Starting capture on interface "cali0af45212d8c"
2022/08/11 14:03:13 Starting capture on interface "cali4fcdb447e21"
2022/08/11 14:03:13 Starting capture on interface "caliba02546de21"
2022/08/11 14:03:13 Starting capture on interface "cali3f610232439"
2022/08/11 14:03:13 Starting capture on interface "cali3b7522ed88b"
@bobertrublik Could you check the CPU usage of the k8spacket pods? (e.g., with https://github.com/robscott/kube-capacity)
Additionally, I suggest trying to listen on one network interface first and checking if there is a performance issue, e.g. (check your current interfaces first; in the example below I took one from your comment):
K8S_PACKET_TCP_LISTENER_INTERFACES_COMMAND: echo cali7ea0577d3f1
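To pick the interface that belongs to a specific pod, one option is to read the pod's eth0 peer index and match it on the node (a sketch; assumes kubectl exec works in the pod, and the ip command must run on the node where that pod is scheduled):
POD=curl-test; NS=k8spacket
IDX=$(kubectl -n "$NS" exec "$POD" -- cat /sys/class/net/eth0/iflink)
# on the pod's node: print the host-side veth whose interface index matches
ip -o link | awk -v idx="${IDX}:" '$1 == idx {sub(/@.*/, "", $2); print $2}'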
Resource usage looks good
k8spacket-9pft8 250m (3%) 500m (6%) 1000Mi (1%) 1500Mi (2%)
k8spacket-bswbl 250m (3%) 500m (6%) 1000Mi (1%) 1500Mi (2%)
k8spacket-kkflx 250m (3%) 500m (6%) 1000Mi (1%) 1500Mi (2%)
Interestingly, after I set
K8S_PACKET_TCP_LISTENER_INTERFACES_COMMAND: echo caliceb690ae662 | tr -d '\n'
curl basically worked 99% of the time. How should I interpret this?
Bandwidth metrics:
Currently I've got the logs and metrics graphs working; I guess the metrics can be read even if the connection times out from time to time. The node graph still times out with the following error:
status: 504
statusText: ""
data:
  message: ""
  error: ""
  response: ""
config:
  method: "GET"
  url: "api/datasources/proxy/3/nodegraphds/api/graph/data?namespace=&include=&exclude=&stats-type=connection"
  retry: 0
  headers: Object
  hideFromInspector: false
message: "Query error: 504 "
@bobertrublik this:
k8spacket-9pft8 250m (3%) 500m (6%) 1000Mi (1%) 1500Mi (2%)
k8spacket-bswbl 250m (3%) 500m (6%) 1000Mi (1%) 1500Mi (2%)
k8spacket-kkflx 250m (3%) 500m (6%) 1000Mi (1%) 1500Mi (2%)
is only the request and limit definition. Please use the kube-capacity tool to see the current CPU and memory usage, or you can find it in the Grafana dashboards.
It's hard to investigate the problem without seeing it. If there is a possibility to prepare another k8s cluster with access for me, I could take a look and find a remedy.
Right sorry, here we go:
kube-capacity --pods -u | grep k8spacket
POD CPU REQUESTS CPU LIMITS CPU UTIL MEMORY REQUESTS MEMORY LIMITS MEMORY UTIL
k8spacket-w6dzp 250m (3%) 500m (6%) 19m (0%) 1000Mi (1%) 1500Mi (2%) 157Mi (0%)
k8spacket-f28pn 250m (3%) 500m (6%) 25m (0%) 1000Mi (1%) 1500Mi (2%) 113Mi (0%)
k8spacket-ts6hw 250m (3%) 500m (6%) 17m (0%) 1000Mi (1%) 1500Mi (2%) 132Mi (0%)
I'll try to investigate whether the problem is caused by Calico. Anyway, thank you very much for your help!
Last chance:
curl http://k8spacket.k8spacket.svc.cluster.local:8080/api/graph/data
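If that also hangs, it may help to curl each k8spacket pod IP directly, to separate Service-level (kube-proxy) flakiness from per-pod problems. A sketch, assuming the pods carry a name=k8spacket label (as the K8S_PACKET_NAME_LABEL_VALUE setting suggests):
for ip in $(kubectl -n k8spacket get pods -l name=k8spacket -o jsonpath='{.items[*].status.podIP}'); do
  echo "== $ip =="
  curl -sS --max-time 5 "http://$ip:8080/metrics" | head -n 2
done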
Yes, the curls appear in the logs.
Looking at the code, I curled the endpoint http://%s:8080/connections?%s and had the same recurring problem, with the call timing out most of the time.
I noticed that when the connection times out, the logs show a value of 0 for bytesReceived and bytesSent.
Hey @k8spacket, I think I found the issue. I noticed that my Calico interfaces have an MTU of 1440 because 60 bytes are used for the header (https://projectcalico.docs.tigera.io/networking/mtu#determine-mtu-size).
Meanwhile, the eth0 interface has an MTU of 1500, because the daemonset sets hostNetwork: true and there it apparently uses an MTU of 1500. Do you know if there is an application-side fix? Otherwise the issue can be closed :)
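A quick way to compare MTUs per interface on a node (a sketch using standard iproute2 one-line output):
ip -o link | awk '$4 == "mtu" {sub(/[@:].*/, "", $2); print $2, $5}'
# prints lines like: eth0 1500 / tunl0 1480 / cali... 1440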
@bobertrublik As far as I can see, there is an option to change the MTU for Calico network interfaces. I prepared my cluster to have different MTUs for eth0 and the Calico interfaces, but still no luck reproducing your problem. Did you manage it somehow?
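For the record, with an operator-based Calico install the MTU can be changed roughly like this (adapted from the Calico MTU docs linked above; manifest-based installs set veth_mtu in the calico-config ConfigMap instead):
kubectl patch installation.operator.tigera.io default --type merge \
  -p '{"spec":{"calicoNetwork":{"mtu":1440}}}'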
Hello,
I'm trying to access the metrics of the k8spacket pods. However, none of them are reachable by Grafana or by a pod in the same namespace running curl.
Same for any other endpoints defined in k8spacket.go.