tflabs-nl closed this issue 2 years ago.
@ysksuzuki is there any way I can contact you so I can provide access to the cluster and yamls for easy debugging for you?
Hi, thank you for reporting the issue. Could you share the manifests you applied to your cluster?
Hi, thank you for your fast reply.
[ see my next reply for files ]
I only included the changes I did before building and the resulting yaml from the Coil build process. Also, I included my BIRD config file so you can see the import/export rules.
If you need anything else please let me know!
Please write your files directly here.
Ahh, will do. I can't upload yaml files, so I will upload the zip instead here...
As an organizational policy, I am not allowed to open attachments, so please paste the yaml contents directly.
The generated coil.yaml is too big to paste, so I will skip that one.
Default address pool:
apiVersion: coil.cybozu.com/v2
kind: AddressPool
metadata:
  name: default
spec:
  blockSizeBits: 5
  subnets:
  - ipv4: 10.100.0.0/16
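For context on what this pool definition implies: blockSizeBits: 5 means Coil hands out address blocks of 2^5 addresses each. A minimal sketch of the arithmetic (this is not Coil's allocator, just the math behind the pool sizing, using Python's stdlib ipaddress module):

```python
import ipaddress

# blockSizeBits: 5 -> each allocated block spans 2**5 addresses (a /27).
block_size_bits = 5
pool = ipaddress.ip_network("10.100.0.0/16")

addresses_per_block = 2 ** block_size_bits                   # 32
blocks_in_pool = pool.num_addresses // addresses_per_block   # 65536 // 32

print(addresses_per_block, blocks_in_pool)  # 32 2048
```

So this pool can serve up to 2048 blocks of 32 pod addresses each.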
Kustomization used to generate coil.yaml:
images:
- name: coil
  newTag: 2.0.14
  newName: ghcr.io/cybozu-go/coil
resources:
- config/default
# If you are using CKE (github.com/cybozu-go/cke) and want to use
# its webhook installation feature, comment the above line and
# uncomment the below line.
#- config/cke
# If you want to enable coil-router, uncomment the following line.
# Note that coil-router can work only for clusters where all the
# nodes are in a flat L2 network.
- config/pod/coil-router.yaml
# If your cluster has enabled PodSecurityPolicy, uncomment the
# following line.
#- config/default/pod_security_policy.yaml
patchesStrategicMerge:
# Uncomment the following if you want to run Coil with Calico network policy.
#- config/pod/compat_calico.yaml
# Edit netconf.json to customize CNI configurations
configMapGenerator:
- name: coil-config
  namespace: system
  files:
  - cni_netconf=./netconf.json
# Adds namespace to all resources.
namespace: kube-system
# Labels to add to all resources and selectors.
commonLabels:
  app.kubernetes.io/name: coil
netconf.json:
{
  "cniVersion": "0.4.0",
  "name": "k8s-pod-network",
  "plugins": [
    {
      "type": "coil",
      "socket": "/run/coild.sock"
    },
    {
      "type": "bandwidth",
      "capabilities": {
        "bandwidth": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}
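This file is a CNI plugin chain: coil sets up the pod's network interface first, then bandwidth and portmap post-process the result in order. A quick illustrative sanity check of the chain (plain stdlib json parsing, not anything Coil itself runs):

```python
import json

# The same conflist as above, inlined for a standalone check.
netconf = """
{
  "cniVersion": "0.4.0",
  "name": "k8s-pod-network",
  "plugins": [
    {"type": "coil", "socket": "/run/coild.sock"},
    {"type": "bandwidth", "capabilities": {"bandwidth": true}},
    {"type": "portmap", "capabilities": {"portMappings": true}}
  ]
}
"""

conf = json.loads(netconf)
# Plugins execute in list order: coil must come first so the later
# plugins have an interface to act on.
chain = [p["type"] for p in conf["plugins"]]
print(chain)  # ['coil', 'bandwidth', 'portmap']
```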
Default egress:
apiVersion: coil.cybozu.com/v2
kind: Egress
metadata:
  namespace: default
  name: egress
spec:
  replicas: 1
  destinations:
  - 10.100.0.0/16
Create the webserver namespace with no annotations. Then create the public-facing IP pool:
apiVersion: coil.cybozu.com/v2
kind: AddressPool
metadata:
  name: webserver
spec:
  blockSizeBits: 0
  subnets:
  - ipv4: 185.222.22.22/32
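For comparison with the default pool: blockSizeBits: 0 means blocks of 2^0 = 1 address, and a /32 subnet holds exactly one address, so this pool is a single public IP that can be handed out as exactly one block. The arithmetic (an illustrative sketch, not Coil code):

```python
import ipaddress

# blockSizeBits: 0 -> each block spans 2**0 = 1 address.
pool = ipaddress.ip_network("185.222.22.22/32")
addresses_per_block = 2 ** 0
blocks = pool.num_addresses // addresses_per_block

print(blocks)  # 1
```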
Create the webserver-internet namespace with an annotation for the created IP pool:
apiVersion: v1
kind: Namespace
metadata:
  name: webserver-internet
  annotations:
    coil.cybozu.com/pool: webserver
Create the webserver-internet egress:
apiVersion: coil.cybozu.com/v2
kind: Egress
metadata:
  namespace: webserver-internet
  name: nat
spec:
  replicas: 1
  destinations:
  - 0.0.0.0/0
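Pods do not use an Egress automatically; a client Pod opts in with an egress.coil.cybozu.com/<egress-namespace>: <egress-name> annotation (the same annotation used in the deployment later in this thread). A minimal sketch, with the Pod name and image as placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nat-client            # placeholder name
  annotations:
    # "<egress namespace>: <egress name>" selects the NAT gateway above
    egress.coil.cybozu.com/webserver-internet: nat
spec:
  containers:
  - name: main
    image: ubuntu:21.10       # any image with networking tools
    command: ["sleep", "infinity"]
```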
Could you tell me what you want to do? You created a Pod with the egress.coil.cybozu.com/webserver-internet: nat annotation, but it couldn't access the internet?
I created a deployment with multiple replicas in the default namespace. I expected these pods to be able to curl/ping each other, but that doesn't seem to work. I ran apt update && apt install apache2 iputils-ping to test the curling.
Pod 1 got IP address 10.100.6.20; Pod 2 got IP address 10.100.6.2.
Both run on the same node, in the same namespace.
So inter-pod communication does not seem to work, while I expected it to do so.
YAML here:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ubuntu-debug-21-10
spec:
  selector:
    matchLabels:
      management: management
  replicas: 3
  strategy:
    type: RollingUpdate
  template:
    metadata:
      annotations:
        egress.coil.cybozu.com/webserver-internet: nat
        egress.coil.cybozu.com/default: egress
      labels:
        management: management
    spec:
      containers:
      - name: debugging
        image: 'weibeld/ubuntu-networking' # ubuntu:21.10
        command: ["/bin/bash", "-c", "--"]
        args: ["while true; do sleep 30; done;"]
      dnsPolicy: None
      dnsConfig:
        nameservers:
        - 1.1.1.1
        - 8.8.8.8
Can those Pods communicate with each other without the egress.coil.cybozu.com/webserver-internet: nat and egress.coil.cybozu.com/default: egress annotations? Why is the Egress in the default namespace needed?
yes, that works!
I thought the egress in the default namespace was needed to make sure 10.100.0.0/16 is not routed outside of the cluster, as they would otherwise only have a 0.0.0.0/0 route via webserver-internet: nat?
Only including the egress.coil.cybozu.com/webserver-internet: nat annotation also works.
Do you mean that you created the Egress in the default namespace to avoid packets destined for 10.100.0.0/16 being routed outside of the cluster? If so, you don't need to do that. Coil allocates address blocks from the address pool 10.100.0.0/16 and publishes the routing entries to each cluster node, so the cluster nodes are aware of the Pod CIDR.
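The effect can be illustrated with longest-prefix-match route selection: a node that knows the per-node Pod address-block route prefers it over a 0.0.0.0/0 egress route, so intra-cluster traffic never leaves the cluster. A toy sketch (stdlib only; the routing table below is illustrative, not dumped from a real node):

```python
import ipaddress

# Illustrative routing table: a default route via the egress NAT, plus
# the address-block route Coil publishes (blockSizeBits 5 -> a /27 block).
routes = {
    ipaddress.ip_network("0.0.0.0/0"): "egress NAT",
    ipaddress.ip_network("10.100.6.0/27"): "node-local pod block",
}

def lookup(dst):
    # Longest-prefix match: the most specific route containing dst wins.
    addr = ipaddress.ip_address(dst)
    matches = [net for net in routes if addr in net]
    return routes[max(matches, key=lambda n: n.prefixlen)]

print(lookup("10.100.6.2"))     # node-local pod block (stays in-cluster)
print(lookup("93.184.216.34"))  # egress NAT (leaves via the gateway)
```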
I indeed tried to avoid internal packets being forced over the internet. This makes sense! Thank you!
Also, do you have a donation link?
Describe the bug
After creating a Kubernetes cluster with the default service IPs and installing Coil as a CNI (no other CNIs), pods are not able to communicate directly.
Environments
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:25:17Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:19:12Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}
Linux HostnameHere 5.4.0-96-generic #109-Ubuntu SMP Wed Jan 12 16:49:16 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Being able to access other pods in the same (or another) namespace.
Additional context
It doesn't matter if both pods are scheduled on the same node; traceroute makes it seem like traffic cannot be delivered to the pod. Example traceroute (simplified):
When curling the service's ClusterIP from the node itself, or even from another node, everything works as expected.
Is this a misconfiguration?