Open klamkma opened 11 months ago
Hi,
this is one of our weak spots. Either your pod IPs are routable outside of Kubernetes and then this is not a problem as the clients can access all Cassandra pods directly, or you have a connectivity problem due to how the driver behaves (with its auto discovery feature). I guess a workaround for now would be to write a custom address translator implementation which would turn any address from the cluster's CIDR range (or any address) into the external address of the nodeport or ingress. This way all traffic would be routed through the cluster entry point.
@olim7t, wdyt?
Reaper itself will run from within the k8s cluster and doesn't need to go through the nodeport.
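For illustration, the address-translator workaround described above could look roughly like the sketch below. It uses only the Go standard library and is not tied to any driver API. The `translator` type, `newTranslator`, the `Translate(host, port)` shape, the CIDR range, and the external endpoint are all hypothetical placeholders. A real driver would need this logic wrapped in its own translator interface.

```go
package main

import (
	"fmt"
	"net"
)

// translator rewrites any address inside an internal CIDR range to a single
// external entry point (for example a NodePort or ingress address).
// Hypothetical type for illustration; not part of any Cassandra driver.
type translator struct {
	internal     *net.IPNet
	externalHost string
	externalPort int
}

// newTranslator parses the cluster CIDR and stores the external endpoint.
func newTranslator(cidr, externalHost string, externalPort int) (*translator, error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, fmt.Errorf("parse CIDR %q: %w", cidr, err)
	}
	return &translator{internal: ipnet, externalHost: externalHost, externalPort: externalPort}, nil
}

// Translate returns the external endpoint for addresses inside the internal
// range and leaves every other host (including DNS names) unchanged.
func (t *translator) Translate(host string, port int) (string, int) {
	ip := net.ParseIP(host)
	if ip != nil && t.internal.Contains(ip) {
		return t.externalHost, t.externalPort
	}
	return host, port
}

func main() {
	// Assumed values: a pod CIDR of 10.0.0.0/8 and an entry point on port 30002.
	t, err := newTranslator("10.0.0.0/8", "203.0.113.10", 30002)
	if err != nil {
		panic(err)
	}
	host, port := t.Translate("10.42.1.7", 9042)
	fmt.Printf("%s:%d\n", host, port) // prints "203.0.113.10:30002"
}
```

With a translator like this, every discovered pod address collapses to the same entry point, so all client traffic goes through the cluster entry point rather than to individual pods.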
Hi @adejanovski,
NodePort: When using:
networking:
  nodePort:
    internode: 30001
    native: 30002
this does not just expose a NodePort with target port 9042; it also changes the native port used by Cassandra to 30002. The Cassandra StatefulSet still has port 9042 set, but in reality 30002 is used. You can see it in the pod, in /config/cassandra.yaml:
storage_port: 30001
native_transport_port: 30002
All the kubernetes services are changed, and these ports are used:
  - name: native
    port: 30002
    protocol: TCP
    targetPort: 30002
  - name: tls-native
    port: 9142
    protocol: TCP
    targetPort: 9142
  - name: mgmt-api
    port: 8080
    protocol: TCP
    targetPort: 8080
  - name: prometheus
    port: 9103
    protocol: TCP
    targetPort: 9103
  - name: metrics
    port: 9000
    protocol: TCP
    targetPort: 9000
  - name: thrift
    port: 9160
    protocol: TCP
    targetPort: 9160
That's the reason why Reaper does not work. The service used by Reaper in my case is cassandra-dc1-service, and it does not expose port 9042 when nodePort is set.
Another approach (without NodePort): I tried setting a per-node ConfigMap instead:
perNodeConfigMapRef:
  name: cassandra-node-config

apiVersion: v1
kind: ConfigMap
metadata:
  name: cassandra-node-config
data:
  cassandra-dc1-default-sts-0_cassandra.yaml: >
    broadcast_rpc_address: x0.yy.org
  cassandra-dc1-default-sts-1_cassandra.yaml: >
    broadcast_rpc_address: x1.yy.org
I created a Service of type LoadBalancer for each node, with an external-dns annotation, so that each node gets its own external IP.
apiVersion: v1
kind: Service
metadata:
  annotations:
    external-dns.alpha.kubernetes.io/hostname: x0.yy.org
  name: cassandra-dc1-default-sts-0
spec:
  ports:
    - name: cassandra
      port: 9042
      protocol: TCP
      targetPort: 9042
  selector:
    cassandra.datastax.com/cluster: cassandra
    cassandra.datastax.com/seed-node: 'true'
    statefulset.kubernetes.io/pod-name: cassandra-dc1-default-sts-0
  type: LoadBalancer
That setup works perfectly, as the nodes are broadcasting external IPs, but cass-operator is not able to update the CassandraDatacenter status. It fails to find the Host ID because it uses the pod's internal IP: https://github.com/k8ssandra/cass-operator/blob/442470463baf8de891d84438064a68fa7ac8f072/pkg/reconciliation/reconcile_racks.go#L922
func getRpcAddress(dc *api.CassandraDatacenter, pod *corev1.Pod) string {
	nc := dc.Spec.Networking
	if nc != nil {
		if nc.HostNetwork {
			return pod.Status.HostIP
		}
		if nc.NodePort != nil {
			if nc.NodePort.Internode > 0 ||
				nc.NodePort.InternodeSSL > 0 {
				return pod.Status.HostIP
			}
		}
	}
	return pod.Status.PodIP
}
It then searches for that IP in the RPC_ADDRESS field of the response from url=/api/v0/metadata/endpoints: https://github.com/k8ssandra/cass-operator/blob/442470463baf8de891d84438064a68fa7ac8f072/pkg/reconciliation/reconcile_racks.go#L970
nodeStatus.HostID = findHostIdForIpFromEndpointsData(
	endpointsResponse.Entity, ip)
if nodeStatus.HostID == "" {
	logger.Info("Failed to find host ID", "pod", pod.Name)
}
If findHostIdForIpFromEndpointsData used INTERNAL_IP instead of RPC_ADDRESS, that check would work. https://github.com/k8ssandra/cass-operator/blob/442470463baf8de891d84438064a68fa7ac8f072/pkg/reconciliation/reconcile_racks.go#L915
func findHostIdForIpFromEndpointsData(endpointsData []httphelper.EndpointState, ip string) string {
	for _, data := range endpointsData {
		if net.ParseIP(data.GetRpcAddress()).Equal(net.ParseIP(ip)) {
			return data.HostID
		}
	}
	return ""
}
Would it be possible to use INTERNAL_IP in findHostIdForIpFromEndpointsData instead of RPC_ADDRESS?
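To make the request concrete, here is a minimal standalone sketch of a lookup that accepts a match on either the internal IP or the RPC address. It is not the operator's code. The `endpointState` struct, its field names, and the `findHostID` and `sameIP` helpers are hypothetical stand-ins, and it assumes the endpoints response carries the gossip INTERNAL_IP value alongside RPC_ADDRESS.

```go
package main

import (
	"fmt"
	"net"
)

// endpointState is a hypothetical stand-in for the operator's endpoint type.
// The InternalIP field is an assumption made for this sketch.
type endpointState struct {
	HostID     string
	RPCAddress string
	InternalIP string
}

// sameIP reports whether a and b parse to the same IP address.
// Unparsable input (such as a DNS name) never matches, which avoids
// treating two nil IPs as equal.
func sameIP(a, b string) bool {
	ipA, ipB := net.ParseIP(a), net.ParseIP(b)
	return ipA != nil && ipB != nil && ipA.Equal(ipB)
}

// findHostID returns the host ID of the first endpoint whose internal IP or
// RPC address equals ip, or "" if none matches.
func findHostID(endpoints []endpointState, ip string) string {
	for _, e := range endpoints {
		if sameIP(e.InternalIP, ip) || sameIP(e.RPCAddress, ip) {
			return e.HostID
		}
	}
	return ""
}

func main() {
	// Nodes broadcast an external hostname as RPC address, as in the setup above.
	endpoints := []endpointState{
		{HostID: "host-a", RPCAddress: "x0.yy.org", InternalIP: "10.42.1.7"},
		{HostID: "host-b", RPCAddress: "x1.yy.org", InternalIP: "10.42.2.9"},
	}
	fmt.Println(findHostID(endpoints, "10.42.2.9")) // prints "host-b"
}
```

Checking the internal IP first keeps the lookup working when broadcast_rpc_address is an external hostname, while the RPC address fallback preserves the current behavior for default setups.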
Thanks!
I have the exact same problem. Did you find another way to achieve what you wanted, or another workaround? I found an env var that might help with this setup: https://github.com/k8ssandra/cass-operator/blob/e45e7e9052ff90feb1f18adf5d0c493997943c6e/pkg/reconciliation/construct_podtemplatespec.go#L560 Setting this value to true might make the IP be set to the host IP. A NodePort defined in a separate file would then let the ports stay correct internally while still allowing external connections. Even then, I think gossip would still broadcast port 9042 instead of the NodePort, so access from outside the cluster would still fail...
What did you do? I have a Cassandra client running outside of the Kubernetes cluster.
I've tried to use:
But then reaper keeps trying to connect on port 9042 and I do not see any way to set up the custom port for reaper.
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: cassandra-dc1-service/xx.xx.xx.xx:9042 Cannot connect)
Did you expect to see something different?
Environment
K8ssandra Operator version:
cr.k8ssandra.io/k8ssandra/k8ssandra-operator:v1.10.3
Kubernetes version information: v1.27
Kubernetes cluster kind: gcp
Manifests:
no relevant operator logs
Anything else we need to know?:
Is there a better way to expose Cassandra to external clients (not in Kubernetes)? If I expose just one service, the nodes still return all the node IPs to the client, and of course those connections time out. If I use NodePort I see this in my Python client:
Thanks!