Use the Kubernetes seed provider for Cassandra

wallrj commented 6 years ago

The Kubernetes documentation says that the Cassandra example uses a custom Kubernetes Seed Provider.

https://kubernetes.io/docs/tutorials/stateful-application/cassandra/

But it doesn't:

https://github.com/kubernetes/examples/issues/147

We should understand and test the Kubernetes seed provider and maybe use it by default:

/feature

munnerz commented 6 years ago

In Elasticsearch, we managed to avoid using the custom Kubernetes discovery plugin by using DNS SRV records to discover peers (i.e. creating a headless service).

I'd be interested to see if we can use a similar technique here. Not that I am against using it, but the fewer dependencies the better 😄
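A minimal sketch of what DNS-based peer discovery could look like, assuming a headless Service whose name resolves to one A record per pod. It uses plain A lookups rather than SRV records (an SRV query would need a tool such as dig). The function name peers_from_getent and the service name in the usage note are hypothetical.

```sh
#!/bin/sh
# Turn `getent hosts` style output (one "ADDRESS NAME..." line per record)
# into a sorted, de-duplicated, comma-separated list of addresses.
peers_from_getent() {
    awk '{ print $1 }' | sort -u | paste -sd, -
}

# Demo with made-up addresses instead of a real DNS lookup.
printf '10.0.0.2 cassandra\n10.0.0.1 cassandra\n' | peers_from_getent
# prints 10.0.0.1,10.0.0.2
```

Inside a pod this might be fed from something like `getent hosts cassandra.default.svc.cluster.local | peers_from_getent`, where the service name is only an example.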

kragniz commented 6 years ago

I'm trying to enable the seed provider, testing via running:

docker run -ti -e CASSANDRA_SEED_PROVIDER=io.k8s.cassandra.KubernetesSeedProvider gcr.io/google-samples/cassandra:v12

Which causes an exception to be raised and caught here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/config/DatabaseDescriptor.java#L870-L879

Then annoyingly, it raises a new exception with a different stacktrace, making it hard to find what the original error was:

Fatal configuration error; unable to start server.  See log for stacktrace.     
org.apache.cassandra.exceptions.ConfigurationException: io.k8s.cassandra.KubernetesSeedProvider                                                                  
Fatal configuration error; unable to start server.  See log for stacktrace.     
        at org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:782)                                                               
        at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:125)                                                                  
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:576)                                                                       
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:730)                                                                           
ERROR 16:18:26 Exception encountered during startup                             
org.apache.cassandra.exceptions.ConfigurationException: io.k8s.cassandra.KubernetesSeedProvider                                                                  
Fatal configuration error; unable to start server.  See log for stacktrace.     
        at org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:782) ~[apache-cassandra-3.9.jar:3.9]                               
        at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:125) ~[apache-cassandra-3.9.jar:3.9]                                  
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:576) [apache-cassandra-3.9.jar:3.9]                                        
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:730) [apache-cassandra-3.9.jar:3.9]

munnerz commented 6 years ago

Could this be down to the seed provider not being installed in that sample container? I imagine it'll need to be added in somehow.

kragniz commented 6 years ago

it's added to the container as /kubernetes-cassandra.jar, and run.sh sets the classpath to point at it:

export CLASSPATH=/kubernetes-cassandra.jar
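One way to double-check that the jar on the classpath really contains the provider class is to look for its entry in the archive. The helper below is only a sketch: class_to_entry is a hypothetical name, and it just converts a class name into the path that class would have inside a jar.

```sh
#!/bin/sh
# Convert a fully qualified Java class name into its entry path in a jar,
# e.g. a.b.C -> a/b/C.class
class_to_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr . /)"
}

class_to_entry io.k8s.cassandra.KubernetesSeedProvider
# prints io/k8s/cassandra/KubernetesSeedProvider.class
```

Assuming unzip is available in the image, the result could be used as `unzip -l /kubernetes-cassandra.jar | grep -F "$(class_to_entry io.k8s.cassandra.KubernetesSeedProvider)"`.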

munnerz commented 6 years ago

Looking at this example, it appears we can also configure seeds through environment variables - I'm not sure if these can be changed at runtime through some API, but if so, Pilot could automatically configure the seeds dynamically.

https://github.com/vyshane/cassandra-kubernetes/blob/master/image/custom-entrypoint.sh

munnerz commented 6 years ago

👍 re being dropped into the image already (out-of-order messages)

kragniz commented 6 years ago

I don't think they can be updated during runtime.

There seems to be some confusion about seed nodes. I came across this thread that lists some problems people have with the seed nodes list: https://issues.apache.org/jira/browse/CASSANDRA-5836

wallrj commented 6 years ago

See also https://github.com/kubernetes/kubernetes/issues/24286 and https://github.com/kubernetes/kubernetes/issues/27239 for discussion of Kubernetes Seed provider.

Perhaps we can help with the effort to create a better (tested) kubernetes seed provider.

wallrj commented 6 years ago

And discussion of whether to put the Kubernetes seed provider code in a separate repo: https://github.com/kubernetes/kubernetes/issues/37408

kragniz commented 6 years ago

I got io.k8s.cassandra.KubernetesSeedProvider to load locally (not in a container) while watching it under a debugger, so it seems like it's a problem with the configuration in that container rather than the seed provider itself.

Also of note: the seed provider will not work with cassandra 4.0 (the SeedProvider interface changes from List<InetAddress> to List<InetAddressAndPort>).

kragniz commented 6 years ago

Working with a local build of kubernetes-cassandra.jar and the upstream docker library/cassandra:

$ docker run \
    -v "$(pwd)"/kubernetes-cassandra.jar:/k8s/kubernetes-cassandra.jar \
    library/cassandra \
    sh -c 'sed -ri "s/org.apache.cassandra.locator.SimpleSeedProvider/io.k8s.cassandra.KubernetesSeedProvider/" \
    $CASSANDRA_CONFIG/cassandra.yaml && \
    CLASSPATH=/k8s/kubernetes-cassandra.jar exec /docker-entrypoint.sh cassandra -f'

Vince-Cercury commented 6 years ago

I'm also interested in the Seed Provider. I'm using the Cassandra for Kubernetes example and trying to make it production ready in AWS. I came across your project.

I'm trying to figure out:

Seeds: how do we define how many there are, given the example statefulset?
What happens without the KubernetesSeedProvider, and what role does it play?

I'm also interested in a snitch for Kubernetes that is AZ-aware (an AZ would, as I understand it, become a rack in Cassandra terminology).

I can help with testing or anything else, let me know.

kragniz commented 6 years ago

@VinceMD we're going to define which nodes are seeds via labeling the pods with an external controller, and creating a headless service that selects on that label.

Without KubernetesSeedProvider, you'll need to have another way of telling cassandra about which of your nodes are seed nodes. Other cassandra examples with kubernetes seem to mark every node as a seed, which might be okay for small clusters, but it makes the gossip protocol inefficient.

We're planning on using GossipingPropertyFileSnitch for the snitch, setting the datacenter and rack before cassandra starts up (based on where the pod gets scheduled to). We might create a custom kubernetes snitch at some point in the future, but GossipingPropertyFileSnitch seems to fit our needs for now.
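As a rough sketch of the "set the datacenter and rack before cassandra starts" step: GossipingPropertyFileSnitch reads dc and rack from cassandra-rackdc.properties, so a startup script could render that file first. The function name render_rackdc and the example values are hypothetical, and how the zone reaches the pod is left open here.

```sh
#!/bin/sh
# Print cassandra-rackdc.properties content for a datacenter and rack.
# $1 = datacenter name, $2 = rack name (for example the zone the pod is in).
render_rackdc() {
    printf 'dc=%s\nrack=%s\n' "$1" "$2"
}

# Demo with made-up values.
render_rackdc eu-west-1 eu-west-1a
```

In an entrypoint the output would be redirected to "$CASSANDRA_CONFIG/cassandra-rackdc.properties" before cassandra is started.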

Vince-Cercury commented 6 years ago

Thanks. I've made some progress. Sharing in case any of this is useful to you guys, because I'd like to see one way of setting up cassandra that is production ready and supported by the community.

For the Snitch, since we run Kubernetes on AWS, I've started using the EC2Snitch, but it is of course only for AWS.

About the seeds: the Cassandra on Kubernetes example defines cassandra-0 as the only seed node (which is not enough). The seed provider configuration in cassandra.yaml is replaced at runtime by the value of an environment variable set in the statefulset (https://github.com/kubernetes/examples/blob/master/cassandra/cassandra-statefulset.yaml). It takes a comma-separated list.

To avoid issues with the order in which nodes start and get bound to Availability Zones (AZ) due to the use of EBS (hard to explain here, see my message at https://stackoverflow.com/questions/48698454/kubernetes-stateful-set-az-and-volume-claims-what-happens-when-an-az-fails), I've split the statefulset into 3:

statefulset-2a
statefulset-2b
statefulset-2c

I define 1 seed node per AZ which is more than enough and not too much:

- name: CASSANDRA_SEEDS
  value: "cassandra-2a-0.cassandra.cass-ha.svc.cluster.local, cassandra-2b-0.cassandra.cass-ha.svc.cluster.local, cassandra-2c-0.cassandra.cass-ha.svc.cluster.local"

Where "cass-ha" is the name of my namespace.
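For illustration, the same seed list can be generated rather than typed by hand. This is a sketch that assumes the naming pattern above (pod 0 of each per-AZ statefulset, behind a service called cassandra); build_seeds is a hypothetical helper, and it joins hosts with a bare comma instead of ", ".

```sh
#!/bin/sh
# Build a comma-separated seed list with one seed (pod ordinal 0) per AZ.
# $1 = namespace, remaining arguments = AZ suffixes used in the statefulset names.
build_seeds() {
    ns=$1
    shift
    seeds=""
    for az in "$@"; do
        host="cassandra-${az}-0.cassandra.${ns}.svc.cluster.local"
        # Prepend a comma only when the list already has an entry.
        seeds="${seeds:+${seeds},}${host}"
    done
    printf '%s\n' "$seeds"
}

build_seeds cass-ha 2a 2b 2c
```

Adding or removing an AZ then only means changing the argument list.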

The trick is that you must let the cassandra-2a-0 node fully boot up before creating the other 2 statefulsets, to avoid a split brain in the cluster. Cassandra used to dislike 2 nodes being added at the same time, but from my reading of the issues, this seems to have been solved, and I have not had problems. So once the cassandra-2a-0 node is fully started, there is no problem letting the other nodes boot and join the ring in parallel.

I've also got replication-factor=3, so I get a full copy of the data in each AZ (rack). I've done some tests simulating an entire AZ going down, and the cluster stays fine and the data consistent. I use LOCAL_QUORUM for read and write consistency, which means 2 replicas of the data must be available: floor(3/2) + 1 = 2.
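The quorum arithmetic, spelled out as a small sketch (the function names are made up for illustration). With one replica per AZ and a replication factor of 3, losing one AZ still leaves the 2 replicas that a quorum needs.

```sh
#!/bin/sh
# Quorum size for a replication factor: floor(RF / 2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of replicas that can be unavailable while a quorum is still reachable.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

quorum 3              # prints 2
tolerated_failures 3  # prints 1
```

Note that LOCAL_QUORUM is computed from the replication factor of the local datacenter only.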

fmehrdad commented 5 years ago

Has anyone been able to get this to work? I get:

Fatal configuration error; unable to start server.  See log for stacktrace.
org.apache.cassandra.exceptions.ConfigurationException: io.k8s.cassandra.KubernetesSeedProvider
Fatal configuration error; unable to start server.  See log for stacktrace.
        at org.apache.cassandra.config.DatabaseDescriptor.applySeedProvider(DatabaseDescriptor.java:895)
        at org.apache.cassandra.config.DatabaseDescriptor.applyAll(DatabaseDescriptor.java:324)
        at org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:142)
        at org.apache.cassandra.service.CassandraDaemon.applyConfig(CassandraDaemon.java:647)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:582)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:691)
ERROR 23:50:42 Exception encountered during startup
org.apache.cassandra.exceptions.ConfigurationException: io.k8s.cassandra.KubernetesSeedProvider
Fatal configuration error; unable to start server.  See log for stacktrace.
        at org.apache.cassandra.config.DatabaseDescriptor.applySeedProvider(DatabaseDescriptor.java:895) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.config.DatabaseDescriptor.applyAll(DatabaseDescriptor.java:324) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:142) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.service.CassandraDaemon.applyConfig(CassandraDaemon.java:647) [apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:582) [apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:691) [apache-cassandra-3.11.2.jar:3.11.2]

jetstack / navigator

Use the Kubernetes seed provider for Cassandra #223