AlexPikalov / cdrs

Cassandra DB native client written in Rust. Find 1.x versions at https://github.com/AlexPikalov/cdrs/tree/v.1.x. Looking for an async version? Check the WIP at https://github.com/AlexPikalov/cdrs-async
Apache License 2.0
343 stars, 58 forks

Create connection from a string #67

Open Keats opened 7 years ago

Keats commented 7 years ago

As mentioned on Reddit, I'm planning to add support for migrations for Cassandra in https://github.com/Keats/dbmigrate

Would it be possible to add a method to cdrs to get a connection from a string like cassandra://username:password@host:port/keyspace?

AlexPikalov commented 7 years ago

@Keats It seems to make sense to provide this as a method of CDRS in non-SSL mode. Non-SSL because, apart from the address and credentials, an SSL connection also needs a certificate. Also, it should obviously use the password authenticator.

So, it might look like:

impl CDRS {
    #[cfg(not(feature="ssl"))]
    fn from_string(connection_string: &str)
}

USAGE:

CDRS::from_string("cassandra://username:password@host:port/keyspace")

There is one thing here I don't understand at the moment: how this keyspace will work. On the one hand, to apply this keyspace to each query by default we'll need to keep it somewhere inside CDRS. On the other hand, it can be set from:

To support this feature (keyspace) we may want some smart merge strategy.

Until that is clear, we could provide a from_string that accepts a connection string without a keyspace: cassandra://username:password@host:port
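To make the keyspace-less form concrete, here is a minimal sketch of how such a string could be split into its parts using only the standard library. The names `ConnectionParams` and `parse_connection_string` are hypothetical and not part of the cdrs API; the sketch does no percent-decoding and rejects any trailing /keyspace path, in line with the proposal above.

```rust
/// Parts of a `cassandra://username:password@host:port` connection string.
/// Hypothetical type for illustration; cdrs does not define it.
#[derive(Debug, PartialEq)]
struct ConnectionParams {
    username: String,
    password: String,
    host: String,
    port: u16,
}

/// Hypothetical helper: parse a connection string that has no keyspace part.
fn parse_connection_string(s: &str) -> Result<ConnectionParams, String> {
    // The scheme is mandatory.
    let rest = s
        .strip_prefix("cassandra://")
        .ok_or_else(|| "expected the cassandra:// scheme".to_string())?;

    // Split at the last '@' so that a password containing '@' still parses.
    let (creds, addr) = rest
        .rsplit_once('@')
        .ok_or_else(|| "missing '@' between credentials and address".to_string())?;

    // Split at the first ':' so that a password containing ':' still parses.
    let (username, password) = creds
        .split_once(':')
        .ok_or_else(|| "missing ':' between username and password".to_string())?;

    // The keyspace question is still open, so a path is rejected for now.
    if addr.contains('/') {
        return Err("a keyspace path is not supported yet".to_string());
    }

    // Split at the last ':' to separate the host from the port.
    let (host, port) = addr
        .rsplit_once(':')
        .ok_or_else(|| "missing ':' between host and port".to_string())?;
    if username.is_empty() || host.is_empty() {
        return Err("username and host must not be empty".to_string());
    }
    let port = port
        .parse::<u16>()
        .map_err(|e| format!("invalid port: {}", e))?;

    Ok(ConnectionParams {
        username: username.to_string(),
        password: password.to_string(),
        host: host.to_string(),
        port,
    })
}
```

A from_string method could call a helper like this and then open a plain TCP connection with the password authenticator; how that wiring would look inside CDRS is left open here.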

ernestas-poskus commented 7 years ago

It would also be cool if one could select the keyspace in advance so that it does not have to be repeated in future queries.

e.g.: select * from keyspace_remembered.table ..

AlexPikalov commented 7 years ago

@ernestas-poskus It's already possible via a query that sets the keyspace (USE).

https://github.com/AlexPikalov/cdrs#use-query

http://docs.datastax.com/en/cql/3.1/cql/cql_reference/use_r.html

Keats commented 7 years ago

Any update on that? I don't know about the keyspace itself as I've never used Cassandra myself. It seems similar to a schema in SQL databases? I was just looking at https://github.com/mattes/migrate/tree/master/driver/cassandra as a reference. Maybe the Golang driver can be used to see how they implement it?

harrydevnull commented 7 years ago

@Keats: cassandra://username:password@host:port assumes that we have only one host; Cassandra in production would have an array of hosts. It is a ring topology. At any point in time a node can die or a node can be added to the cluster. The client would (not supported by cdrs currently) poll the cluster periodically to see which nodes are up or down and add or remove node IPs from its memory.

Say we have a 4-node cluster (10.10.10.10, 10.10.10.11, 10.10.10.12, 10.10.10.13) with a replication factor of 2, so the data on 10.10.10.10 is copied to 10.10.10.11, and similarly 10.10.10.12 and 10.10.10.13 are a pair. If we provide cassandra://cdrs:cdrpassword@10.10.10.10:9042 as the initial Cassandra connection string, there is no real guarantee that 10.10.10.10 will stay alive forever. But since a copy of the data on 10.10.10.10 is on 10.10.10.11, the Cassandra server would serve the data from 10.10.10.11, and the client application using the driver shouldn't have to worry about this, as the driver would abstract this transition behind the scenes.
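The failover idea above can be sketched as follows: given several contact points, try each in turn and keep the first one that answers. This is only an illustration of the concept, not something cdrs provides; `connect_to_first_available` is a hypothetical name, and the actual connection step is passed in as a closure so the sketch stays independent of any driver API.

```rust
/// Try each contact point in order and return the first successful connection.
/// If every attempt fails, return all collected errors, in attempt order.
/// Hypothetical helper for illustration; not part of cdrs.
fn connect_to_first_available<C, E, F>(
    contact_points: &[&str],
    mut connect: F,
) -> Result<C, Vec<E>>
where
    F: FnMut(&str) -> Result<C, E>,
{
    let mut errors = Vec::new();
    for &addr in contact_points {
        match connect(addr) {
            // Stop at the first node that accepts the connection.
            Ok(conn) => return Ok(conn),
            // Remember why this node failed and move on to the next one.
            Err(e) => errors.push(e),
        }
    }
    Err(errors)
}
```

With the standard library, the closure could be something like `|addr| std::net::TcpStream::connect(addr)`. A real driver would go further and refresh its list of nodes while running, which this sketch does not attempt.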

I know I have gone off on a totally different tangent with my explanation, but does it make any sense?

Keats commented 7 years ago

I see, and I guess the cluster needs to agree on a schema, so you probably need special handling to wait until that happens before running the next one (in my case, a CLI to run schema migrations). Thanks for the explanation.

harrydevnull commented 7 years ago

Yes, precisely! Providing an array of hosts in a string (I deal with a Cassandra cluster with 50 nodes) seems less ergonomic.

Fun fact

On a side note, every node knows about all the other nodes, so mentioning a single node would be enough to execute CQL statements. There, I contradicted myself! Developers who want to run an application in production should not have a single-node configuration, period! If it is a tool which runs once in a while and it fails, we can change the host and retry. I don't know how we could incorporate these two orthogonal features into the same library.

cmsd2 commented 4 years ago

Doesn't the driver discover the other nodes in the ring from the initial node(s) it connects to?

AlexPikalov commented 4 years ago

@cmsd2 Not yet. So far there is only an option to exclude nodes that went down, based on server events.