apache / cassandra-gocql-driver

GoCQL Driver for Apache Cassandra®
https://cassandra.apache.org/
Apache License 2.0

Amazon Cassandra : Column family TableIdentifier(ksName=system, cfName=local) is not writable #1389

Closed nikhiljangi closed 4 years ago

nikhiljangi commented 4 years ago

Please answer these questions before submitting your issue. Thanks!

What version of Cassandra are you using?

Amazon Managed Cassandra - version 3.11.2

What version of Gocql are you using?

v0.0.0-20200103014340-68f928edb90a

What did you do?

Tried to connect to Amazon Managed Cassandra using gocql

clusterConfig := gocql.NewCluster("cassandra.us-east-1.amazonaws.com:9142")
clusterConfig.Authenticator = gocql.PasswordAuthenticator{Username: username, Password: password}
clusterConfig.SslOpts = &gocql.SslOptions{
	CaPath: "AmazonRootCA1.pem",
}

clusterConfig.Keyspace = "test"
session, err := clusterConfig.CreateSession()

What did you expect to see?

Successful gocql Session creation

What did you see instead?

{"error": "gocql: unable to create session: control: unable to setup connection: Column family TableIdentifier(ksName=system, cfName=local) is not writable"}


If you are having connectivity related issues please share the following additional information

Describe your Cassandra cluster

please provide the following information

peer | rpc_address
3.83.170.140 | null
3.83.168.142 | null
3.83.169.141 | null
3.83.170.152 | null
3.83.168.154 | null
3.83.169.153 | null
3.83.168.143 | null
3.83.170.151 | null
3.83.171.143 | null

2020-01-13T16:53:55.664-0500 ERROR filter/filter.go:65 error: {"error": "gocql: unable to create session: control: unable to setup connection: Column family TableIdentifier(ksName=system, cfName=local) is not writable"}

t2y commented 4 years ago

I also had the same issue.

Zariel commented 4 years ago

This is odd, we don't write to system.local only read from it.

t2y commented 4 years ago

According to my debugging, the failure occurs when gocql creates a prepared statement and sends it to MCS, as shown below.

row, err := c.query(ctx, "SELECT * FROM system.local WHERE key='local'").rowMap()
=> info, err = c.prepareStatement(ctx, qry.stmt, qry.trace)

However, I'm not sure why creating a prepared statement would cause a "not writable" error.

Zariel commented 4 years ago

I think that's a bug in AWS; this works in Cassandra.

tschirmer commented 4 years ago

Same issue here. I don't think this is a bug in AWS; the system.local query works within the AWS Query editor.


mattmassicotte commented 4 years ago

Running into this problem as well, and I made an interesting discovery on the AWS documentation site for MCS. It looks like they may have found this problem themselves during testing.

Currently, Go client drivers are not supported.

https://docs.aws.amazon.com/mcs/latest/devguide/cqlsh.html#using_driver

tschirmer commented 4 years ago

Yeah, I see that, but they've probably tried to use this driver and then marked it as incompatible because of the problems. At the end of the day, the driver is just preparing and communicating packets via TCP. If it works via Python, C#, and C++, we should be able to make it work in Go. It'd be good to see why the prepare statement is trying to write to the system table.

dahankzter commented 4 years ago

Could it be the events registration on the controlConn @Zariel ? It seems to be the only thing possible at this point in the path.

dahankzter commented 4 years ago

The Java driver's control connection does this as well, however, so it seems unlikely.

dahankzter commented 4 years ago

Anyway, perhaps the errors from the lower levels should also be decorated to pin down exactly where they happen. The newer Go error facilities could perhaps be used to wrap/unwrap them.

dahankzter commented 4 years ago

The server rejects prepared statements to the system tables and the driver always prepares when it can. Thx @slivne for finding it and testing it. Seems to be a server side issue.

Zariel commented 4 years ago

Can someone try in Java/cpp doing a prepare + execute on MCS with the query SELECT * from system.local and see if it works?

If they are blocking creating prepared statements on system tables that is very strange. I don't have any contacts at AWS to talk about this.

dahankzter commented 4 years ago

That would be great but unless I am lost in the callback code of the Java driver it does indeed look like it doesn't prepare any of the maintenance queries to system.* tables.

dahankzter commented 4 years ago

The patch that made it work for us was:

--- a/session.go
+++ b/session.go
@@ -1016,6 +1016,9 @@ func (q *Query) GetRoutingKey() ([]byte, error) {
 }

 func (q *Query) shouldPrepare() bool {
+        if strings.Contains(q.stmt, "system.") {
+           return false;
+        }

So it seems clear that there is something on the server stopping them.
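To make it easier to see what the substring check in the patch above would and would not match, here is a small standalone sketch. `looksLikeSystemQuery` is a hypothetical helper written for this example; it only reproduces the `strings.Contains` test and is not part of gocql.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeSystemQuery reproduces the substring test from the patch:
// any statement containing "system." is treated as a system-table query.
// Hypothetical helper for illustration only.
func looksLikeSystemQuery(stmt string) bool {
	return strings.Contains(stmt, "system.")
}

func main() {
	stmts := []string{
		"SELECT * FROM system.local WHERE key='local'",
		"SELECT * FROM test.users",
		// An unrelated keyspace whose name happens to end in "system".
		"SELECT * FROM ecosystem.events",
	}
	for _, stmt := range stmts {
		fmt.Printf("%q skipPrepare=%v\n", stmt, looksLikeSystemQuery(stmt))
	}
}
```

Note that the third statement also matches, because "ecosystem.events" contains "system.". A plain substring test is enough for diagnosing the issue but is loose as a general rule.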

tschirmer commented 4 years ago

@dahankzter That seems like a reasonable solution.

Prepared Statements make sense as a default, but for this instance turning them off for system calls might be our only option. We could always set it as an option in the config to turn off/on and document it for AWS Managed Cassandra?

dahankzter commented 4 years ago

I don't know if that's the way to go @tschirmer. If it's supported by Cassandra then AMC should also support it. Don't you think? This snippet was just something that was tried to diagnose the issue rather than as a proposed fix. What do you think @Zariel?

tschirmer commented 4 years ago

@dahankzter tl;dr My opinion is to add a config option to disable prepared statements for system tables; but default it to enabled, so we don't exclude managed / serverless technologies.

AWS has a couple of managed services (Serverless MySQL, for example) where they've locked down system variable changes so that they can create a service that's easily scalable. The cost of that is that users can't change or tune things themselves, but the upside is that they don't have to manage it.

I don't think we should be creating a driver that locks people out of serverless architecture because certain system values can't be written. In many cases where I've run SaaS infrastructure, we've had to lock out users from certain features so that we could easily manage it.

My opinion is that we make it an option that people can turn on or off; with the default of being on. Then add a comment that there are security risks if people start using it to write to system tables without bound variables (I'm thinking injection attacks).
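A minimal sketch of what such an opt-out could look like, assuming a hypothetical `DriverOptions` struct and field name (neither exists in gocql). Because Go's zero value for `bool` is `false`, naming the field as a "disable" switch keeps preparing enabled by default.

```go
package main

import "fmt"

// DriverOptions is a hypothetical options struct for this example;
// it is not gocql's ClusterConfig.
type DriverOptions struct {
	// DisablePrepareOnSystemTables turns off prepared statements for
	// queries against system tables. The zero value (false) keeps the
	// existing behavior of preparing whenever possible.
	DisablePrepareOnSystemTables bool
}

// shouldPrepare reports whether a query should be prepared, given
// whether it targets a system table.
func (o DriverOptions) shouldPrepare(isSystemQuery bool) bool {
	if isSystemQuery && o.DisablePrepareOnSystemTables {
		return false
	}
	return true
}

func main() {
	defaults := DriverOptions{}
	managed := DriverOptions{DisablePrepareOnSystemTables: true}

	fmt.Println(defaults.shouldPrepare(true)) // default: still prepares
	fmt.Println(managed.shouldPrepare(true))  // opt-out: skips prepare
	fmt.Println(managed.shouldPrepare(false)) // user tables are unaffected
}
```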

dahankzter commented 4 years ago

I don't know the implementers' motivations for this limitation, @tschirmer. It would be nice if someone from the AMC team could chime in to explain. We are not explicitly writing to the system tables, just preparing a query, so the argument seems a little strange and looks like a server-side implementation detail.

It's up to @Zariel to decide if we should accommodate special cases for implementations other than Cassandra itself which really is the reference.

slivne commented 4 years ago

@tschirmer the issue is that the driver is not generating a write (CQL INSERT/UPDATE); the driver is generating a read (CQL SELECT). So while I do understand the value in blocking writes to system tables, blocking reads done via prepared statements seems like a bug, not a feature.

Zariel commented 4 years ago

I would rather not add the somewhat hacky check in shouldPrepare to work around this, or a config option. I'm also somewhat surprised that the driver continued to work. I would like to hear from AWS why, as @slivne says, creating a prepared statement on a system table counts as a write even though we only execute a read.

tschirmer commented 4 years ago

@slivne it'd be nice to get more info from either the Cassandra team themselves, or the AWS Team. I'd like to know if Cassandra is executing a write while doing a prepared statement in the background. I don't like hacky jobs either; and in principle it shouldn't be conducting a write.

That said, personally, I'd rather have a working driver with a low-impact edge-case exclusion than not be able to use a managed service; the business payoff of using them is too large to pass up (I also understand this is a security concern, so I totally understand the hesitation).

I'm meeting with our AWS rep in a couple weeks; I can bring it up. Maybe someone else has contacts they can bring into this convo?

gaurish commented 4 years ago

We spoke to AWS about this. They said:

We have reached out to the service team regarding this issue and would like to resolve this issue as soon as we can.

Also, would it be possible for you to provide us a stack trace? That error message is ours, but it's correct. A client cannot and should not modify system tables, so we really need to know where the driver is doing that.

Will keep you guys updated with what the AWS Cassandra service team has to say.

danilop commented 4 years ago

Hi, I am from AWS! As @gaurish said, we are aware of the issue and working on a solution. We will update the GitHub issue when the fix has been deployed.

Zariel commented 4 years ago

Thanks @danilop!

sagarp-webonise commented 4 years ago

@danilop any update on the fix? When can we expect it to be deployed?

danilop commented 4 years ago

@sagarp-webonise Thank you for following up! I don’t have new information to share at the moment, the team is actively working on the driver support. We will update this issue when the driver support is available.

Dhaval08 commented 4 years ago

Meanwhile, is there an alternative way to access AWS MCS using golang? Is there some API that we could call in order to perform various operations?

danilop commented 4 years ago

We identified the cause of the warning messages customers were seeing when connecting to MCS using the gocql driver. We made a change that enables the gocql driver to connect to MCS without generating the warning messages about invalid peers. The change will apply to versions cd4b606 and newer of the gocql driver.

Please let us know if you see any further issues using the gocql driver with MCS.