pchalamet / cassandra-sharp

high performance .NET driver for Apache Cassandra

Exception “Can't find any valid endpoint” #92

Closed dynameat closed 7 years ago

dynameat commented 7 years ago

Hi! When I execute the sample client code as-is, I get an error on any operation after opening a connection to the cluster, such as this Execute command:

using (ICluster cluster = ClusterManager.GetCluster("Cassandra"))
{
    ICqlCommand cmd = cluster.CreatePocoCommand();
    Task t = cmd.Execute("SELECT * FROM system_schema.keyspaces").AsFuture();
    // ...
}

When I debug this line, I can see all of my endpoints in the object tree (ICluster -> SingleConnectionPerEndpointStrategy -> _endpointStrategy -> NearestEndpointStrategy -> bannedEndpoints, with the list of my IP addresses), and 0 endpoints are healthy.

Moreover, when I try to connect to any of my hosts with the cqlplus.exe utility, I get the same error. With the /dbglog flag it shows messages like this:

  Creating connection to 192.168.1.3
    Readyfying connection for 192.168.1.3
    Starting writing frame for stream 127@192.168.1.3
    Done writing frame for stream 127@192.168.1.3
    Failed building connection System.AggregateException: One or more errors occurred. ---> System.OperationCanceledException: The operation was canceled.
    --- End of inner exception stack trace ---
    at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
    at CassandraSharp.Transport.LongRunningConnection.ReadifyConnection() in ...\LongRunningConnection.cs:line 444
    ... Error creating transport for endpoint 192.168.1.3 : One or more errors occurred.
    ... - marking 192.168.1.3 for recovery
    Command execution failed with error: Can't find any valid endpoint

I use the latest version of the cassandra-sharp driver (3.7.0) and Cassandra version 3.11.0. My App.config file seems OK, and the code is taken straight from the sample, so the problem is probably in the cluster configuration. The network is simple, with one switch; everything pings, and all Cassandra ports are open in the Windows firewall on the client PC. I can add any additional information from cassandra.yaml if needed; for example, I use PropertyFileSnitch, start_native_transport: true, and rpc_interface and listen_interface point to my actual network interface. If rpc_address and listen_address are set to 192.168.1.3, the error is the same. If rpc_address and listen_address are localhost, my Cassandra instance writes an error on startup (something about wrong seeds; I can add this if you need). In the cqlsh console everything works fine, and nodetool status shows Up/Normal.
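For reference, here is a minimal sketch of the cassandra.yaml settings that matter for remote client connections. The node address 192.168.1.3 is taken from the log above; the other values are the stock defaults and are shown only as an illustration (note that Cassandra accepts either rpc_address or rpc_interface, not both):

```yaml
# cassandra.yaml (excerpt) - settings relevant to client connectivity
listen_address: 192.168.1.3        # inter-node traffic
rpc_address: 192.168.1.3           # client (native protocol) traffic
start_native_transport: true
native_transport_port: 9042        # default CQL port clients connect to
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "192.168.1.3"
```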

So, is something wrong with the configuration of my cluster or with the client driver, and how can I solve it? Thank you for your reply!

pchalamet commented 7 years ago

Hello, I haven't updated this driver in a long time and I'm not really sure it's compatible with C* 3... Give it a try with C* 2 to confirm. Also, maybe switch to the DataStax driver - it's probably a better fit for now, until I decide to work on this again!

dynameat commented 7 years ago

Okay, sorry for the noise! Your driver is good enough and provides the easiest way to work with C*. I will try something else if there's no other option. Thanks for the quick reply!

pchalamet commented 7 years ago

Anyway, this would be really interesting to investigate if you have some time - I would assist if you need help with a PR.

dynameat commented 7 years ago

Let me check my cluster with other drivers; if they connect without errors, then the current version of cassandra-sharp really doesn't work with C* 3+. Or do you want to find where the compatibility problem is and bring the driver up to date?

I don't think I'll go back to C* 2, because I've already modeled my data using materialized views. But for your investigation I can install version 2.2.10 (cassandra-sharp works with it 100%?) on one node and try to connect to it.

pchalamet commented 7 years ago

Both, indeed :-) Maybe it's not v3 but something else. If you just need help investigating, let me know.

dynameat commented 7 years ago

I started C* version 2.2.10 on one node, and cqlplus.exe connected to it successfully. Debugging the C# application, I see 1 healthy endpoint in the cluster object. So it is definitely a compatibility problem. I'm going to try the DataStax driver with my v3 cluster. What do you need now to investigate the problem?

pchalamet commented 7 years ago

Thanks for testing this compatibility problem. Maybe they deprecated the protocol version used in C* 2. For now, it's probably wiser to use the DataStax driver if v3 is involved.

I will read the v3 protocol spec just to try to understand what is going on. Thanks again.

dynameat commented 7 years ago

In Cassandra: The Definitive Guide (2nd Edition), Jeff Carpenter & Eben Hewitt say that the underlying storage engine was reworked in v3, so it may be very different from v2. Anyway, if you are going to update your driver, please tell me - maybe I'll find the DataStax driver not good enough and will be able to wait until you're done.

pchalamet commented 7 years ago

I guess the protocol must be upgraded to 3.1 or 3.3 to be compatible. This is probably not a big deal. I will look at it.

http://docs.datastax.com/en/landing_page/doc/landing_page/compatibility.html?scroll=compatibilityDocument__cql-versions

pchalamet commented 7 years ago

Yes, it's the protocol version that must be upgraded. ProtocolVersion is 1 for the moment. A bit late to the game, so... ;-)

dynameat commented 7 years ago

😄 So you will just change one module and everything will work fine?

pchalamet commented 7 years ago

The changes are as follows. Most of it is not a big deal (data size changes) and mostly optional. Some parts are more involved, like the AUTH changes. But it's largely doable, IMO.

  1. Changes from v2

    • The stream id is now 2 bytes long (a [short] value), so the header is now 1 byte longer (9 bytes total).
    • BATCH messages now have flags (like QUERY and EXECUTE) and a corresponding optional serial_consistency parameter (see Section 4.1.7).
    • User Defined Types and tuple types have been added to ResultSet metadata (see 4.2.5.2), and a new section on the serialization format of UDT and tuple values has been added to the documentation (Section 7).
    • The serialization format for collections has changed (both the collection size and the length of each element are now 4 bytes long). See Section 6.
    • QUERY, EXECUTE and BATCH messages can now optionally provide a default timestamp for the query. As this feature is optionally enabled by clients, implementing it is at the discretion of the client.
    • QUERY, EXECUTE and BATCH messages can now optionally provide names for the values of the query. As this feature is optionally enabled by clients, implementing it is at the discretion of the client.
    • The format of "Schema_change" results (Section 4.2.5.5) and "SCHEMA_CHANGE" events (Section 4.2.6) has been modified, and now includes changes related to user types.
  2. Changes from v1

    • The protocol is versioned to allow old clients to connect to a newer server. If a newer client connects to an older server, it needs to check whether it gets a ProtocolException on connection and, if so, retry with a lower version.
    • A query can now have bind variables even though the statement is not prepared; see Section 4.1.4.
    • A new BATCH message allows batching a set of queries (prepared or not); see Section 4.1.7.
    • Authentication now uses SASL. Concretely, the CREDENTIALS message has been removed and replaced by server/client challenge/response exchanges (done through the new AUTH_RESPONSE/AUTH_CHALLENGE messages). See Section 4.2.3 for details.
    • Query paging has been added (Section 7): QUERY and EXECUTE messages have an additional [int] and [bytes], and the Rows kind of RESULT message has an additional flag and value. Note that paging is optional, and a client that does not want to handle it can simply avoid including the Page_size flag and parameter in QUERY and EXECUTE.
    • QUERY and EXECUTE statements can request that the metadata be skipped in the returned result set (for efficiency reasons) if said metadata is known in advance. Furthermore, the result of a PREPARE (Section 4.2.5.4) now includes the metadata for the result of executing the statement just prepared (though this metadata will be empty for non-SELECT statements).
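The first change above (the wider stream id) is visible in the frame header itself. A minimal sketch in Python of packing a v1/v2 header versus a v3 header, using only the byte layout stated in the change list (the opcode and stream values are illustrative; 127 matches the stream id seen in the log above):

```python
import struct

# v1/v2 frame header: version(1) | flags(1) | stream(1, signed) | opcode(1) | length(4) = 8 bytes
def frame_header_v2(version, flags, stream, opcode, body_length):
    return struct.pack(">BBbBI", version, flags, stream, opcode, body_length)

# v3 frame header: version(1) | flags(1) | stream(2, signed short) | opcode(1) | length(4) = 9 bytes
def frame_header_v3(version, flags, stream, opcode, body_length):
    return struct.pack(">BBhBI", version, flags, stream, opcode, body_length)

print(len(frame_header_v2(0x02, 0x00, 127, 0x05, 0)))  # -> 8
print(len(frame_header_v3(0x03, 0x00, 127, 0x05, 0)))  # -> 9
```

A driver that still writes the 8-byte v1 layout against a server that only speaks v3+ will be misparsed immediately, which is consistent with the connection failing before any query runs.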
dynameat commented 7 years ago

It would be very good if you could do this in the near future, or whenever you're able... I hope it won't take too much of your time. Thanks for your work on it!

dynameat commented 7 years ago

Successfully connected to C* v3 with the DataStax driver, so I'll start using it for now... Thank you for the conversation and for the help with my problem.

MatthiasWeiser commented 7 years ago

Do you have any update on this issue?

pchalamet commented 7 years ago

Hello,

I have not looked at it since you had a workaround using the DataStax driver. That's the root cause anyway, so a bit of work is needed.

MatthiasWeiser commented 7 years ago

I am currently using the DataStax driver as well. I am not so happy with it, though - I do about 100+ million inserts per day via a prepared statement, and the load on the garbage collector is too high - so I wanted to compare it with your driver. The more I look into this, the more I am tempted to roll my own driver, which would support only my own prepared statement in a hardcoded style.

pchalamet commented 7 years ago

You will have GC load anyway, since serialization/deserialization occurs regardless (but it is probably slightly better here). But yeah, I have to look at it and implement the new protocol version...

pchalamet commented 7 years ago

I found time to look at it seriously. Only protocol v1 is supported, so it fails with a "protocol not supported" error server-side. That's what was expected.

Implementation of native protocol v4 will take place in issue #93.

pchalamet commented 7 years ago

Thanks for the report. Issue follow-up will be done in #83. Thanks!