Open martin-sucha opened 1 year ago
I would like to work on this issue.
When trying to integrate the testcontainers-go, go get github.com/testcontainers/testcontainers-go
, I’m getting errors:
../../go/pkg/mod/golang.org/x/crypto@v0.22.0/curve25519/curve25519_go120.go:9:8: cannot find module providing package crypto/ecdh
../../go/pkg/mod/google.golang.org/protobuf@v1.33.0/internal/editiondefaults/defaults.go:9:8: cannot find module providing package embed
../../go/pkg/mod/github.com/testcontainers/testcontainers-go@v0.31.0/container.go:17:2: found packages archive (archive.go) and main (example_changes.go) in /home/stanislav/go/pkg/mod/github.com/docker/docker@v25.0.5+incompatible/pkg/archive
../../go/pkg/mod/github.com/sirupsen/logrus@v1.9.3/terminal_check_unix.go:6:8: found packages unix (affinity_linux.go) and main (mkasm.go) in /home/stanislav/go/pkg/mod/golang.org/x/sys@v0.19.0/unix
../../go/pkg/mod/github.com/testcontainers/testcontainers-go@v0.31.0/docker.go:14:2: cannot find module providing package io/fs
../../go/pkg/mod/github.com/docker/docker@v25.0.5+incompatible/api/types/network/ipam.go:6:2: cannot find module providing package net/netip
The errors indicate issues with dependencies and modules that are required by the testcontainers-go package. One of the errors, cannot find module providing package crypto/ecdh
, indicates that the crypto/ecdh package, which is part of the standard library starting from Go 1.20.
The testcontainers-go package requires Go version 1.21 or higher, which is not compatible with the current Go version 1.13.
Is the Go version for the gocql project going to be upgraded? If so, I could proceed with integrating testcontainers-go. If no, I could explore alternative approaches or improve current setup of Cassandra Cluster Managament (cmm)?
I am also open to any suggestions for this situation.
It's fine to bump the minimum required Go version to 1.21, we have the following notice in the README:
Sunsetting Model
In general, the gocql team will focus on supporting the current and previous versions of Go. gocql may still work with older versions of Go, but official support for these versions will have been sunset.
I've been working on replacing the Cassandra cluster creation mechanism for running integration tests. I have already set up a basic multi-node Cassandra cluster using Testcontainers-go, and many tests pass successfully except for TestDiscoverViaProxy
.
If I understand correctly, this test ensures that the driver:
However, it appears that the proxy address is being stored in the connection pool instead of the actual Cassandra node address. I suspected there might be an issue with my setup and decided to use the existing CCM script. By default, CCM creates nodes on localhost, and each new cluster starts with a node with the IP 127.0.0.1. Consequently, each cluster ends up with hosts [127.0.0.1, 127.0.0.2, 127.0.0.3]. We then set up a proxy server with the address 127.0.0.1. This makes it hard to notice that the proxy server is being stored in the connection pool instead of the first Cassandra node because of the similar IPs (i.e., 127.0.0.1). I changed the IP addresses of the nodes to start from 127.0.0.3 and added some debugging. Here's my revised test case with debugging:
go
func TestDiscoverViaProxy(t *testing.T) {
// This (complicated) test tests that when the driver is given an initial host
// that is infact a proxy it discovers the rest of the ring behind the proxy
// and does not store the proxies address as a host in its connection pool.
// See <https://github.com/gocql/gocql/issues/481>
clusterHosts := getClusterHosts()
proxy, err := net.Listen("tcp", "localhost:0")
if err != nil {
t.Fatalf("unable to create proxy listener: %v", err)
}
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
var (
mu sync.Mutex
proxyConns []net.Conn
closed bool
)
go func() {
cassandraAddr := JoinHostPort(clusterHosts[0], 9042)
t.Log("cassandraAddr is ", cassandraAddr)
cassandra := func() (net.Conn, error) {
return net.Dial("tcp", cassandraAddr)
}
proxyFn := func(errs chan error, from, to net.Conn) {
_, err := io.Copy(to, from)
if err != nil {
errs <- err
}
}
// handle dials cassandra and then proxies requests and reponsess. It waits
// for both the read and write side of the TCP connection to close before
// returning.
handle := func(conn net.Conn) error {
cass, err := cassandra()
if err != nil {
return err
}
defer cass.Close()
errs := make(chan error, 2)
go proxyFn(errs, conn, cass)
go proxyFn(errs, cass, conn)
select {
case <-ctx.Done():
return ctx.Err()
case err := <-errs:
return err
}
}
for {
// proxy just accepts connections and then proxies them to cassandra,
// it runs until it is closed.
conn, err := proxy.Accept()
if err != nil {
mu.Lock()
if !closed {
t.Error(err)
}
mu.Unlock()
return
}
mu.Lock()
proxyConns = append(proxyConns, conn)
mu.Unlock()
go func(conn net.Conn) {
defer conn.Close()
if err := handle(conn); err != nil {
mu.Lock()
if !closed {
t.Error(err)
}
mu.Unlock()
}
}(conn)
}
}()
proxyAddr := proxy.Addr().String()
t.Log("proxyAddr is ", proxyAddr)
cluster := createCluster()
cluster.NumConns = 1
// initial host is the proxy address
cluster.Hosts = []string{proxyAddr}
session := createSessionFromCluster(cluster, t)
defer session.Close()
// we shouldnt need this but to be safe
time.Sleep(1 * time.Second)
session.pool.mu.RLock()
t.Log("clusterHosts are ", clusterHosts)
for _, host := range clusterHosts {
found := false
for _, hi := range session.pool.hostConnPools {
t.Log("host is ", host)
t.Log("hostConnPool is ", hi.host)
if hi.host.RPCAddress().String() == host {
found = true
break
}
}
if !found {
t.Errorf("missing host in pool after discovery: %q", host)
}
}
session.pool.mu.RUnlock()
mu.Lock()
closed = true
if err := proxy.Close(); err != nil {
t.Log(err)
}
for _, conn := range proxyConns {
if err := conn.Close(); err != nil {
t.Log(err)
}
}
mu.Unlock()
}
Here’s the output:
=== RUN TestDiscoverViaProxy
cassandra_test.go:3001: proxyAddr is 127.0.0.1:34435
cassandra_test.go:2934: cassandraAddr is 127.0.0.31:9042
cassandra_test.go:3016: clusterHosts are [127.0.0.31 127.0.0.32 127.0.0.33]
cassandra_test.go:3021: host is 127.0.0.31
cassandra_test.go:3022: hostConnPool is [HostInfo hostname="127.0.0.32" connectAddress="127.0.0.32" peer="127.0.0.32" rpc_address="127.0.0.32" broadcast_address="<nil>" preferred_ip="<nil>" connect_addr="127.0.0.32" connect_addr_source="connect_address" port=9042 data_centre="datacenter1" rack="rack1" host_id="0c59d814-4235-400b-be07-41bf2934590f" version="v4.0.8" state=UP num_tokens=1]
cassandra_test.go:3021: host is 127.0.0.31
cassandra_test.go:3022: hostConnPool is [HostInfo hostname="127.0.0.33" connectAddress="127.0.0.33" peer="127.0.0.33" rpc_address="127.0.0.33" broadcast_address="<nil>" preferred_ip="<nil>" connect_addr="127.0.0.33" connect_addr_source="connect_address" port=9042 data_centre="datacenter1" rack="rack1" host_id="d614ea24-0708-4b8d-998b-51e374d711ed" version="v4.0.8" state=UP num_tokens=1]
cassandra_test.go:3021: host is 127.0.0.31
cassandra_test.go:3022: hostConnPool is [HostInfo hostname="127.0.0.1" connectAddress="127.0.0.1" peer="<nil>" rpc_address="127.0.0.31" broadcast_address="127.0.0.31" preferred_ip="<nil>" connect_addr="127.0.0.1" connect_addr_source="connect_address" port=34435 data_centre="datacenter1" rack="rack1" host_id="02ec5451-6627-488a-a2a2-8c9b54f822e3" version="v4.0.8" state=UP num_tokens=1]
cassandra_test.go:3030: missing host in pool after discovery: "127.0.0.31"
cassandra_test.go:3021: host is 127.0.0.32
cassandra_test.go:3022: hostConnPool is [HostInfo hostname="127.0.0.33" connectAddress="127.0.0.33" peer="127.0.0.33" rpc_address="127.0.0.33" broadcast_address="<nil>" preferred_ip="<nil>" connect_addr="127.0.0.33" connect_addr_source="connect_address" port=9042 data_centre="datacenter1" rack="rack1" host_id="d614ea24-0708-4b8d-998b-51e374d711ed" version="v4.0.8" state=UP num_tokens=1]
cassandra_test.go:3021: host is 127.0.0.32
cassandra_test.go:3022: hostConnPool is [HostInfo hostname="127.0.0.1" connectAddress="127.0.0.1" peer="<nil>" rpc_address="127.0.0.31" broadcast_address="127.0.0.31" preferred_ip="<nil>" connect_addr="127.0.0.1" connect_addr_source="connect_address" port=34435 data_centre="datacenter1" rack="rack1" host_id="02ec5451-6627-488a-a2a2-8c9b54f822e3" version="v4.0.8" state=UP num_tokens=1]
cassandra_test.go:3021: host is 127.0.0.32
cassandra_test.go:3022: hostConnPool is [HostInfo hostname="127.0.0.32" connectAddress="127.0.0.32" peer="127.0.0.32" rpc_address="127.0.0.32" broadcast_address="<nil>" preferred_ip="<nil>" connect_addr="127.0.0.32" connect_addr_source="connect_address" port=9042 data_centre="datacenter1" rack="rack1" host_id="0c59d814-4235-400b-be07-41bf2934590f" version="v4.0.8" state=UP num_tokens=1]
cassandra_test.go:3021: host is 127.0.0.33
cassandra_test.go:3022: hostConnPool is [HostInfo hostname="127.0.0.33" connectAddress="127.0.0.33" peer="127.0.0.33" rpc_address="127.0.0.33" broadcast_address="<nil>" preferred_ip="<nil>" connect_addr="127.0.0.33" connect_addr_source="connect_address" port=9042 data_centre="datacenter1" rack="rack1" host_id="d614ea24-0708-4b8d-998b-51e374d711ed" version="v4.0.8" state=UP num_tokens=1]
--- FAIL: TestDiscoverViaProxy (4.98s)
The output shows that the proxy address (127.0.0.1:34435) is being stored as a host in the connection pool. The actual Cassandra node (127.0.0.31) is missing from the connection pool after discovery. Logs indicate that the proxy address is listed among the hostConnPools. They also confirm that the test fails because the proxy address is present, and the actual node (127.0.0.31) is missing.
I noticed from the logs that an RPC address of the replaced node is identified as 127.0.0.31, which is the original Cassandra node address. The RPC (Remote Procedure Call) address is the network address used by Cassandra nodes to communicate with clients. It is the address that clients use to send queries, retrieve data, and perform other operations. So I assume that the gocql driver appears to be working correctly by using the RPC address for internal communications. In this case, I believe the test case should compare the RPCAddress instead of the ConnectAddress to verify the discovered hosts (i.e., hi.host.RPCAddress().String() == host
instead of ConnectAddress
). Could you please confirm if this is correct, as I don't have much experience with this yet.
When I run this test using my cluster setup with testcontainers-go, I encounter an interesting moment: Cassandra version 4.0.8 fails to resolve the correct RPC address, whereas versions 4.1.1 and above operate correctly. So far, I have discovered that in the Cassandra 4.0.x series, the RPC addresses are not being discovered by the gocql driver. This is likely a bug or limitation in the earlier versions of Cassandra, leading to the proxy address being stored in the connection pool with an incorrect RPC address (0.0.0.0).
As far as I read, setting the rpc_address
to the node's specific address is generally acceptable and often recommended for production environments. It provides more control and avoids potential issues related to binding to all interfaces. Binding to a specific address reduces the risk of exposing the RPC service to unintended networks, enhancing security. It ensures that Cassandra binds to the correct network interface, avoiding issues where multiple interfaces might cause confusion.
By the way, the CCM script sets the RPC address as the node IP. So I manually added the rpc addresses to each cassandra node in my setup implementation as well.
Could you please confirm my approach to the TestDiscoverViaProxy
test and the resolution strategy I outlined? I would greatly appreciate your insights on these points.
Also, I have added a draft PR with the testcontainers-go runner. It is nearly complete, pending the test case I discribed above, removal of all CCM relations, adding the authentication test case, and perhaps some minor adjustments. Please review it, and let me know how the code looks so far. ^ @martin-sucha @Zariel
I have finalized the PR, so please take a look.
While working on it, I encountered an error message:
Error: ../../../go/pkg/mod/github.com/docker/docker@v25.0.5+incompatible/pkg/archive/archive.go:783:24: undefined: strings.CutPrefix
Error: Process completed with exit code 1.
The error, undefined: strings.CutPrefix
, indicates that the code is using the strings.CutPrefix
function, which was introduced in Go 1.20. Currently, the Go version is set to 1.19 in the GitHub Actions workflow, so this function is not available.
To fix this, I have updated the GitHub Actions workflow to ensure that the Go version is set to 1.20 or higher, removing 1.19 from the strategy matrix. Additionally, we should update the project description to reflect that Go 1.19 is no longer supported under the Supported Versions section.
^ @martin-sucha
Restarting this discussion over here but feel free to comment here if you can't comment on the JIRA
Currently integration tests use ccm to setup a Cassandra cluster for testing.
It is not easy to simulate things like topology changes, nodes joining, leaving, re-joining with a different IP address, etc.
We could rewrite integration test runner using Docker API directly or using a library like testcontainers-go.