akka / akka-persistence-cassandra

A replicated Akka Persistence journal backed by Apache Cassandra
https://doc.akka.io/libraries/akka-persistence-cassandra/current/

Test with AWS Keyspaces #865

Open patriknw opened 3 years ago

patriknw commented 3 years ago

We know some differences:

patriknw commented 3 years ago

Notes of my testing of Akka Persistence Cassandra with Keyspaces.

Functional Differences: Amazon Keyspaces (for Apache Cassandra) versus Apache Cassandra: https://docs.aws.amazon.com/keyspaces/latest/devguide/functional-differences.html

Supported consistency levels: https://docs.aws.amazon.com/keyspaces/latest/devguide/consistency.html

Setup

Console: https://eu-central-1.console.aws.amazon.com/keyspaces/home?region=eu-central-1#service

Create truststore according to: https://docs.aws.amazon.com/keyspaces/latest/devguide/using_java_driver.html#using_java_driver.BeforeYouBegin

This truststore is used via the JVM parameters -Djavax.net.ssl.trustStore=path_to_file/cassandra_truststore.jks -Djavax.net.ssl.trustStorePassword=amazon. The password is the one you chose when creating the truststore.

Create IAM user according to: https://docs.aws.amazon.com/keyspaces/latest/devguide/programmatic.credentials.html#programmatic.credentials.ssc

There are instructions for creating Keyspaces-specific credentials, but I never needed those later. I set the access key ID and secret access key of the IAM user as environment variables:

    export AWS_ACCESS_KEY=AKIA3WD6YIW5BUIBWFHC
    export AWS_SECRET_ACCESS_KEY=

These environment variables are not fully documented but the error message is pretty clear if they are not defined.

Add the authentication plugin (https://docs.aws.amazon.com/keyspaces/latest/devguide/using_java_driver.html#java_tutorial.SigV4) as a dependency: "software.aws.mcs" % "aws-sigv4-auth-cassandra-java-driver-plugin" % "4.0.2"

Config

Configure the authentication plugin in application.conf (CassandraLifecycle.config):

    datastax-java-driver { 
      basic.contact-points = [ "cassandra.eu-central-1.amazonaws.com:9142"]
      basic.request.consistency = LOCAL_QUORUM
      basic.load-balancing-policy {
        class = DefaultLoadBalancingPolicy
        local-datacenter = eu-central-1
      }
      advanced {
        auth-provider = {
          class = software.aws.mcs.auth.SigV4AuthProvider
          aws-region = eu-central-1
        }
        ssl-engine-factory {
          class = DefaultSslEngineFactory
        }
      }
    }

Note that the contact points must also be configured, which isn’t fully documented. You can find the endpoints for the different regions at https://docs.aws.amazon.com/keyspaces/latest/devguide/programmatic.endpoints.html

patriknw commented 3 years ago

Test results

It was difficult to run the tests because keyspace and table creation are asynchronous and have not finished when the returned Future completes. In the end I abandoned the automatic creation and created/dropped the tables with a CQL script via cqlsh.
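One alternative to a manual script would be to poll until the schema change has actually taken effect before starting a test. The following is only a minimal sketch of that idea, not what was done in this testing. The names `AwaitSchema` and `awaitReady` are hypothetical, and the readiness check is passed in as a function so the sketch has no driver dependency. As an assumption taken from the Keyspaces documentation rather than from this thread, a real check could query the `status` column of `system_schema_mcs.tables` and wait for `ACTIVE`.

```scala
import scala.annotation.tailrec
import scala.concurrent.duration._

object AwaitSchema {

  /** Polls `isReady` until it returns true or `maxAttempts` checks have been made.
    * At least one check is always performed.
    * Returns true if the resource became ready, false if the attempts ran out.
    */
  def awaitReady(maxAttempts: Int, delay: FiniteDuration)(isReady: () => Boolean): Boolean = {
    @tailrec
    def loop(attempt: Int): Boolean =
      if (isReady()) true
      else if (attempt >= maxAttempts) false
      else {
        // Blocking sleep is acceptable in test setup code; avoid it in production actors.
        Thread.sleep(delay.toMillis)
        loop(attempt + 1)
      }
    loop(1)
  }
}

// Example with a stubbed check that reports ready on the third poll.
// In a real test the function would run a CQL query instead (assumption, see above).
object AwaitSchemaExample extends App {
  private var polls = 0
  val ready = AwaitSchema.awaitReady(maxAttempts = 10, delay = 100.millis) { () =>
    polls += 1
    polls >= 3
  }
  println(s"ready=$ready after $polls polls")
}
```

Because table creation in Keyspaces was observed to be slow, the attempt count and delay would need to be generous, which lengthens the test run accordingly.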

Successful tests:

akka.persistence.cassandra.journal.BufferSpec
akka.persistence.cassandra.journal.CassandraJournalMetaSpec
akka.persistence.cassandra.journal.PubSubThrottlerSpec
akka.persistence.cassandra.journal.TagWriterSpec
akka.persistence.cassandra.journal.CassandraEventUpdateSpec
akka.persistence.cassandra.journal.CassandraSerializationSpec
akka.persistence.cassandra.journal.TagWritersSpec
akka.persistence.cassandra.journal.CassandraIntegrationSpec
akka.persistence.cassandra.journal.CassandraJournalSpec
akka.persistence.cassandra.journal.TimeBucketSpec
akka.persistence.cassandra.journal.CassandraJournalDeletionSpec
akka.persistence.cassandra.journal.TagScanningSpec
akka.persistence.cassandra.EventsByTagRestartSpec
akka.persistence.cassandra.RetriesSpec
akka.persistence.cassandra.healthcheck.CassandraHealthCheckCustomFailingQuerySpec
akka.persistence.cassandra.CassandraCorruptJournalSpec
akka.persistence.cassandra.CassandraQueryJournalSettingsSpec
akka.persistence.cassandra.EventsByTagCrashSpec
akka.persistence.cassandra.reconciler.TagQuerySpec
akka.persistence.cassandra.reconciler.DeleteTagViewForPersistenceIdSpec
akka.persistence.cassandra.sharding.ClusterShardingQuickTerminationSpec
akka.persistence.cassandra.query.EventsByPersistenceIdFastForwardSpec
akka.persistence.cassandra.query.EventsByPersistenceIdWithControlSpec
akka.persistence.cassandra.query.EventsByTagLongRefreshIntervalSpec
akka.persistence.cassandra.query.EventsByTagSpec
akka.persistence.cassandra.query.EventsByTagStageSpec
akka.persistence.cassandra.query.EventsByTagStrictBySeqNoManyInCurrentTimeBucketSpec
akka.persistence.cassandra.query.javadsl.CassandraReadJournalSpec
akka.persistence.cassandra.query.scaladsl.CassandraReadJournalSpec
akka.persistence.cassandra.query.CassandraQueryJournalOverrideSpec 
akka.persistence.cassandra.query.EventsByPersistenceIdMultiPartitionGapSpec
akka.persistence.cassandra.query.EventsByTagDisabledSpec
akka.persistence.cassandra.query.EventsByTagSpecBackTracking
akka.persistence.cassandra.query.EventsByTagStrictBySeqMemoryIssueSpec
akka.persistence.cassandra.query.EventsByTagZeroEventualConsistencyDelaySpec (282 s)
akka.persistence.cassandra.query.EventsByPersistenceIdSpec
akka.persistence.cassandra.query.EventAdaptersReadSpec
akka.persistence.cassandra.query.EventsByTagSpecBackTrackingLongRefreshInterval
akka.persistence.cassandra.query.EventsByTagStrictBySeqNoEarlyFirstOffsetSpec (85 s)

Failing tests (in order of importance):

Failing because truncate is not supported

Performance related tests that I didn't try much (they probably fail due to throttling):

akka.persistence.cassandra.journal.CassandraLoadTypedSpec
akka.persistence.cassandra.journal.CassandraJournalPerfSpec
akka.persistence.cassandra.journal.RecoveryLoadSpec
akka.persistence.cassandra.journal.ManyActorsLoadSpec
akka.persistence.cassandra.journal.StartupLoadSpec
akka.persistence.cassandra.journal.CassandraLoadSpec
akka.persistence.cassandra.EventsByTagStressSpec 
akka.persistence.cassandra.CassandraEventsByTagLoadSpec

ignasi35 commented 3 years ago

akka.persistence.cassandra.query.EventsByTagFindDelayedEventsSpec

Since the database is too slow to store the initial data, the conditions to run the assertions are not met. I'm rewriting this test.

ignasi35 commented 3 years ago

akka.persistence.cassandra.snapshot.CassandraSnapshotStoreSpec

The arguments to the prepared statement are in the wrong order. We're passing an int to a column that expects text and a String to a column that expects an int. +1 for Keyspaces (since Cassandra itself doesn't seem to care).

Edit: I think passing the arguments in the wrong order is intentional, but it does not leave the database in the state the test needs. I'm investigating a bit more. It just happens that Keyspaces is stricter and doesn't allow setting an empty String on a column of type int.

ignasi35 commented 3 years ago

akka.persistence.cassandra.cleanup.CleanupSpec

I've replaced table and keyspace truncation with drop/create table. The test is flaky because Keyspaces is very slow at creating and dropping tables, but it no longer fails consistently.

ignasi35 commented 3 years ago

akka.persistence.cassandra.healthcheck.CassandraHealthCheckCustomQueryEmptyResultSpec

Doesn't really need to use truncate; it only needs an empty table.

ignasi35 commented 3 years ago

akka.persistence.cassandra.query.TagViewSequenceNumberScannerSpec

Rewritten without truncate.

ignasi35 commented 3 years ago

akka.persistence.cassandra.healthcheck.CassandraHealthCheckDefaultQuerySpec but there are also other failures

I couldn't reproduce these other failures. Changing the default CQL query as mentioned above (health-check-cql = "SELECT * FROM system.local") worked for me.
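For reference, the override could look like the following in application.conf. This is a sketch that assumes the setting lives under `akka.persistence.cassandra.healthcheck`, as in the plugin's reference configuration; only the `health-check-cql` key and its value come from this thread.

```hocon
# Assumed location of the health check settings; verify against reference.conf.
akka.persistence.cassandra.healthcheck {
  # Query used by the health check, replacing the plugin's default query.
  health-check-cql = "SELECT * FROM system.local"
}
```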

ignasi35 commented 3 years ago

The main problem running the test suite is still the time it takes to drop/create all tables every time.

A way to make these tests run on Keyspaces faster would be to rewrite them with the following consideration:

Then, we could have a separate process in charge of dropping and creating all necessary Keyspaces. This tool/process would guarantee all required keyspaces exist and have all the necessary tables to run each test (note some tests don't even require the APC tables). The tool would be run before the test suite.

A side effect of this improvement is that (maybe?) tests could even be run in parallel.