aws-samples / cql-replicator

CQLReplicator is a migration tool that helps you to replicate data from Cassandra to AWS Services
Apache License 2.0

replication error - CassandraTypeException: null #90

Closed frozensky closed 9 months ago

frozensky commented 10 months ago

The job fails during replication (--PROCESS_TYPE replication) with the following error:

2024-01-28 01:45:12,749 ERROR [Executor task launch worker for task 0.0 in stage 5.0 (TID 43)] executor.Executor (Logging.scala:logError(98)): Exception in task 0.0 in stage 5.0 (TID 43)
CassandraTypeException: null
    at GlueApp$.$anonfun$main$33(CQLReplicator.scala:460) ~[CQLReplicator.scala.jar:?]
    at scala.collection.immutable.NewRedBlackTree$._foreachKey(RedBlackTree.scala:291) ~[scala-library-2.12.15.jar:?]
    at scala.collection.immutable.NewRedBlackTree$.foreachKey(RedBlackTree.scala:287) ~[scala-library-2.12.15.jar:?]
    at scala.collection.immutable.TreeSet.foreach(TreeSet.scala:254) ~[scala-library-2.12.15.jar:?]
    at GlueApp$.rowToStatement$1(CQLReplicator.scala:441) ~[CQLReplicator.scala.jar:?]
    at GlueApp$.$anonfun$main$21(CQLReplicator.scala:365) ~[CQLReplicator.scala.jar:?]
    at scala.collection.Iterator.foreach(Iterator.scala:943) ~[scala-library-2.12.15.jar:?]
    at scala.collection.Iterator.foreach$(Iterator.scala:943) ~[scala-library-2.12.15.jar:?]
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1431) ~[scala-library-2.12.15.jar:?]
    at GlueApp$.$anonfun$main$20(CQLReplicator.scala:364) ~[CQLReplicator.scala.jar:?]
    at GlueApp$.$anonfun$main$20$adapted(CQLReplicator.scala:347) ~[CQLReplicator.scala.jar:?]
    at org.apache.spark.rdd.RDD.$anonfun$foreachPartition$2(RDD.scala:1011) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.rdd.RDD.$anonfun$foreachPartition$2$adapted(RDD.scala:1011) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2269) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.scheduler.Task.run(Task.scala:138) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548) ~[spark-core_2.12-3.3.0-amzn-1.jar:?]
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1517) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551) ~[spark-core_2.12-3.3.0-amzn-1.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_392]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_392]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_392]

Here is the schema of the source table

cqlsh> describe table  personalcatalog.schema_migration;

CREATE TABLE personalcatalog.schema_migration (
    applied_successful boolean,
    version int,
    executed_at timestamp,
    script text,
    script_name text,
    PRIMARY KEY (applied_successful, version)
) WITH CLUSTERING ORDER BY (version ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND speculative_retry = '99PERCENTILE';

$ ./cqlreplicator --state stats --landing-zone s3://hnt-cqlreplicator-stg --src-keyspace personalcatalog --src-table schema_migration

·······································································
[2024-01-27T18:12:10-08:00] OS: Darwin
[2024-01-27T18:12:19-08:00] Discovered rows in personalcatalog.schema_migration is 6
[2024-01-27T18:12:19-08:00] Replicated rows in ks_test_cql_replicator.test_cql_replicator is 0

nwheeler81 commented 10 months ago

The applied_successful column is the problem: it triggers the CassandraTypeException because CQLReplicator currently doesn't support the boolean type in a primary key.
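
To catch this before starting a job, you could check the source table's primary-key column types up front. The sketch below is a hypothetical pre-flight helper, not part of CQLReplicator: the function name, the UNSUPPORTED_KEY_TYPES set, and the hand-written column map are assumptions for illustration. The only unsupported key type it knows about is boolean, because that is the only one named in this thread.

```python
# Hypothetical pre-flight check; not part of CQLReplicator.
# Assumption: boolean is the only unsupported primary-key type,
# since it is the only one mentioned in this thread.
UNSUPPORTED_KEY_TYPES = frozenset({"boolean"})


def unsupported_key_columns(columns, primary_key, unsupported=UNSUPPORTED_KEY_TYPES):
    """Return (name, cql_type) pairs for primary-key columns with an unsupported type.

    columns:     dict mapping column name to CQL type name
    primary_key: list of primary-key column names, in key order
    """
    return [
        (name, columns[name])
        for name in primary_key
        if columns[name].lower() in unsupported
    ]


# Column types and primary key copied by hand from the schema in this issue.
schema_migration_columns = {
    "applied_successful": "boolean",
    "version": "int",
    "executed_at": "timestamp",
    "script": "text",
    "script_name": "text",
}
schema_migration_key = ["applied_successful", "version"]

for name, cql_type in unsupported_key_columns(schema_migration_columns, schema_migration_key):
    print(f"primary-key column {name} has unsupported type {cql_type}")
```

For the schema_migration table above, the check flags applied_successful and leaves version alone. In practice the column map would come from the cluster's schema metadata instead of being typed in by hand.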