scylladb / scylla-migrator

Migrate data to Scylla using Spark, typically from Cassandra or Parquet files; alternatively from DynamoDB to Scylla Alternator.
https://migrator.docs.scylladb.com/stable/
Apache License 2.0

Migration failing for tables with counters #17

Open dyasny opened 4 years ago

dyasny commented 4 years ago

On a table that has a few counters in place, the migration fails with

20/05/06 17:11:57 ERROR QueryExecutor: Failed to execute: com.datastax.spark.connector.writer.RichBatchStatement@14e57d3f
com.datastax.driver.core.exceptions.InvalidQueryException: Cannot provide custom timestamp for counter updates
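
For context: CQL rejects an explicit timestamp on counter writes, which is exactly what the connector's timestamp preservation attaches to each statement. A minimal reproduction sketch with the DataStax Java driver, against the table whose schema is posted below (the contact point and key values are assumptions):

```scala
import com.datastax.driver.core.Cluster

// Sketch of the failure mode: any counter UPDATE carrying USING TIMESTAMP
// is rejected by Cassandra/Scylla, so preserving source-side write times
// (as the migrator does for regular columns) cannot work for counters.
val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
val session = cluster.connect()

session.execute(
  """UPDATE ks1.table_sample USING TIMESTAMP 1588784517000000
    |SET counter_1 = counter_1 + 1
    |WHERE ks_sample_type = 'a' AND ks_level = 'b' AND ks = 'c'""".stripMargin)
// => InvalidQueryException: Cannot provide custom timestamp for counter updates
```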
dyasny commented 4 years ago

@elcallio I saw you already dealt with something similar in sstableloader, maybe you can help

dyasny commented 4 years ago

schema:

CREATE TABLE ks1.table_sample (
    ks_sample_type text,
    ks_level text,
    ks text,
    counter_1 counter,
    counter_2 counter,
    counter_3 counter,
    counter_4 counter,
    PRIMARY KEY ((ks_sample_type, ks_level, ks))
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
    AND comment = ''
    AND compaction = {'class': 'SizeTieredCompactionStrategy'}
    AND compression = {}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
dyasny commented 4 years ago

@amihayday FYI this is a customer issue

tarzanek commented 4 years ago

If the fix for this in sstableloader was to disallow timestamp updates (https://github.com/scylladb/scylla-tools-java/commit/c9080024576a95b0c1c406e909c6b07c33d176a2), we could do something similar here.

tarzanek commented 4 years ago

The idea I had around this was to ignore the timestamps for counters. An early, untested draft is in https://github.com/tarzanek/scylla-migrator/tree/counter-timestamps, but it's just an idea; the Cassandra writer would need to pick it up properly (and I didn't validate it).
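
A minimal sketch of that direction, assuming the spark-cassandra-connector WriteConf API the migrator builds on (the writeConfFor helper and the hasCounters flag are hypothetical names, not the migrator's actual code):

```scala
import com.datastax.spark.connector.writer.{TTLOption, TimestampOption, WriteConf}

// Hypothetical helper: keep per-row timestamp/TTL preservation for regular
// tables, but fall back to server-assigned timestamps for counter tables,
// since counters reject USING TIMESTAMP entirely.
def writeConfFor(hasCounters: Boolean): WriteConf =
  if (hasCounters)
    WriteConf(ttl = TTLOption.defaultValue, timestamp = TimestampOption.defaultValue)
  else
    WriteConf(
      ttl = TTLOption.perRow("ttl"),
      timestamp = TimestampOption.perRow("writetime"))
```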

Dobiasd commented 4 years ago

Just ran into a similar problem (using the latest version from scylla-migrator/tree/master) when trying to migrate a table with counter columns (without TTL) like the following one from Cassandra to Scylla:

CREATE TABLE foo.bar (
    some_id    bigint,
    a          counter,
    b          counter,
    c          counter,

    PRIMARY KEY (some_id)
);

It results in:

java.io.IOException: Failed to write statements to foo.bar. The
latest exception was
  Invalid null value for counter increment
eyalgutkind commented 3 years ago

We have another request for support of counter tables. Is it possible? The error received is:

java.io.IOException: Failed to write statements to ks.tbl The
latest exception was
  Cassandra timeout during COUNTER write query at consistency LOCAL_QUORUM (2 replica were required but only 0 acknowledged the write)

The data model for the target table is:

  CREATE TABLE ks.tbl (
    pk text,
    ck text,
    count1 counter,
    count2 counter,
    PRIMARY KEY (pk, ck)
  );
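
The timeout itself is a write-pressure symptom rather than the counter-semantics problem above. For what it's worth, the Spark Cassandra connector's output consistency level is configurable; a hedged sketch only (whether a weaker level is acceptable depends on the migration):

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: lower the connector's write consistency from the default
// LOCAL_QUORUM so a single slow or overloaded replica no longer times the
// counter writes out. This trades durability guarantees for progress.
val spark = SparkSession
  .builder()
  .appName("scylla-migrator")
  .config("spark.cassandra.output.consistency.level", "LOCAL_ONE")
  .getOrCreate()
```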
tarzanek commented 2 years ago

With the recent code base it works only for a SINGLE counter column; if there are more, you end up with:

22/02/21 15:24:45 WARN TaskSetManager: Lost task 19.0 in stage 0.0 (TID 19, 10.130.200.146, executor 0): java.io.IOException: Failed to write statements to accounts.account_store_counters. The
latest exception was
  Invalid null value for counter increment

in case there are null counter columns

tarzanek commented 2 years ago

To get this working, null counter values need to be filtered out before the writes are issued.
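
A minimal sketch of that filtering (the CounterUpdate type and explodeCounters helper are hypothetical, not the migrator's code): split a row with several counter columns into one increment per non-null counter, so a null value never reaches a counter bind:

```scala
// Hypothetical illustration: emit one UPDATE per non-null counter column,
// so "Invalid null value for counter increment" can never be triggered.
case class CounterUpdate(pk: String, ck: String, column: String, delta: Long)

def explodeCounters(
    pk: String,
    ck: String,
    counters: Map[String, Option[Long]]): Seq[CounterUpdate] =
  counters.collect {
    case (column, Some(delta)) => CounterUpdate(pk, ck, column, delta)
  }.toSeq
```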

tzach commented 3 months ago

@tarzanek is this issue still valid?