scylladb / scylla-tools-java

Apache Cassandra, supplying tools for Scylla
Apache License 2.0

sstableloader --ignore-missing-columns does not handle multiple columns. #214

Open elcallio opened 3 years ago

elcallio commented 3 years ago

Because the command line parser API is confusing and the originating programmer (me) is stupid.
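For illustration only: the failure reported in this issue (`c2` is accepted but `c3` is still reported as unknown) is consistent with a comma-separated option value not being split into individual names. The thread does not show the actual parsing code or the fix, so the sketch below is an assumption: the class `IgnoreColumnsSketch` and the helper `parseColumnList` are hypothetical names, not part of scylla-tools-java.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class IgnoreColumnsSketch {
    // Hypothetical helper: turn a raw option value such as "c2,c3"
    // into the set of column names it lists, preserving order.
    public static Set<String> parseColumnList(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return new LinkedHashSet<>();
        }
        return Arrays.stream(raw.split(","))
                .map(String::trim)               // tolerate "c2, c3"
                .filter(name -> !name.isEmpty()) // tolerate a trailing comma
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        // Both names should end up in the ignore set, not just the first one.
        System.out.println(parseColumnList("c2,c3")); // prints [c2, c3]
    }
}
```

With a set built this way, a loader could check every column found in the sstable header against the whole set rather than against a single name.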

juliayakovlev commented 3 years ago

I ran the test with this fix, but it still fails with the same error.

Test scenario to reproduce:

  1. CREATE KEYSPACE ks WITH replication={'class':'SimpleStrategy', 'replication_factor':1}
  2. CREATE COLUMNFAMILY cf (key int, c1 text, c2 text, c3 text, PRIMARY KEY(key, c1))
  3. Insert 10 rows
  4. DROP TABLE cf
  5. CREATE COLUMNFAMILY cf (key int, c1 text, PRIMARY KEY(key, c1))
  6. Run sstableloader:
    /home/juliay/.ccm/scylla-repository/unstable/master/2020-11-18T08_57_53Z/scylla-tools-java/bin/sstableloader -d 127.0.57.1 --ignore-missing-columns c2,c3 /tmp/tmp_bt7kouk/ks/cf -v

Load failed with error:

java.lang.RuntimeException: Unknown column c3 during deserialization
org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:321)
org.apache.cassandra.io.sstable.format.SSTableReader.openForBatch(SSTableReader.java:441)
com.scylladb.tools.BulkLoader.openFile(BulkLoader.java:1517)
com.scylladb.tools.BulkLoader.process(BulkLoader.java:1562)
com.scylladb.tools.BulkLoader.lambda$main$1(BulkLoader.java:1364)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)

roydahan commented 3 years ago

@elcallio could you please take a look at this?

elcallio commented 3 years ago

The above example works perfectly fine for me with 8080009 applied. Is this still an issue, or are the branches just not updated?

michoecho commented 2 months ago

Reopening because of the same problem as in https://github.com/scylladb/scylla-tools-java/issues/216#issuecomment-2297045852. Closing the issue activated some broken dtests and broke CI.