apache / cassandra-gocql-driver

GoCQL Driver for Apache Cassandra®
https://cassandra.apache.org/
Apache License 2.0

Batched-Insert is causing SSTable Tombstones. #1044

Closed. ramtg closed this issue 6 years ago

ramtg commented 6 years ago

Please answer these questions before submitting your issue. Thanks!

What version of Cassandra are you using?

Cassandra Server Release v3.0.9

What version of Gocql are you using?

CQLVersion: 3.0.0 (as of Jan 1, 2018)

What did you do?

Batch inserted prepared statements.

insertCQL := fmt.Sprintf("INSERT INTO %v (lsid, f1, f2) VALUES (?, ?, ?)", "tbl_sparse_values")

// Execute test
batch := session.NewBatch(gocql.LoggedBatch)

//batch.Query(insertCQL, 999, float32(99.99), nil) // Results in SSTable Tombstone.

batch.Query(insertCQL, 999, float32(99.99)) // Omitting the parameter would avoid the tombstone, but gocql rejects it with a run-time error: gocql: batch statement 0 expected 3 values send got 2

What did you expect to see?

(as indicated in the sstabledump JSON export).

What did you see instead?

https://raw.githubusercontent.com/ramtg/java-cql-batch/master/CQLBatchInsertingJavaClientNoTombs.java

If you are having connectivity related issues please share the following additional information

Describe your Cassandra cluster

please provide the following information

select cql_version, native_protocol_version, release_version from system.local;
-- cql_version: 3.4.0
-- native_protocol_version: 4
-- release_version: 3.0.9

Zariel commented 6 years ago

For this, can you use an unset column via UnsetValue?
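
For context on why nil and an unset value behave differently: in native protocol v4 a bound value is sent as a 4-byte length followed by the bytes, where length -1 means null and length -2 means "not set". A null is written as a deletion of that cell (hence the tombstone), while a "not set" value leaves the column untouched. The sketch below only illustrates that wire encoding; the `unset` type and `encodeValue` function are hypothetical stand-ins, not gocql's actual implementation.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// unset is a hypothetical stand-in for gocql.UnsetValue in this sketch.
type unset struct{}

// encodeValue writes a protocol v4 [value]: a 4-byte big-endian length
// followed by that many bytes. Length -1 means null and length -2 means
// "not set"; neither is followed by any bytes.
func encodeValue(v interface{}) []byte {
	buf := make([]byte, 4)
	switch b := v.(type) {
	case nil:
		binary.BigEndian.PutUint32(buf, 0xFFFFFFFF) // int32(-1): null
	case unset:
		binary.BigEndian.PutUint32(buf, 0xFFFFFFFE) // int32(-2): not set
	case []byte:
		binary.BigEndian.PutUint32(buf, uint32(len(b)))
		buf = append(buf, b...)
	default:
		panic(fmt.Sprintf("unsupported type %T in this sketch", v))
	}
	return buf
}

func main() {
	fmt.Printf("null:  % x\n", encodeValue(nil))
	fmt.Printf("unset: % x\n", encodeValue(unset{}))
	fmt.Printf("bytes: % x\n", encodeValue([]byte{0x01, 0x02}))
}
```

With gocql itself, the equivalent would presumably be to pass gocql.UnsetValue in place of nil as a positional argument, which requires protocol version 4.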

ramtg commented 6 years ago

@Zariel Thank you for the suggestion.

I'm not sure what I'm doing wrong, but I get the error below when I specify UnsetValue in Batch.Query():

"java.lang.IndexOutOfBoundsException: index: 28, length: 158052 (expected: range(0, 84))".

Here's the code I tried:

// CREATE TABLE tbl_sparse_values(id int, sortkey int, v1 int, v2 int, v3 int, PRIMARY KEY (id, sortkey));
insertStmt := `INSERT INTO tbl_sparse_values(id, sortkey, v1, v2) VALUES (:id, :sortkey, :v1, :v2)`
batch.Query(insertStmt,
    gocql.NamedValue("id", 333),
    gocql.NamedValue("sortkey", 9),
    gocql.NamedValue("v1", gocql.UnsetValue),
    gocql.NamedValue("v2", 2))

if err := session.ExecuteBatch(batch); err != nil {
    log.Println(err)
    // Results in the below error:
    // java.lang.IndexOutOfBoundsException: index: 28, length: 158052 (expected: range(0, 84))
}

ramtg commented 6 years ago

@Zariel It appears Batch.Query() doesn't currently accept named values. Could you please confirm?

Zariel commented 6 years ago

The native protocol spec says this about named values in batches:

        0x40: With names for values. If set, then all values for all <query_i> must be
              preceded by a [string] <name_i> that have the same meaning as in QUERY
              requests [IMPORTANT NOTE: this feature does not work and should not be
              used. It is specified in a way that makes it impossible for the server
              to implement. This will be fixed in a future version of the native
              protocol. See https://issues.apache.org/jira/browse/CASSANDRA-10246 for
              more details].
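
Given that note, one possible workaround is to keep the batch statement positional (`?` markers) and fill any column that should be left alone with an unset marker instead of nil. The sketch below is an illustration under assumptions, not a confirmed fix from this thread: `positionalArgs` and `unsetMarker` are hypothetical names, and with gocql the marker would be gocql.UnsetValue.

```go
package main

import "fmt"

// unsetMarker is a hypothetical stand-in for gocql.UnsetValue.
type unsetMarker struct{}

// positionalArgs orders named values to match the column list of a
// positional (?) statement. Columns missing from named receive the
// unset marker rather than nil, so no null is bound for them.
func positionalArgs(columns []string, named map[string]interface{}, unset interface{}) []interface{} {
	args := make([]interface{}, len(columns))
	for i, col := range columns {
		if v, ok := named[col]; ok {
			args[i] = v
		} else {
			args[i] = unset
		}
	}
	return args
}

func main() {
	columns := []string{"id", "sortkey", "v1", "v2"}
	named := map[string]interface{}{"id": 333, "sortkey": 9, "v2": 2}

	args := positionalArgs(columns, named, unsetMarker{})
	for i, col := range columns {
		fmt.Printf("%s = %v\n", col, args[i])
	}

	// Assumed usage with gocql (not run here):
	//   batch.Query("INSERT INTO tbl_sparse_values (id, sortkey, v1, v2) VALUES (?, ?, ?, ?)", args...)
}
```

Because the argument count always matches the number of `?` markers, this avoids the "expected 3 values send got 2" style of error while still not binding a null.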