zio / zio-quill

Compile-time Language Integrated Queries for Scala
https://zio.dev/zio-quill
Apache License 2.0

quill-cassandra: inserts with options create tombstones #988

Open witzatom opened 6 years ago

witzatom commented 6 years ago

Version: 2.0.0 Module: quill-cassandra Database: cassandra

Expected behavior

A configuration option that allows the user to filter out all null (None) value attributes from inserts.

Actual behavior

Inserts that contain null values currently create tombstones in the Cassandra database, which increases database workload. Additional information on how Cassandra saves records is here: http://docs.datastax.com/en/archived/cassandra/2.0/cassandra/dml/dml_write_path_c.html#concept_ds_wt3_32w_zj__dml-compaction

Steps to reproduce the behavior

Connect to a Cassandra node and run:

create table demo.test (a int, b int, primary key(a));
insert into demo.test (a,b) values (1,1);
insert into demo.test (a,b) values (2,null);
insert into demo.test (a) values (3);
tracing on;
select * from demo.test;

This should return a wall of text ending with "Read 3 live and 1 tombstone cells". Note that the second insert (with an explicit null) created a tombstone, but the last one (which omits column b) did not.

Set up a Cassandra context in Scala similar to this:

lazy val clusterWithoutSSL =
  Cluster.builder()
    .withPort(port)
    .addContactPoint(contactPoint)
    .withCredentials(username, password)
    .build()

lazy val ctx = new CassandraAsyncContext[CamelCase](
  CamelCase,
  clusterWithoutSSL,
  keyspace,
  preparedStatementCacheSize
)

with the appropriate configuration values filled in (contactPoint, port, username, password, keyspace and preparedStatementCacheSize). Afterwards run:

    case class Test(a: Int, b: Option[Int])
    val record = Test(4, None)
    import ctx._
    ctx.run{
      query[Test].insert(lift(record))
    }

Afterwards, go to the cqlsh console and run select * from demo.test; again. This should return a wall of text ending with "Read 4 live and 2 tombstone cells", so the insert through Quill added a new tombstone.

Workaround

If it were possible to turn off prepared statements for Cassandra, it might be possible to solve this directly in Quill. As far as I know, the underlying DataStax drivers allow filtering out null fields (see https://docs.datastax.com/en/developer/java-driver-dse/1.3/manual/object_mapper/using/#mapper-options and search for saveNullFields). However, I think this option does not help with prepared statements; the source on this is here: https://www.datastax.com/dev/blog/4-simple-rules-when-using-the-datastax-drivers-for-cassandra . I couldn't find how to turn this option off in the cluster builder definition. If anybody has any hints, I would love to try it out.

When I glanced through the code, I found that Quill implements its own mappers, and reimplementing these might solve the issue (if we had non-prepared statements): https://github.com/getquill/quill/blob/c9a492119198a6e73819352f2bad47ca94e520f3/quill-cassandra/src/main/scala/io/getquill/context/cassandra/encoding/Encoders.scala#L27

Another possible point of interest is the CqlIdiom: https://github.com/getquill/quill/blob/0606e15ad252d72875792dddc917dd7fba1e36f0/quill-cassandra/src/main/scala/io/getquill/context/cassandra/CqlIdiom.scala#L155 where the fields with None values could possibly be filtered out as well.
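To make the filtering idea concrete, here is a minimal sketch in plain Scala. It is not Quill code and does not reflect how CqlIdiom is actually structured; NullFreeInsert and build are hypothetical names used only for illustration. It shows the intended effect: columns whose value is None are left out of the statement entirely, which is the form of insert that did not create a tombstone in the cqlsh example above.

```scala
// Hypothetical helper, NOT part of Quill or the DataStax driver.
// It builds an INSERT that names only the columns that have a value.
object NullFreeInsert {

  // Returns the CQL text (one `?` per defined column) together with
  // the values to bind, in the same order as the column names.
  def build(table: String, columns: Seq[(String, Option[Any])]): (String, Seq[Any]) = {
    // Keep only the columns that carry a value; None columns are dropped.
    val defined = columns.collect { case (name, Some(value)) => (name, value) }
    require(defined.nonEmpty, "at least one column must have a value")

    val names = defined.map(_._1).mkString(", ")
    val marks = defined.map(_ => "?").mkString(", ")
    (s"INSERT INTO $table ($names) VALUES ($marks)", defined.map(_._2))
  }
}

object NullFreeInsertDemo {
  def main(args: Array[String]): Unit = {
    // Mirrors Test(4, None) from the reproduction steps.
    val (cql, values) =
      NullFreeInsert.build("demo.test", Seq("a" -> Some(4), "b" -> None))
    println(cql)    // INSERT INTO demo.test (a) VALUES (?)
    println(values) // List(4)
  }
}
```

One trade-off to keep in mind: each distinct combination of defined columns produces a different statement text, so with prepared statements this approach would likely put more entries into the prepared statement cache.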

@getquill/maintainers

cgosse commented 6 years ago

I am also seeing this, for what it's worth. I just did an insert, and when I read it back I see lots of warnings about 100 live cells and 1300 tombstones.