masumsoft / express-cassandra

Cassandra ORM/ODM/OGM for NodeJS with support for Apache Cassandra, ScyllaDB, Datastax Enterprise, Elassandra & JanusGraph.
http://express-cassandra.readthedocs.io
GNU Lesser General Public License v3.0

ttl value is not prepared resulting in memory leak #203

Closed manpreet-compro closed 4 years ago

manpreet-compro commented 4 years ago

Use Case

We are adding a TTL to existing data in our Cassandra DB. The logic is to fetch the records and then reinsert them using an update query with a TTL. The TTL value is not fixed; it is calculated from the write time of each record.

The code has the form below, where timeCalculated is a dynamic value:

    const options = { consistency: this.writeConsistency, ttl: timeCalculated };
    models.instance.tablename.updateAsync({ id: uuid }, newData, options);
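The issue does not say how timeCalculated is derived, so the following is only a sketch of one plausible calculation, assuming the goal is for each record to expire a fixed retention period after its original write. The helper name remainingTtlSeconds and the retentionSeconds parameter are hypothetical, and the write time is assumed to have already been converted to a JavaScript number of microseconds (the unit CQL's WRITETIME() reports).

```javascript
// Hypothetical helper (not part of express-cassandra): remaining TTL in seconds
// so that a record expires `retentionSeconds` after it was originally written.
// writeTimeMicros: microseconds since the Unix epoch, as reported by WRITETIME().
function remainingTtlSeconds(writeTimeMicros, retentionSeconds, nowMs = Date.now()) {
  const writeTimeMs = writeTimeMicros / 1000;
  const ageSeconds = Math.floor((nowMs - writeTimeMs) / 1000);
  const ttl = retentionSeconds - ageSeconds;
  // Assumption: a TTL of 0 means "no expiry" and negative TTLs are rejected,
  // so records that are already past retention get the minimum TTL of 1 second.
  return Math.max(1, ttl);
}

// A record written one hour before "now", with a one-day retention period.
const writeTimeMicros = 1600000000000000;
const nowMs = 1600000000000 + 60 * 60 * 1000;
console.log(remainingTtlSeconds(writeTimeMicros, 86400, nowMs)); // 82800
```

Because the result differs for nearly every record, each update ends up with its own TTL value.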

Issue

This is transformed into the CQL statement:

    UPDATE "tablename" USING TTL 43505210 SET "field1"=?, "field2"=?, "field3"=? WHERE "id" = ?;

Because the TTL value is inlined as a literal rather than passed as a bind variable, and every query is prepared by default, each call to update with a different TTL is treated as a new statement. So we get statements like:

    UPDATE "tablename" USING TTL 43505210 SET "field1"=?, "field2"=?, "field3"=? WHERE "id" = ?;
    UPDATE "tablename" USING TTL 42624031 SET "field1"=?, "field2"=?, "field3"=? WHERE "id" = ?;
    UPDATE "tablename" USING TTL 47640865 SET "field1"=?, "field2"=?, "field3"=? WHERE "id" = ?;

and so on...
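To illustrate why this grows without bound, here is a toy model of a prepared-statement cache keyed by query text. It is not the driver's or Cassandra's actual implementation; the StatementCache class and the two query-builder functions are hypothetical and exist only to compare an inlined TTL with a bound one.

```javascript
// Toy prepared-statement cache keyed by query text (illustration only).
class StatementCache {
  constructor() {
    this.entries = new Map();
  }

  // Returns the cached entry for this exact text, creating one if needed.
  prepare(queryText) {
    if (!this.entries.has(queryText)) {
      this.entries.set(queryText, { id: this.entries.size + 1 });
    }
    return this.entries.get(queryText);
  }

  get size() {
    return this.entries.size;
  }
}

// TTL inlined as a literal: the text changes with every TTL value.
const inlineTtlQuery = (ttl) =>
  `UPDATE "tablename" USING TTL ${ttl} SET "field1"=?, "field2"=?, "field3"=? WHERE "id" = ?;`;

// TTL as a bind marker: the text is identical for every TTL value.
const boundTtlQuery = () =>
  'UPDATE "tablename" USING TTL ? SET "field1"=?, "field2"=?, "field3"=? WHERE "id" = ?;';

const inlineCache = new StatementCache();
const boundCache = new StatementCache();
for (const ttl of [43505210, 42624031, 47640865]) {
  inlineCache.prepare(inlineTtlQuery(ttl));
  boundCache.prepare(boundTtlQuery());
}

console.log(inlineCache.size); // 3, one entry per distinct TTL
console.log(boundCache.size);  // 1, a single shared entry
```

With millions of records and a different TTL for most of them, the inlined form adds roughly one cache entry per update.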

Impact

We have millions of records and we are facing two kinds of issues:

  1. The Node program runs out of memory after a certain time with the error FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory. Heap memory increases at a linear rate.
  2. In the Cassandra DB, the number of records in the prepared_statements table grew so large that restarting the Cassandra service took a long time, because it re-prepares all statements on startup.

For now, having identified the cause of the memory leak and of the slow Cassandra startup, we have deleted the records from the prepared_statements table and updated the Node program to use prepare:false for this operation. It took us around 4 weeks to find out that this was the cause.
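The prepare:false workaround might look roughly like the sketch below. The wrapper name updateWithDynamicTtl is hypothetical; the option names consistency, ttl and prepare are the ones used in this issue, and model stands for an express-cassandra model such as models.instance.tablename.

```javascript
// Workaround sketch: skip statement preparation for updates whose TTL
// differs per record, so each distinct TTL no longer adds a prepared statement.
function updateWithDynamicTtl(model, key, newData, ttlSeconds, consistency) {
  const options = {
    consistency,
    ttl: ttlSeconds,
    prepare: false, // run this update as an unprepared query
  };
  return model.updateAsync(key, newData, options);
}
```

It would be called as updateWithDynamicTtl(models.instance.tablename, { id: uuid }, newData, timeCalculated, this.writeConsistency). One trade-off to keep in mind: without preparation the underlying driver has to infer parameter types from the JavaScript values, so it is worth checking that every column type in the update still round-trips as expected.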

Request: please allow the TTL value to be prepared. Until that is implemented, could the docs mention that the TTL is not prepared and that dynamic TTLs on the same query should be used with caution? Also, as per https://issues.apache.org/jira/browse/CASSANDRA-4450, I believe preparing the TTL is supported.
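For reference, this is a sketch of what a prepared TTL could look like when issuing the query directly, assuming client is a connected cassandra-driver Client (or any object with the same execute(query, params, options) method). The helper name updateWithBoundTtl is hypothetical, and this is not necessarily how express-cassandra implements it.

```javascript
// The TTL is a bind marker, so the query text is the same for every record
// and only one statement needs to be prepared.
const UPDATE_WITH_BOUND_TTL =
  'UPDATE "tablename" USING TTL ? SET "field1"=?, "field2"=?, "field3"=? WHERE "id" = ?';

function updateWithBoundTtl(client, id, fields, ttlSeconds) {
  // Parameter order follows the order of the ? markers in the query text:
  // TTL first, then the SET values, then the WHERE key.
  const params = [ttlSeconds, fields.field1, fields.field2, fields.field3, id];
  return client.execute(UPDATE_WITH_BOUND_TTL, params, { prepare: true });
}
```

Since the statement text never changes, repeated calls with different TTLs reuse a single prepared statement instead of creating one per value.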

masumsoft commented 4 years ago

fixed in v2.3.1