TIBCOSoftware / snappydata

Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
http://www.snappydata.io

Using timestamp column from row for TTL to have different ttl per-row #483

Open ddorian opened 7 years ago

ddorian commented 7 years ago

Ability to select the column name to do TTL against. This would make it easy to have a per-row TTL on one table just by setting the value of the selected column.

Makes sense?

sumwale commented 7 years ago

@ddorian Not very clear, can you explain a bit more? In which API do you want the ability to select the column name? If you can give an example with some pseudo-code of what you mean, that will make it clearer.

ddorian commented 7 years ago

@sumwale I want it in the create table API. Pseudocode:

 create table t (tt timestamp) using 'row' options (expire 10 AFTER tt)

This would mean that when (current_utc - tt).total_seconds() >= 10 the row expires. So by changing the value of tt we can have a different TTL for each row in one table.

Maybe there's a better/different way to implement/word it. Makes sense?

sumwale commented 7 years ago

@ddorian Ok, I understand now. Support for this was added via eviction for HDFS persistence in GemFireXD: http://gemfirexd.docs.pivotal.io/1.3.0/userguide/reference/language_ref/ref-create-table-clauses.html#topic_hyr_bs4_5k . Will something like that work for you? It can be extended to other table types having different eviction strategies, but note that the overhead of updates/inserts will increase for such cases.
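For reference, the clause on that page looks roughly like the sketch below (the table, column and store names are made up, and the exact clause list is reconstructed from the linked reference from memory, so treat it as approximate):

 CREATE TABLE events (
   id INT,
   tt TIMESTAMP
 )
 PARTITION BY COLUMN (id)
 -- rows whose tt is at least 10 seconds in the past are evicted from in-memory data
 -- (they remain available in the HDFS store)
 EVICTION BY CRITERIA ( {fn TIMESTAMPDIFF(SQL_TSI_SECOND, tt, CURRENT_TIMESTAMP)} >= 10 )
 EVICTION FREQUENCY 60 SECONDS
 HDFSSTORE (myHdfsStore);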

sumwale commented 7 years ago

For that, the DDL will look like this (once implemented):

 create table t (tt timestamp) using 'row' options (evict_by '{fn TIMESTAMPDIFF(SQL_TSI_SECOND, tt, current_timestamp)} >= 10')
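To illustrate the per-row effect being asked for, a sketch only (the evict_by option is not implemented yet, so both the option name and the expression syntax are provisional):

 create table t (id int, tt timestamp) using 'row'
   options (evict_by '{fn TIMESTAMPDIFF(SQL_TSI_SECOND, tt, current_timestamp)} >= 10')

 -- each row carries its own tt, so each row gets its own effective expiry:
 insert into t values (1, current_timestamp)
 -- row 1 would expire roughly 10 seconds from now
 insert into t values (2, {fn TIMESTAMPADD(SQL_TSI_MINUTE, 5, current_timestamp)})
 -- row 2's tt is 5 minutes in the future, so it would live roughly 5 minutes longer
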
ddorian commented 7 years ago

@sumwale I don't need this for now, and eviction still persists the data.

Regarding eviction, what would be cool is to have the eviction model generalized, like Greenplum has with external tables (but that saves as plain CSV, not an efficient columnar format), and to keep a (start-key, end-key, statistics) tuple for each segment (like HBase/Bigtable does). This way you could write an S3 connector, do real-time inserts, and separate query from storage (~a BigQuery clone).

sumwale commented 7 years ago

is to have the eviction model generalized, like Greenplum has with external tables

Does Greenplum allow some generic form of eviction to external tables? I was not aware of that. The generic external table model is available via CREATE EXTERNAL TABLE using ... in the Spark Data Sources API.
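For example, something along these lines (a minimal sketch; the table name, format and path below are made up):

 CREATE EXTERNAL TABLE staging_events
   USING parquet
   OPTIONS (path 's3a://some-bucket/events/')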

keep a (start-key, end-key, statistics) tuple for each segment (like HBase/Bigtable does)

The stats are available in the column storage. We do plan on adding persistence to Parquet files internally, but using the same internal encoding as the current column storage for best performance. The difference in performance is otherwise too large to be ignored.

This way you could write an S3 connector, do real-time inserts, and separate query from storage

Separation of query from storage is available via the "smart connector" mode if required (http://snappydatainc.github.io/snappydata/howto/#how-to-access-snappydata-store-from-an-existing-spark-installation-using-smart-connector), though its performance is lower than embedded mode. It is useful for extending the execution capacity of the cluster beyond the storage nodes. We will continue optimizing that mode for greater operator pushdown for best performance.

ddorian commented 7 years ago

Greenplum can write to S3, HDFS, but it does so in plain CSV/text, which isn't efficient: http://gpdb.docs.pivotal.io/4390/admin_guide/load/topics/g-s3-protocol.html

By separating query from storage, I meant that you evict to 'external', which in the case of S3 would be on another machine. Assuming a 10-40 Gbps+ network it would be ~like reading from a local HDD. While currently you can separate 'gemfire' from 'spark', I meant you keep 'gemfire+spark=strata' in one JVM and separate out the storage files (an easy but slow way would be to just mount the data directory on FUSE, and then point FUSE at S3, HDFS, etc.).

The current HDFS eviction also appends a commit log to HDFS (from reading the docs), which would be slow for S3/external storage. So the logs would need to be written locally (with normal replication), and only fully-built immutable segments written to external/S3.

sumwale commented 7 years ago

Greenplum can write to S3, HDFS

Right, but is there a generic way by which data can be evicted to an external table based on a SQL clause like you are proposing?

I meant you keep 'gemfire+spark=strata' in one JVM and separate out the storage files

As I mentioned, persisting to external Parquet files is being planned (using Spark's normal Hadoop storage support, which provides S3 among others).

Assuming a 10-40 Gbps+ network it would be ~like reading from a local HDD.

Though that is still an order of magnitude slower than memory access. Also, where possible, keeping the engine together with the storage provides a big advantage, because more and more operators can be pushed down to the engine, significantly reducing the data to be pulled (i.e. the smart connector mode).