RedisLabs / spark-redis

A connector for Spark that allows reading and writing to/from Redis cluster
BSD 3-Clause "New" or "Revised" License

Dataframe TTL doesn't include schema keys #356

Closed folfix closed 1 year ago

folfix commented 1 year ago

Hi! I've noticed that the following configuration:

df.write()
    .format("org.apache.spark.sql.redis")
    .option("table", "example")
    .option("ttl", 60)
    .save();

lets me keep the data in Redis for 60 seconds, and that works as expected. However, I've observed that the schema key (_spark:example:schema) is kept after the rows expire. This causes a problem when reading: I can't tell whether the data is still in the cache, because the following:

Dataset<Row> loadedDataset = spark.getSession().read()
        .format("org.apache.spark.sql.redis")
        .option("table", "example")
        .load();

returns a valid Dataset (no exception is thrown).

Is there any way to propagate TTL to schema keys as well? Thanks!

fe2s commented 1 year ago

Hi @folfix, the TTL option affects only the rows, not the table schema. Can you check whether the returned dataframe is empty?
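
One way to run that check is sketched below. It assumes the load still succeeds after the rows have expired (because the schema key remains) and simply tests whether any rows came back. The class name `RedisCacheCheck` and the helper `hasCachedRows` are hypothetical names chosen for illustration and are not part of spark-redis. The Redis connection settings are assumed to be configured on the `SparkSession` already.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical helper, not part of spark-redis.
public class RedisCacheCheck {

    /**
     * Returns true when the Redis-backed table currently holds at least one row.
     * Assumption: loading a table whose rows have expired yields an empty
     * Dataset rather than throwing, as described in this thread.
     */
    static boolean hasCachedRows(SparkSession spark, String table) {
        Dataset<Row> loaded = spark.read()
                .format("org.apache.spark.sql.redis")
                .option("table", table)
                .load();
        // Dataset.isEmpty() exists since Spark 2.4; on older versions,
        // loaded.takeAsList(1).isEmpty() is an equivalent check.
        return !loaded.isEmpty();
    }
}
```

With this helper, `hasCachedRows(spark, "example")` returning false would be treated as a cache miss, even though `_spark:example:schema` is still present.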

sazzad16 commented 1 year ago

Closed due to inactivity.