Hi @DecisionSystems,

You should use the spark.redis.db config option, not spark.redis.dbNum. Documentation: https://github.com/RedisLabs/spark-redis/blob/master/doc/configuration.md
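For reference, a minimal sketch of setting that option when building the session (the host and port values here are placeholders, not from the original report):

```python
from pyspark.sql import SparkSession

# spark.redis.db selects the Redis logical database;
# "dbNum" is not a recognized option name.
spark = SparkSession.builder \
    .appName("spark-redis-db") \
    .config("spark.redis.host", "localhost") \
    .config("spark.redis.port", "6379") \
    .config("spark.redis.db", "3") \
    .getOrCreate()
```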
Yes. Clearly the documentation needs to be cleaned up, perhaps by removing references to the dbNum parameter.
I was also able to change the Redis db context I wanted to write to within the same Spark session using pyspark.
It turns out that you must redefine the Spark session using the .getOrCreate() method. Per the pyspark docs (https://spark.apache.org/docs/latest/api/python/pyspark.sql.html): "In case an existing SparkSession is returned, the config options specified in this builder will be applied to the existing SparkSession."
This is exactly what I needed to do. After that, it worked like a charm. So, I suppose the only issue here is the incorrect reference to dbNum.
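To illustrate the pattern (a minimal sketch; the db number and connection values are illustrative):

```python
from pyspark.sql import SparkSession

# Initial session: spark.redis.db is unset, so writes go to db 0.
spark = SparkSession.builder \
    .config("spark.redis.host", "localhost") \
    .config("spark.redis.port", "6379") \
    .getOrCreate()

# Later in the same application: re-invoke the builder with the new
# option. getOrCreate() returns the existing SparkSession with
# spark.redis.db applied, so subsequent writes target db 3.
spark = SparkSession.builder \
    .config("spark.redis.db", "3") \
    .getOrCreate()
```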
Thanks!
Docs updated in https://github.com/RedisLabs/spark-redis/pull/259
For some reason, I can't seem to get a df.write to pick up a change to the Redis db. I can confirm that the pyspark.sql.conf.RuntimeConfig has been changed; however, only the default is ever used. This issue was reviewed by Redis Labs.
The following code runs, but the db context does not change to 3, so on write all tables go to db 0, the default. If I change the option("keys.pattern", ...) in the df.write to match the df.read, it says the tables already exist. Here's the code. Using Python with Redis 3.4.1 / spark-redis-2.3.1-M2 / Spark 2.3.1 / Python 3.6.5.
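The original snippet was not preserved in this thread; the sketch below is a hypothetical reconstruction of the failing pattern described above (the table name, key pattern, and connection values are made up):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .config("spark.redis.host", "localhost") \
    .config("spark.redis.port", "6379") \
    .getOrCreate()

# The runtime config reports the new value...
spark.conf.set("spark.redis.db", "3")
print(spark.conf.get("spark.redis.db"))  # -> "3"

df = spark.read \
    .format("org.apache.spark.sql.redis") \
    .option("keys.pattern", "people:*") \
    .option("infer.schema", True) \
    .load()

# ...but the write still lands in db 0, the default.
df.write \
    .format("org.apache.spark.sql.redis") \
    .option("table", "people_copy") \
    .save()
```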