How to cache scramble tables in Spark?

verdict-project / verdict

Interactive-Speed Analytics: 200x Faster, 200x Fewer Cluster Resources, Approximate Query Processing

Apache License 2.0

Sorry, it does not seem to work. I cached the lineitem scramble table as well as the verdictdbmeta table, and the Spark UI shows that the tables are cached. However, TPC-H Q1 still takes the same amount of time as when the tables are not cached ...

Here's my code:

  verdict.setDefaultSchema(schema) // tpch1g
  verdict.sql("bypass cache table lineitem")
  verdict.sql("bypass cache table orders")
  verdict.sql("bypass cache table verdictdbmeta.verdictdbmeta")
  verdict.sql("bypass cache table lineitem_scramble")
  verdict.sql("bypass cache table orders_scramble")
  val q_verdict = spark.sparkContext.getConf.get("spark.verdictdb.query") // Q1, Q6, or Q14
  val rs_verdict = verdict.sql(q_verdict)
  rs_verdict.print()
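
To narrow down where the time goes, one option is to check the cache state and the timing on the Spark side, independently of VerdictDB. The following is only a sketch under assumptions: it uses a local `SparkSession` and a hypothetical stand-in view named `demo_scramble` rather than the real `lineitem_scramble` table, and `CacheCheck` and `timed` are illustrative names, not part of VerdictDB or Spark. It does not show whether the queries VerdictDB generates actually read from the cached relations.

```scala
import org.apache.spark.sql.SparkSession

object CacheCheck {
  // Runs `body` once and returns its result together with the
  // elapsed wall-clock time in milliseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000L)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cache-check")
      .master("local[*]") // assumption: local run, for illustration only
      .getOrCreate()

    // Hypothetical stand-in for a scramble table.
    spark.range(0L, 1000000L).createOrReplaceTempView("demo_scramble")

    // Time the same query before caching, as a baseline.
    val (_, coldMs) = timed {
      spark.sql("SELECT count(*) FROM demo_scramble").collect()
    }

    // In Spark SQL, CACHE TABLE is eager unless LAZY is specified,
    // so the data is materialized by this statement.
    spark.sql("CACHE TABLE demo_scramble")
    println(s"cached: ${spark.catalog.isCached("demo_scramble")}")

    // Time the query again now that the view is cached.
    val (rows, warmMs) = timed {
      spark.sql("SELECT count(*) FROM demo_scramble").collect()(0).getLong(0)
    }
    println(s"rows=$rows cold=${coldMs}ms warm=${warmMs}ms")

    spark.stop()
  }
}
```

In the setup above, the same `timed` wrapper could be placed around `verdict.sql(q_verdict)`, and `spark.catalog.isCached` could be called with the schema-qualified scramble table name, to compare runs with and without the `bypass cache table` statements.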

How to cache scramble tables in Spark? #362