microsoft / hyperspace

An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
https://aka.ms/hyperspace
Apache License 2.0

[FEATURE REQUEST]: Create helper function to check whether index is actually used in the plan #492

Open paryoja opened 3 years ago

paryoja commented 3 years ago

Feature requested

I need a function that checks whether Spark actually used the index I intended it to use. This would help users understand how Hyperspace indexes behave and tune how they are used.

Additional context

DeltaLakeIntegrationTest.scala already has isIndexVersionUsed and isIndexUsed helpers for checking test results. It would be better to expose these functions for general use.
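To make the request concrete, here is a minimal sketch of what such helpers might look like. It is not the code from DeltaLakeIntegrationTest.scala: the object name `IndexUsage` and both signatures are hypothetical, and it assumes index data is stored under `<systemPath>/<indexName>/v__=<version>/`. It works on plain path strings so it can run without a Spark session.

```scala
// Hypothetical sketch, not the helpers from DeltaLakeIntegrationTest.scala.
// Assumption: index data lives under <systemPath>/<indexName>/v__=<version>/...
object IndexUsage {
  // Matches a version directory segment such as "v__=3".
  private val VersionDir = """v__=(\d+)""".r

  /** Versions of `indexName` referenced by the given relation root paths. */
  def usedVersions(rootPaths: Seq[String], indexName: String): Set[Int] =
    rootPaths.flatMap { path =>
      // Look for the pair of adjacent segments "<indexName>/v__=<n>".
      path.split('/').toSeq.sliding(2).collect {
        case Seq(name, VersionDir(v)) if name == indexName => v.toInt
      }
    }.toSet

  /** True if any root path points into a data directory of `indexName`. */
  def isIndexUsed(rootPaths: Seq[String], indexName: String): Boolean =
    usedVersions(rootPaths, indexName).nonEmpty

  /** True if the given version of `indexName` appears among the root paths. */
  def isIndexVersionUsed(rootPaths: Seq[String], indexName: String, version: Int): Boolean =
    usedVersions(rootPaths, indexName).contains(version)
}
```

In a real session the root paths would presumably come from the optimized plan, for example by collecting `location.rootPaths` from each `HadoopFsRelation` in `df.queryExecution.optimizedPlan` with Hyperspace enabled. That wiring is left out here because it depends on the Spark version.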

sezruby commented 3 years ago

hs.explain(df) returns the used indexes, but I think we need to improve its output so that it also shows the log version used and the number of times each index was applied.
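As a rough illustration of the extra information mentioned above, the sketch below counts how many relation root paths point at each (index name, version) pair. It is not part of hs.explain: `IndexUsageReport` and `usageCounts` are hypothetical names, the `v__=<version>` directory layout is an assumption, and counting root paths is only a proxy for the number of times an index was applied.

```scala
// Hypothetical sketch, not part of the Hyperspace API.
// Assumption: index data lives under <systemPath>/<indexName>/v__=<version>/...
object IndexUsageReport {
  // Matches a version directory segment such as "v__=3".
  private val VersionDir = """v__=(\d+)""".r

  /** Number of root paths that point at each (index name, version) pair. */
  def usageCounts(rootPaths: Seq[String]): Map[(String, Int), Int] =
    rootPaths.flatMap { path =>
      // The segment right before "v__=<n>" is taken to be the index name.
      path.split('/').toSeq.sliding(2).collect {
        case Seq(name, VersionDir(v)) => (name, v.toInt)
      }
    }.groupBy(identity).map { case (key, hits) => key -> hits.size }
}
```

For example, two relations reading `idx1` at version 2 and one reading `idx2` at version 0 would be reported as `(idx1, 2) -> 2` and `(idx2, 0) -> 1`. Paths that do not contain a version directory are ignored.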