Closed Zouxxyy closed 1 month ago
Is there a configuration to disable nested projection? I am concerned about bugs in nested projection; at the least, we should have an option to disable it.
Yes, Spark has a conf to enable or disable nested schema pruning:
val NESTED_SCHEMA_PRUNING_ENABLED =
  buildConf("spark.sql.optimizer.nestedSchemaPruning.enabled")
    .internal()
    .doc("Prune nested fields from a logical relation's output which are unnecessary in " +
      "satisfying a query. This optimization allows columnar file format readers to avoid " +
      "reading unnecessary nested column data. Currently Parquet and ORC are the " +
      "data sources that implement this optimization.")
    .version("2.4.1")
    .booleanConf
    .createWithDefault(true)
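For anyone who wants to switch it off while testing, here is a minimal sketch of setting that key at session level. The object name, app name, and local master are placeholders for illustration, and whether the reader added in this PR fully respects the flag is an assumption that should be checked against the implementation.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone example; names here are illustrative only.
object DisableNestedSchemaPruning {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("disable-nested-schema-pruning")
      .master("local[1]") // local master assumed for a quick check
      .getOrCreate()

    // Same key as NESTED_SCHEMA_PRUNING_ENABLED above; the default is true.
    spark.conf.set("spark.sql.optimizer.nestedSchemaPruning.enabled", "false")

    // Read the value back to confirm the session picked it up.
    println(spark.conf.get("spark.sql.optimizer.nestedSchemaPruning.enabled"))

    spark.stop()
  }
}
```

The same key can also be passed with --conf when launching spark-sql or spark-submit, so no code change is needed to fall back to reading full nested columns.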
Purpose
to #4209. Support nested column pruning, e.g.
will only obtain
course.grade
from columnar storage formats (Parquet, ORC).
Tests
API and Format
Documentation