An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Which Delta project/connector is this regarding?
Description
Instead of calling SchemaUtils.findNestedFieldIgnoreCase for each column, we prepare a map with SchemaUtils.explode beforehand and perform map lookups during iteration.
This speeds up this function on wide tables. It may still be slow for tables with deeply nested schemas because the path needs to be built every time, but there should be no regression.
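To illustrate the idea only, here is a minimal Python sketch of the "flatten once, then look up" pattern. It is not the Scala code in this PR: the toy dict-based schema and the helpers `find_nested_field_ignore_case`, `explode`, and `build_lookup` are hypothetical stand-ins, and their signatures are assumptions that do not mirror the real SchemaUtils API.

```python
from typing import Dict, Iterator, List, Optional, Tuple, Union

# Toy schema (an assumption for this sketch): a struct is a dict that maps a
# field name to either a type name (str) or a nested struct (dict).
Field = Union[str, Dict[str, "Field"]]
Schema = Dict[str, Field]
Path = Tuple[str, ...]


def find_nested_field_ignore_case(schema: Schema, path: List[str]) -> Optional[Field]:
    """Per-call lookup: scans the fields at every level of the path."""
    current: Field = schema
    for part in path:
        if not isinstance(current, dict):
            return None
        # Linear scan over sibling fields with a case-insensitive comparison.
        match = next((v for k, v in current.items() if k.lower() == part.lower()), None)
        if match is None:
            return None
        current = match
    return current


def explode(schema: Schema, prefix: Path = ()) -> Iterator[Tuple[Path, Field]]:
    """Yields (lower-cased path, field) for every field, including nested ones."""
    for name, field in schema.items():
        path = prefix + (name.lower(),)
        yield path, field
        if isinstance(field, dict):
            yield from explode(field, path)


def build_lookup(schema: Schema) -> Dict[Path, Field]:
    """Flattens the schema once so later lookups are single dict accesses."""
    return dict(explode(schema))


schema: Schema = {"Id": "long", "Address": {"City": "string", "Zip": "string"}}
lookup = build_lookup(schema)  # built once, before the loop

columns = [["id"], ["ADDRESS", "city"], ["address", "missing"]]
for column in columns:
    # The key still has to be built for every column, which is the
    # remaining per-column cost for deeply nested paths.
    key = tuple(part.lower() for part in column)
    assert lookup.get(key) == find_nested_field_ignore_case(schema, column)
```

In this sketch the per-call version scans sibling fields at each level for every column, so its cost grows with the width of the table, while the map version pays for flattening once and then does one hash lookup per column.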
How was this patch tested?
Manual profiling for an alter table add columns query:
Before: (~13s)
After: (~3s)
Does this PR introduce any user-facing changes?