Module: Ingest data with Spark and Microsoft Fabric notebooks
Lab/Demo: 10 - Ingest data with Spark and Microsoft Fabric notebooks
Task: Optimize Delta table writes
Step: 00
Link to Lab Instructions: https://github.com/MicrosoftLearning/mslearn-fabric/blob/main/Instructions/Labs/10-ingest-notebooks.md#optimize-delta-table-writes
In my opinion, the "Optimize Delta table writes" task doesn't make sense as currently written. The results of the "Create a Fabric notebook and load external data" steps are cached, so re-running the same steps is much faster regardless of which Spark config settings we set. Beyond that, the Spark config settings the task wants to demonstrate are already enabled by default: https://learn.microsoft.com/en-us/fabric/data-engineering/delta-optimization-and-v-order?tabs=pyspark As a result, the code in "Optimize Delta table writes" ends up doing exactly the same thing as the code from "Create a Fabric notebook and load external data".
Instead, it would probably make more sense to compare the already-optimized code from "Create a Fabric notebook and load external data" against a version that sets the Spark config optimization parameters to false after restarting the session, so that the results are actually comparable.
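To make the suggestion concrete, a comparison along these lines could replace the current task. This is only a sketch: the config keys are taken from the linked Fabric docs, while the DataFrame name (df) and table names are placeholders standing in for whatever the lab actually loads, and spark is the session object that Fabric notebooks predefine.

```python
import time

# Assumption: run in a Fabric PySpark notebook after restarting the session,
# so no cached results skew the timing. "df" stands for the DataFrame loaded
# earlier in the lab; the table names below are illustrative.

# Disable the write optimizations that Fabric enables by default,
# so the "unoptimized" run actually differs from the optimized one.
spark.conf.set("spark.sql.parquet.vorder.enabled", "false")
spark.conf.set("spark.microsoft.delta.optimizeWrite.enabled", "false")

start = time.time()
df.write.mode("overwrite").format("delta").saveAsTable("taxi_unoptimized")
print(f"Unoptimized write: {time.time() - start:.1f}s")

# Re-enable the defaults and write the same data again for comparison.
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")
spark.conf.set("spark.microsoft.delta.optimizeWrite.enabled", "true")

start = time.time()
df.write.mode("overwrite").format("delta").saveAsTable("taxi_optimized")
print(f"Optimized write: {time.time() - start:.1f}s")
```

Reading both tables back afterwards would then show where the optimizations actually pay off.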
Beyond that, my understanding of the two Spark config optimization parameters is that writing takes roughly 15% longer when they are enabled. So the statement "Now, take note of the run times for both code blocks. Your times will vary, but you can see a clear performance boost with the optimized code." is not accurate; the write should actually take longer with the optimized code. The performance boost should only appear when we read the data.