MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

Default max file size for optimize write seems to be 1 GiB, not 128MB #120764

Closed by KoenVerbeeck 7 months ago

KoenVerbeeck commented 8 months ago

The page says the default file size is 128 MB. However, when I read the session config spark.microsoft.delta.optimizeWrite.binSize, the result is 1073741824 bytes, which is 1 GiB. Loading sample data into a non-partitioned table with optimize write explicitly set to true also gives me a Parquet file of 1 GiB, so the documented 128 MB default appears to be incorrect.
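For anyone who wants to repeat the check, here is a minimal sketch of the conversion. It assumes a notebook session that already exposes a `spark` object and that the config value comes back as a plain byte count, as it did in this report. The helper name `describe_bin_size` is hypothetical and not part of any Spark or Fabric API.

```python
def describe_bin_size(raw_value: str) -> str:
    """Convert a byte-count string into MiB and GiB for easy comparison."""
    size_bytes = int(raw_value)
    mib = size_bytes / 1024**2
    gib = size_bytes / 1024**3
    return f"{size_bytes} bytes = {mib:g} MiB = {gib:g} GiB"


# In a live session (assumption: `spark` is the active SparkSession):
# raw = spark.conf.get("spark.microsoft.delta.optimizeWrite.binSize")
raw = "1073741824"  # the value reported in this issue

print(describe_bin_size(raw))          # 1073741824 bytes = 1024 MiB = 1 GiB
print(describe_bin_size("134217728"))  # 134217728 bytes = 128 MiB = 0.125 GiB
```

The second call shows what the value would look like if the session default matched the documented 128 MB.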

KoenVerbeeck commented 8 months ago

On closer inspection, this page seems to be about Synapse Analytics, not Fabric, which might explain the difference. I was linked to this documentation page from a Fabric doc page, which is incredibly confusing.

ManoharLakkoju-MSFT commented 8 months ago

@KoenVerbeeck Thanks for your feedback! We will investigate and update as appropriate.

RamanathanChinnappan-MSFT commented 7 months ago

@KoenVerbeeck Thanks for your contribution. Please submit your idea at the link below so our production team can review it and update the documentation accordingly. Ideas · Community (azure.com)