k2v-academy / K2View-Academy

Other
0 stars 0 forks source link

Change DbLoad batch size due to restrictions on target database #789

Closed F78aPg0zjWnRKWkYrBd6lShvmAYl8doniMtEnFN closed 1 year ago

F78aPg0zjWnRKWkYrBd6lShvmAYl8doniMtEnFN commented 1 year ago

Fabric 6.5.9

On a customer project, there is a data pipeline to Azure SQL Pool, that is a flavor of SQL Server oriented to DWH workloads (part of Azure Synapse Analytics).

On such pipeline, data was being loaded using the DbLoad actor.

On some loads, the following error was occurring...

Error Message: Caused By: java.sql.BatchUpdateException: 112007;Exceeded the memory limit of 20 MB per session for prepared statements. Reduce the number or size of the prepared statements.

This is a hard memory limit and cannot be changed. See DWH Service Capacity Limits for details.

There was no optimization to be done on the SQL statement and the only option left was to reduce the batch size that DbLoad was sending. See DB Actors for details (e.g. default batch size is 1000).

So, how can the batch size be changed?

F78aPg0zjWnRKWkYrBd6lShvmAYl8doniMtEnFN commented 1 year ago

Add the following section and parameters on config.ini, save and restart Fabric.

[io]
# Default batch size
#IO_JDBC_BATCH_SIZE=1000
# Batch size to address target database prepared statement memory limit error.
IO_JDBC_BATCH_SIZE=500
F78aPg0zjWnRKWkYrBd6lShvmAYl8doniMtEnFN commented 1 year ago

The change above impacts all DB actors that have the batch parameter when it is set as True.

F78aPg0zjWnRKWkYrBd6lShvmAYl8doniMtEnFN commented 1 year ago

Feature request: add batch size parameter to be included on DB actors, where values lower than the default can be set, considering that the need to reduce batch size is related to target.

tZajFGR0CidT8AVERBHw8puD36HY6oWViykmIIb commented 1 year ago

The parameter to update the batch size is hidden in the config.ini, since it rarely is updated. We will consider adding a batch size parameter to the actor.