dlt-hub / dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
https://dlthub.com/docs
Apache License 2.0

sets default concurrency for blob upload for adlfs to 1 #1779

Closed rudolfix closed 2 months ago

rudolfix commented 2 months ago

Description

Concurrent blob uploads

dlt limits the number of concurrent connections for a single uploaded blob to 1. By default, adlfs, which we use, splits blobs into 4 MB chunks and uploads them concurrently, which leads to gigabytes of memory use and thousands of connections for larger load packages. You can increase the maximum concurrency as follows:

[destination.filesystem.kwargs]
max_concurrency=3
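
To see why capping per-blob concurrency matters, here is a rough back-of-the-envelope sketch in Python. It is not adlfs or dlt code: the function names are hypothetical, and the model simply assumes each in-flight chunk holds one 4 MB buffer, which simplifies whatever the real client does internally.

```python
import math

# 4 MB chunk size, as stated in the description above.
CHUNK_SIZE = 4 * 1024 * 1024


def chunk_count(blob_size: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Number of chunks a blob of blob_size bytes is split into."""
    return math.ceil(blob_size / chunk_size) if blob_size > 0 else 0


def peak_in_flight_bytes(
    blob_size: int, max_concurrency: int, chunk_size: int = CHUNK_SIZE
) -> int:
    """Upper bound on bytes buffered at once for one blob (simplified model).

    Assumption: every concurrently uploaded chunk keeps one chunk-sized
    buffer in memory; real clients may buffer more or less than this.
    """
    in_flight = min(chunk_count(blob_size, chunk_size), max_concurrency)
    return min(in_flight * chunk_size, blob_size)


one_gib = 1024 ** 3
chunks = chunk_count(one_gib)

# With max_concurrency=1 only a single chunk is in flight per blob.
limited = peak_in_flight_bytes(one_gib, max_concurrency=1)

# If every chunk were uploaded at once, the whole blob would be buffered.
unbounded = peak_in_flight_bytes(one_gib, max_concurrency=chunks)

print(chunks, limited, unbounded)
```

Under this model, a load package with many large files multiplies the per-blob figure by the number of files uploaded in parallel, which is where the "gigabytes of memory and thousands of connections" come from.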
netlify[bot] commented 2 months ago

Deploy Preview for dlt-hub-docs ready!

Name Link
Latest commit b6a6bf751a30222210b96335dc4640132af239ac
Latest deploy log https://app.netlify.com/sites/dlt-hub-docs/deploys/66d62edee7216800088ec80f
Deploy Preview https://deploy-preview-1779--dlt-hub-docs.netlify.app