apache / iceberg-go

Apache Iceberg - Go
https://iceberg.apache.org/
Apache License 2.0

Add option to set max concurrency for table scan operations #198

Closed glkz closed 1 week ago

glkz commented 1 week ago

This PR adds an option to set MaxConcurrency on the Scanner, letting users control how many downloads the Scanner runs concurrently. This is useful for workloads that run multiple queries simultaneously, where each scan should use only a share of the available resources.
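To illustrate what a concurrency cap means in practice, here is a minimal sketch of bounding in-flight work with a buffered channel used as a semaphore. The names `runBounded` and `peakConcurrency` are hypothetical and exist only for this example; this is not the Scanner's actual implementation, just one common way to enforce such a limit.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runBounded calls fn once per task with at most limit calls in flight.
// Hypothetical helper for illustration; not part of iceberg-go.
func runBounded(tasks []string, limit int, fn func(string)) {
	if limit < 1 {
		limit = 1 // assumption: treat non-positive limits as "one at a time"
	}
	sem := make(chan struct{}, limit) // buffered channel acts as a semaphore
	var wg sync.WaitGroup
	for _, t := range tasks {
		sem <- struct{}{} // blocks while limit goroutines are already running
		wg.Add(1)
		go func(task string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when the task finishes
			fn(task)
		}(t)
	}
	wg.Wait()
}

// peakConcurrency runs nTasks simulated downloads under the given limit and
// returns the highest number of simultaneous calls that was observed.
func peakConcurrency(nTasks, limit int) int {
	tasks := make([]string, nTasks)
	var mu sync.Mutex
	inFlight, peak := 0, 0
	runBounded(tasks, limit, func(string) {
		mu.Lock()
		inFlight++
		if inFlight > peak {
			peak = inFlight
		}
		mu.Unlock()

		time.Sleep(time.Millisecond) // stand-in for a file download

		mu.Lock()
		inFlight--
		mu.Unlock()
	})
	return peak
}

func main() {
	fmt.Println("peak in-flight tasks:", peakConcurrency(32, 4))
}
```

With a limit of 4, the observed peak never exceeds 4 regardless of how many tasks are queued, which is the property a max-concurrency option is meant to give callers.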

This PR also changes the default max concurrency from runtime.NumCPU() to runtime.GOMAXPROCS(0). Both start out equal to the number of available CPUs, but GOMAXPROCS can be adjusted by users, which gives more flexibility. runtime.NumCPU() does not account for cgroup limits: in Kubernetes, for instance, it can report every CPU core on the node rather than only the cores allocated to the pod. Sizing concurrency from that number can degrade performance in Kubernetes-deployed applications. See uber-go/automaxprocs.