grosser / parallel

Ruby: parallel processing made simple and fast
MIT License
4.16k stars 254 forks source link

Use cgroups aware processor count by default #348

Closed y-yagi closed 2 months ago

y-yagi commented 2 months ago

In containerized environments, a number of CPU cores isn't the same as the available CPUs. In this case, we need to consider cgroups.

concurrent-ruby now has the method to get that since v1.3.1. I think it's better to use this for setting more container environment friendly default value. Ref: https://github.com/ruby-concurrency/concurrent-ruby/pull/1038

I keep Parallel.processor_count as is to keep supporting setting processor count via env(PARALLEL_PROCESSOR_COUNT).

grosser commented 2 months ago

1.26.0

ahorek commented 2 months ago

this change causes "negative array size" errors because

> Concurrent.available_processor_count.floor
 => -1
> Concurrent.physical_processor_count
 => 6
> Concurrent.processor_count
 => 12
> Etc.nprocessors
 => 12
> Concurrent.cpu_quota.floor
 => -1

according to docs cpu.cfs_quota_us = -1: Unlimited CPU time. cpu.cfs_quota_us = 0: No CPU time (tasks are blocked). cpu.cfs_quota_us > 0: Limited CPU time as specified.

it looks like a bug in https://github.com/ruby-concurrency/concurrent-ruby/pull/1038

grosser commented 2 months ago

Ah great 😞 I'll revert in a bit

On Thu, Aug 8, 2024, 3:39 PM Pavel Rosický @.***> wrote:

this change causes "negative array size" errors because

Concurrent.available_processor_count.floor => -1 Concurrent.physical_processor_count => 6 Concurrent.processor_count => 12 Etc.nprocessors => 12 Concurrent.cpu_quota.floor => -1

according to docs cpu.cfs_quota_us = -1: Unlimited CPU time. cpu.cfs_quota_us = 0: No CPU time (tasks are blocked). cpu.cfs_quota_us > 0: Limited CPU time as specified.

it looks like a bug in ruby-concurrency/concurrent-ruby#1038 https://github.com/ruby-concurrency/concurrent-ruby/pull/1038

— Reply to this email directly, view it on GitHub https://github.com/grosser/parallel/pull/348#issuecomment-2275859126, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAACYZ7XFZY7ZRDIYDF5QRDZQNYH7AVCNFSM6AAAAABMFRSALOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENZVHA2TSMJSGY . You are receiving this because you modified the open/close state.Message ID: @.***>

ahorek commented 2 months ago

You can fall back to using Etc.nprocessors if Concurrent.available_processor_count returns an invalid value, such as -1 (negative) or 0.

grosser commented 2 months ago

@y-yagi any idea why we'd get negative numbers ? ... and thoughts on if we should discard them to use Etc or what they mean ?

more feedback in https://github.com/grosser/parallel/issues/349 ... revert, yanked 1.26.0 (and released 1.26.1 so local caches get purged too)

y-yagi commented 2 months ago

@ahorek Thanks for your feedback!

@grosser As @ahorek described, I think this is a bug of concurrent-ruby. I will try to fix it first.

eregon commented 2 months ago

Concurrent.available_processor_count is now fixed upstream and released: https://github.com/ruby-concurrency/concurrent-ruby/pull/1060 Sorry for the bug!

grosser commented 2 months ago

re-released as v1.26.2 🤞