Open dirksammel opened 6 days ago
Hi @dirksammel, are you using the Standardiser/Limiter decorator to limit the demand or how do you set the anticipated number of drones running in parallel?
Yes, we're using the Limiter:
pipeline:
  - !RelativeSupplyController
    low_utilisation: 0.75
    high_allocation: 0.75
    high_scale: 1.5
    low_scale: 0.9
  - !Limiter
    minimum: 40
    maximum: 1000
  - !Logger
    name: 'changes'
  - !TardisPoolFactory
    configuration: 'tardis.yml'
We see in our logs that the demand sometimes ends up slightly larger (or smaller) than the configured limit due to floating-point errors, for example:
Sep 11 14:46:36 arc3.bfg.uni-freiburg.de docker-COBalD-Tardis-atlprd[223200]: nemo_tardis_c40m100: 2024-09-11 12:46:36 demand = 1000.0000000000001 [demand=1000.0000000000001, supply=880.0, utilisation=0.78, allocation=0.78]
As a consequence, a new drone is started and an old one begins to drain. If the draining takes some time and the floating-point issue occurs again in the meantime, more drones than anticipated will run in parallel.
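To illustrate what we mean, here is a minimal Python sketch of the effect and of one possible guard. It is plain Python and does not use any COBalD or TARDIS API; the helper name `stabilise_demand` and its `ndigits` parameter are hypothetical and only meant to show the idea of rounding away the noise before the demand is compared against the limits.

```python
def stabilise_demand(demand, minimum, maximum, ndigits=6):
    """Hypothetical helper (not part of COBalD/TARDIS).

    Rounds away floating-point noise, then clamps the result into
    the closed interval [minimum, maximum].
    """
    rounded = round(demand, ndigits)
    return min(max(rounded, minimum), maximum)


# The value from our log is a distinct float just above the limit,
# so a plain comparison treats it as exceeding the maximum.
logged_demand = 1000.0000000000001
exceeds_limit = logged_demand > 1000

# After rounding and clamping, the demand sits exactly on the limit.
stable_high = stabilise_demand(logged_demand, minimum=40, maximum=1000)

# The same applies at the lower end (assumed example value).
stable_low = stabilise_demand(39.99999999999999, minimum=40, maximum=1000)
```

Whether something like this belongs in the Limiter, in the pool, or somewhere else in the pipeline is a question for the maintainers; the sketch only shows that a tiny excess over the limit is enough for a comparison to see a changed demand.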