Closed mschnepf closed 2 years ago
@mschnepf I would suggest to wait for @maxfischer2781's review before merging.
@RHofsaess the demand
has a range of 0
to max float
. Usually, a standardized is used to define a range demand >1
to get at least one resource
. If the latency of the resource requests is low, demand
and supply
will be similar. If the latency of the resource requests is high (e.g. long job queue in an HPC center), demand
and supply
will differ more over time.
The change in demand
via the multiplication factor in line 123 will be used when the supply
is bigger than 100. Therefore, the change will be bigger than 1 (change by the other controller) as long as the demand
/ supply
is higher than 10. This should be usually the case.
Haven't forgotten about this, just didn't have time to reply properly yet. Stay tuned.
I do not really get what is the invalid part in the documentation. For me it seems perfectly fine to define this via supply. @mschnepf could you explain what is the issue and what you would expect otherwise? I also don't see an issue with slowly increasing demands that are dependent on maybe slowly increasing/decreasing supply.
The main issue I see is that you might artificially raise/lower the demand completely independent of reality aka supply. In some cases supply will not be able to match the demand at all. What is it, that you intend? Rising priority?
@mschnepf Can you clarify why you think depending on the supply
is wrong? Not needlessly increasing the demand
when the resource cannot actually satisfy it doesn't seem wrong to me.
I'm currently against making this change. The example is in effect a fixed/ratio overbooking; it might or might not apply to the HEP usecase, but it seems like a desirable strategy for many situations.
TARDIS requests more resources when the demand >= supply+size_of_resource
. In the current documentation of the stepwise controller, the demand
is the supply plus an offset. If this offset is smaller than the size of a resource no further resources will be requested. In such a case, the demand
and supply
stay at that level forever.
COBalD config:
from cobald.controller.linear import LinearController
from cobald.decorator.limiter import Limiter
from cobald.decorator.logger import Logger
from cobald.decorator.standardiser import Standardiser
from tardis.resources.poolfactory import create_composite_pool
from cobald.controller.stepwise import *
@stepwise
def control(pool: Pool, interval):
return 10
# additional rules above specific supply thresholds
@control.add(supply=1)
def start(pool: Pool, interval):
if pool.utilisation < 0.45:
return pool.supply - 1
elif pool.allocation > 0.9:
return pool.supply + 1
@control.add(supply=200)
def linear(pool: Pool, interval):
if pool.utilisation < 0.8:
return pool.supply - 1
elif pool.allocation > 0.9:
return pool.supply + 1
pipeline = control.s(interval=1) >> Limiter.s(minimum=1) >> Logger.s(name='changes') >> create_composite_pool('tardis_config.yml')
TARDIS config:
BatchSystem:
adapter: FakeBatchSystem
allocation: 1.0
utilisation: 1.0
machine_status: Available
Sites:
- name: test-site
adapter: FakeSite
quota: 1000
test-site:
StartupCommand: startVM.py
api_response_delay: !RandomGauss
mu: 0.1
sigma: 0.0
resource_boot_time: !RandomGauss
mu: 1
sigma: 0
MachineTypes:
- short
MachineTypeConfiguration:
short:
Walltime: '1:00:00'
delay: 2
MachineMetaData:
short:
Cores: 8
Memory: 110
Disk: 100
I think the basic example in that documentation should be with demand
, similar to the liner controller. In addition, an example with supply can be shown.
TARDIS requests more resources when the
demand >= supply+size_of_resource
. In the current documentation of the stepwise controller, thedemand
is the supply plus an offset. If this offset is smaller than the size of a resource no further resources will be requested. In such a case, thedemand
andsupply
stay at that level forever.
One short question: In the context of TARDIS this might be perfectly valid. But this is the documentation of COBalD explaining the basic concept of the LinearController
. For this purpose I don't see any issues.
You might consider extending the documentation of TARDIS instead.
Keeping in mind the motivation for this change, would it be a compromise just to suggest a larger supply
increase? The initial idea of the example was to show "10% or at least 10", not the "10% or at least 1" that actually made it into the docs.
@control.add(supply=10)
def quantized(pool: Pool, interval):
if pool.utilisation < 0.5:
return pool.supply - 10
elif pool.allocation > 0.5:
return pool.supply + 10
TARDIS uses a Standardiser with granularity= number of CPU-cores
. Therefore, further resources are only requested if demand>supply+granularity
. In the case of the shown example of the stepwise controller, the demand would not increase if the granularity
is bigger than 1 while the supply
is below 200. I created a PR for TARDIS https://github.com/MatterMiners/tardis/pull/209 which removes the granularity
per default.
Therefore, I'm fine with the original example of the stepwise controller. Is a comment at the standardized useful to describe this behaviour?
The
supply
is defined by the resources and is usually constant, as it is in the current documentation. Therefore, it can happen that thedemand
increase very slow. It is better to set the newdemand
based on the current demand.