Open tomkerkhove opened 2 years ago
I'm leaning towards using the Green Web Foundation's Go SDK but open to thoughts
I like this idea!
A couple of questions:
- How does the carbon API help? If all the deployments are in the same data center/same geo, does carbon intensity change anything in compute? It looks to me like the best use case is the multi-cluster, multi-data-center case (see some prior study here)
It depends on the workload but some "secondary"/low-prio workloads can just be scaled down in a given geo if the impact on the environment is too high. This is not specifically a multi-cluster scenario.
- Does it make sense to schedule every deployment or only selectively pick those that have a high energy consumption, thus high carbon impact ones? That brings the question of how to measure the energy consumption of deployments.
That's up to the end-user; the triggers are specific to a ScaledObject and thus on a per-workload basis. So it's up to you to choose what makes sense and what does not.
For example, workloads that require GPU could be scaled down while lesser consuming workloads can continue to run.
@tomkerkhove Thanks for adding this and improving the design. So we can implement this scaler sooner and combine with more scalers once the OR support is in place.
I'm leaning towards using the Green Web Foundation's Go SDK but open to thoughts
Of course we'd love it if you can use the SDK!
Just a heads up that we need to make some breaking changes in https://github.com/thegreenwebfoundation/grid-intensity-go/issues/44 to be able to support more providers of carbon intensity data.
I'm working on the changes and they should be done soon. So I hope they won't be disruptive.
Thanks for adding this and improving the design. So we can implement this scaler sooner and combine with more scalers once the OR support is in place.
Correct!
Just a heads up that we need to make some breaking changes in https://github.com/thegreenwebfoundation/grid-intensity-go/issues/44 to be able to support more providers of carbon intensity data.
I'm working on the changes and they should be done soon. So I hope they won't be disruptive.
Good to know, thanks for sharing! If you are contributing to the SDK, are you willing to contribute the scaler as well?
Hi @rootfs, I've been working with @rossf7 on the grid-intensity-go SDK. I've tried to provide some more background in the answers.
How does the carbon API help? If all the deployments are in the same data center/same geo, does carbon intensity change anything in compute? It looks to me like the best use case is the multi-cluster, multi-data-center case (see some prior study here)
The above example works by moving workloads geographically (as in, it moves them through space).
You can also move workloads temporally (as in move them through time).
The carbon intensity changes based on the time of day, so the same workload run at different times will have different emissions figures.
The issue referred to one paper titled Let's Wait Awhile: How Temporal Workload Shifting Can Reduce Carbon Emissions in the Cloud, and it's a fun read, going into this in more detail.
At the recent ACM SIGEnergy workshop last month, there was a talk from some folks at VMware sharing some new findings, called Breaking the Barriers of Stranded Energy through Multi-cloud and Federated Data Centers. It's really worth a watch, but this quote from the abstract gives an idea of why the time element is worth being able to act upon:
many computation workloads (such as some learning or big data) can be flexible in time (scheduled for delayed execution) and space (transferred across any geographical distance with limited cost). This opens the possibility of shifting workloads in time and space to take advantage in real time of any amount of excess renewable energy, which otherwise would be curtailed and wasted. Initial results show that a single datacenter that time shifts load can reduce its emissions by 19% or more annually
There's also some work by Facebook/Meta, where they have shared some results from using this same carbon-aware workload scheduling as part of their sustainability strategy - see their recent carbon explorer repo. I think they might use their own scheduler, rather than Kubernetes, but the principle is the same - move work through space to make the most of cheaper green energy for your compute.
Does it make sense to schedule every deployment or only selectively pick those that have a high energy consumption, thus high carbon impact ones? That brings the question of how to measure the energy consumption of deployments.
For the suitability question, that's down to the person running the cluster, and the job. Some jobs are better fits for moving through time (low latency, pause-able jobs), and some jobs better for moving through space (ones that don't have to be run within a specific jurisdiction). These are somewhat independent of the energy consumption. If you're curious about the energy consumption part, I think Scaphandre provides some numbers you can use and labelling of jobs for k8s, and this piece here from the BBC gives an example of it in use.
Hope that helps!
If you are contributing to the SDK; are you willing to contributing the scaler as well?
@tomkerkhove Yes definitely, I'd like to contribute the scaler. We need to finish up the SDK changes and some other dev but I should be able to start on this later in the month.
After discussing with @vaughanknight & @yelghali I've noticed that my proposal for just having a trigger does not make much sense because it will scale straight from min to max replicas given the emission does not change that often.
Instead, I'm wondering if we should not make this part of the `ScaledObject`/`ScaledJob` definition as a whole, similar to how we handle fallback:
Imagine the following:
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {scaled-object-name}
spec:
  scaleTargetRef:
    name: {name-of-target-resource} # Mandatory. Must be in the same namespace as the ScaledObject
  maxReplicaCount: 100 # Optional. Default: 100
  environmentalImpact:
    carbon:
    - measuredEmission: 5%
      allowedMaxReplicaCount: 50
    - measuredEmission: 10%
      allowedMaxReplicaCount: 10
  fallback: # Optional. Section to specify fallback options
    failureThreshold: 3 # Mandatory if fallback section is included
    replicas: 6 # Mandatory if fallback section is included
  triggers:
  # {list of triggers to activate scaling of the target resource}
```
This allows end-users to define how their application should scale based on its needs by defining triggers. If we have to control how it should adapt based on the carbon emission, then they can define `measuredEmission` and its corresponding `allowedMaxReplicaCount`.
So if the emission is 5%, then the maximum replicas of 100 is overruled to 50 and:

* An event is tracked on the ScaledObject
* The HPA is updated to use 50 as max replica
* A CloudEvent is emitted

If the emission is lower than 5%, then it will go back to 100 max replicas.
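To make the override semantics concrete, here is a minimal sketch in Go of how the rule resolution could work. The type and function names are hypothetical (this is not the actual KEDA implementation): pick the `allowedMaxReplicaCount` of the highest `measuredEmission` threshold that the current emission reaches, and fall back to the regular `maxReplicaCount` when no threshold is reached.

```go
package main

import "fmt"

// CarbonRule mirrors one entry under environmentalImpact.carbon
// (hypothetical type; the real CRD fields may differ).
type CarbonRule struct {
	MeasuredEmission       float64 // threshold, e.g. 5 for "5%"
	AllowedMaxReplicaCount int32
}

// effectiveMaxReplicas returns the max replica count after applying the
// carbon rules: the rule with the highest threshold that the current
// emission reaches wins; below all thresholds the default applies.
func effectiveMaxReplicas(emission float64, rules []CarbonRule, defaultMax int32) int32 {
	result := defaultMax
	best := -1.0
	for _, r := range rules {
		if emission >= r.MeasuredEmission && r.MeasuredEmission > best {
			best = r.MeasuredEmission
			result = r.AllowedMaxReplicaCount
		}
	}
	return result
}

func main() {
	rules := []CarbonRule{
		{MeasuredEmission: 5, AllowedMaxReplicaCount: 50},
		{MeasuredEmission: 10, AllowedMaxReplicaCount: 10},
	}
	fmt.Println(effectiveMaxReplicas(3, rules, 100))  // below all thresholds: 100
	fmt.Println(effectiveMaxReplicas(7, rules, 100))  // reaches the 5% rule: 50
	fmt.Println(effectiveMaxReplicas(12, rules, 100)) // reaches the 10% rule: 10
}
```

With the example spec above, an emission of 7% would cap the HPA at 50 replicas, and 12% at 10 replicas.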
Any thoughts on this @rossf7 / @zroubalik / @jorturfer?
Should we do this instead of a carbon aware scaler? No. But I think that one only makes sense once we do https://github.com/kedacore/keda/issues/3567 and with the above proposal we don't need a trigger for it anymore.
I think it would make sense to have both features. One small point about "Carbon Awareness":
- The info provided by the APIs (WattTime, ElectricityMap, etc.) is the electricity carbon intensity (I) in gCO2eq/kWh, i.e. how much carbon is contained in the electricity / power we are using now; it is an external metric (independent of the workloads themselves)
- The energy or power consumed by a workload (or a pod) is (E) in kWh
- Carbon emissions of the pod = (E × I) in gCO2eq
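Spelling that formula out in Go (a toy calculation with illustrative numbers only, not part of any SDK): the emissions attributable to a pod are simply the energy it consumed multiplied by the grid carbon intensity over the same period.

```go
package main

import "fmt"

// podEmissions returns the carbon emissions of a pod in gCO2eq, given
// the energy it consumed (E, in kWh) and the grid carbon intensity it
// ran under (I, in gCO2eq/kWh).
func podEmissions(energyKWh, intensityGCO2PerKWh float64) float64 {
	return energyKWh * intensityGCO2PerKWh
}

func main() {
	// e.g. a pod that drew 0.5 kWh while the grid was at 34 gCO2eq/kWh
	fmt.Println(podEmissions(0.5, 34)) // 17 gCO2eq
}
```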
A proposal for using both the "Core Carbon Awareness" proposed above and the "Carbon Aware Trigger":

- The core awareness feature is about "electricity carbon intensity" and would control the max replica counts: how far we scale to, depending on the external environment / electricity carbon intensity (I), cf. the proposal you suggested @tomkerkhove
- The "Carbon Aware Scaler / trigger" would be a "proper" scaler and would scale replicas based on pod power (E): e.g. scale by 1 when the average power of a pod is greater than x
- Later, the "Carbon Aware Scaler" could also scale (in or out) based on pod or workload carbon emissions (gCO2eq), as it would be another internal metric
- Cf. https://github.com/intel/platform-aware-scheduling/tree/master/telemetry-aware-scheduling/docs/power#5-create-horizontal-autoscaler-for-power: in this project, power and heat are treated as metrics (triggers) alongside CPU, RAM, etc.

In terms of adoption, I think the "Core Carbon Awareness" is simpler to adopt because it does not require the customers / companies to have power telemetry available (which only a few customers have, as of now). On the other hand, the "Carbon Aware Scaler" is also interesting because it offers actual power / carbon metrics for the workloads, and it would fit with the AND / OR logic with other scalers.
PS: a suggestion for the fields / usage of the "Core awareness feature":
- Rename `measuredEmission` to `electricity_intensity_gco2_per_kwh`, as I think "measured emissions" would be read as carbon emissions (gCO2eq)
- For the value, I think the user should have the option to set an absolute value in addition to the %, because from what I've seen the APIs provide a number for the carbon intensity (e.g. 34); using the % is more interesting but I think it takes more work to implement.
- "the Carbon Aware Scaler / trigger", would be a "proper" scaler and would scale replicas based on pod power (E): e.g scale by 1 when average power of pod, is greater than x
- later the "Carbon aware scaler" could also scale (In or OUT) based on pod or workload carbon emissions (gCO2eq) as it would be another internal metric
- c.f https://github.com/intel/platform-aware-scheduling/tree/master/telemetry-aware-scheduling/docs/power#5-create-horizontal-autoscaler-for-power --> in this project, power, heat are considered as other metrics (triggers) CPU, Ram, etc.
We can add this, but before we start building scalers we'd need to be sure what they look like, as once a scaler is added we can't simply introduce breaking changes.
However, if my above proposal is agreed on then we can open a separate issue for it.
PS: a suggestion for the fields / usage of the "Core awareness feature"
- Rename `measuredEmission` to `electricity_intensity_gco2_per_kwh`, as I think "measured emissions" would be read as carbon emissions (gCO2eq)
I think this is something we can document in the details; no need to be that verbose IMO. We can rename it to `measuredIntensity` though.
Agreed, the scaler can be the next step. The proposal above has value and would be easy to build.
Imagine the following:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {scaled-object-name}
spec:
  scaleTargetRef:
    name: {name-of-target-resource} # Mandatory. Must be in the same namespace as the ScaledObject
  maxReplicaCount: 100 # Optional. Default: 100
  environmentalImpact:
    carbon:
    - measuredEmission: 5%
      allowedMaxReplicaCount: 50
    - measuredEmission: 10%
      allowedMaxReplicaCount: 10
  fallback: # Optional. Section to specify fallback options
    failureThreshold: 3 # Mandatory if fallback section is included
    replicas: 6 # Mandatory if fallback section is included
  triggers:
  # {list of triggers to activate scaling of the target resource}
```

This allows end-users to define how their application should scale based on its needs by defining triggers. If we have to control how it should adapt based on the carbon emission, then they can define `measuredEmission` and its corresponding `allowedMaxReplicaCount`. So if the emission is 5%, then the maximum replicas of 100 is overruled to 50 and:

* An event is tracked on the ScaledObject
* The HPA is updated to use 50 as max replica
* A CloudEvent is emitted

If the emission is lower than 5%, then it will go back to 100 max replicas.
We need to keep in mind the KEDA users are exposing their services to end-users. The end-user, at the end, wants quality of service (shareholders too). We can justify a lower quality of service for a certain period of time, but the service needs to be usable. So, limiting the number of replicas to a fixed value does not seem appropriate to me at all.
It would seem more relevant to me to apply a relative decline to the scaling rule: not in absolute replica count (pods), but in relative replica count (%).
Imagine the following example:
```yaml
spec:
  ...
  environmentalImpact:
    carbon:
    - measuredIntensity: 400
      reducedReplicaPercent: 50%
    - measuredIntensity: 200
      reducedReplicaPercent: 25%
    - measuredIntensity: 50
      reducedReplicaPercent: 10%
  triggers:
  ...
```
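A minimal Go sketch of how this relative variant could be evaluated (hypothetical type and field names, mirroring the spec fragment above rather than any real KEDA code): apply the reduction of the highest `measuredIntensity` threshold the current grid intensity reaches, as a percentage of the regular max replica count.

```go
package main

import (
	"fmt"
	"math"
)

// IntensityRule mirrors one entry in the relative proposal above
// (hypothetical field names).
type IntensityRule struct {
	MeasuredIntensity     float64 // gCO2eq/kWh threshold
	ReducedReplicaPercent float64 // reduction to apply, e.g. 50 for 50%
}

// scaledMaxReplicas applies the largest reduction whose intensity
// threshold the current grid intensity reaches; below all thresholds
// the full maxReplicas is kept.
func scaledMaxReplicas(intensity float64, rules []IntensityRule, maxReplicas int32) int32 {
	reduction := 0.0
	best := -1.0
	for _, r := range rules {
		if intensity >= r.MeasuredIntensity && r.MeasuredIntensity > best {
			best = r.MeasuredIntensity
			reduction = r.ReducedReplicaPercent
		}
	}
	return int32(math.Ceil(float64(maxReplicas) * (1 - reduction/100)))
}

func main() {
	rules := []IntensityRule{
		{MeasuredIntensity: 400, ReducedReplicaPercent: 50},
		{MeasuredIntensity: 200, ReducedReplicaPercent: 25},
		{MeasuredIntensity: 50, ReducedReplicaPercent: 10},
	}
	fmt.Println(scaledMaxReplicas(30, rules, 100))  // below all thresholds: 100
	fmt.Println(scaledMaxReplicas(250, rules, 100)) // reaches 200: 75
	fmt.Println(scaledMaxReplicas(450, rules, 100)) // reaches 400: 50
}
```

Because the reduction is relative, the same rules keep a small workload (say `maxReplicaCount: 4`) proportionally usable instead of pinning it to a fixed absolute cap.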
A proposal to donate AKS's carbon aware operator is open on https://github.com/kedacore/keda/issues/4463
Proposal
Provide a carbon aware scaler that allows end-users to scale based on their impact on the environment.
As per @rossf7 on https://github.com/kedacore/keda/issues/3381:
Also:
Use-Case
Automatically scale workloads out while the impact on the environment is low, scale in if the impact is too high.
This is useful for batch-like workloads.
Anything else?
Relates to https://github.com/kedacore/keda/issues/3381
Related to our collaboration with the Environmental Sustainability TAG/WG (https://github.com/kedacore/governance/issues/59)