threefoldtech / home

Starting point for the threefoldtech organization
https://threefold.io
Apache License 2.0

Power management (NEEDS TO BE DELETED) #1303

Closed xmonader closed 1 year ago

xmonader commented 2 years ago

With current energy prices, we need to find a way to turn off nodes while still avoiding abuse.

The currently favored solution is Wake-on-LAN. However, this comes with some constraints: farms need to be location based, with nodes physically in the same LAN, and farms need to keep some hot capacity always available for provisioning. The remainder can be cold capacity that is subject to random power on/off procedures.

issues

delandtj commented 2 years ago

Well I was going to create the issue here, but I'll add comments here

delandtj commented 2 years ago

Ability to poweroff nodes in the Grid

Preamble

The energy cost of a node is always something to take into account, but nowadays, and especially in Europe, it is of vital importance for the viability of the grid itself.
The cost of running a node, apart from networking overhead, is now basically the highest investment over 5 years, more so than the investment in the hardware itself. While it doesn't seem like much when calculated per month, future prices, even when 'the new normal' sets in with a new equilibrium, will be nowhere near the prices of a year ago.

The pursuit of dramatically lowering the energy consumption of the Grid in order to be green(er) now has another incentive: money.

While we all wish all nodes have their cpus, memory and storage maxed out and the world IT community is knocking on all our doors for more, we're not there yet.

Some farms are 'just' online. No workloads. Just generating tokens for the farmer. In principle, these nodes represent the farmer's investment in the size of the Grid, and while some costs are involved, like housing and networking, there should be no need for nodes that have no workloads running to be powered on in the first place.

Powering nodes in function of necessity

Requirements

Powering off a node should be straightforward: When a node has no workload at all, it can be shut down.

Question:

Powering on a node will need to be handled by a smarter provisioning scheme that depends on the size of the farm.

We only support powering off nodes in farms that have 1+n nodes in the same network.

One node in a farm will always be on, hosting the poweron service.

To keep powered-off nodes off as long as possible, finding a node to deploy a workload on becomes a bit more hairy, and we'll need a lot of verification that a recently powered-on node has started properly and is capable of hosting workloads.

Powered-off nodes will generate the same amount of tokens as if powered on, but need to be powered on regularly (randomly?) to formally acknowledge their existence.

Given the sheer number of interfaces that PDU brands have, it would be virtually impossible to support them all, so powering off/on should not be done with a PDU. Moreover, we can't assume that a farmer always has a PDU for powering his nodes. So no PDU.
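The requirements above can be sketched as a small eligibility check. This is only an illustration under stated assumptions: the `FarmNode` type, its field names, and the function name are hypothetical, and the slice is assumed to contain only nodes of one farm that share the same network.

```rust
/// Minimal view of a node for this sketch; all field names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct FarmNode {
    pub id: u32,
    /// Number of workloads currently running on the node.
    pub workloads: u32,
    /// True for the node that hosts the power-on service and must stay on.
    pub hosts_poweron_service: bool,
}

/// Returns the ids of the nodes that may be shut down under the rules above.
pub fn nodes_eligible_for_poweroff(farm: &[FarmNode]) -> Vec<u32> {
    // Only farms with 1+n nodes (at least two) in the same network are supported.
    if farm.len() < 2 {
        return Vec::new();
    }
    // One node must always stay on to host the power-on service.
    if !farm.iter().any(|n| n.hosts_poweron_service) {
        return Vec::new();
    }
    // A node can be shut down only when it has no workload at all.
    farm.iter()
        .filter(|n| n.workloads == 0 && !n.hosts_poweron_service)
        .map(|n| n.id)
        .collect()
}
```

The check is deliberately conservative: if no node is marked as hosting the power-on service, nothing is switched off.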

Technical

delandtj commented 2 years ago

I added 'Technical' so that people can fill in implementation details and issues in the concerned repos.

maxux commented 2 years ago

Just checked, acpid is not needed, we can handle that via zinit directly :)

muhamadazmy commented 2 years ago

Suggestion on how this can work on ZOS.

note about chain state:

If a node is woken up and finds that its target state is Down, it can simply send the uptime report and go back to sleep automatically. This will make it easier for the power manager to randomly nudge nodes to prove their existence.
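A minimal sketch of that boot-time decision, assuming the node can read its target state from the chain. The `BootAction` enum and the `on_boot` function are hypothetical names used only for illustration.

```rust
/// Target power state of a node as stored on chain.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PowerTarget {
    Up,
    Down,
}

/// What the node does right after booting (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum BootAction {
    /// Stay up and be available for workloads.
    StayUp,
    /// Send the uptime report, then power off again.
    ReportUptimeThenSleep,
}

/// Decides what a freshly booted node should do, based on its target state.
pub fn on_boot(target: PowerTarget) -> BootAction {
    match target {
        PowerTarget::Up => BootAction::StayUp,
        PowerTarget::Down => BootAction::ReportUptimeThenSleep,
    }
}
```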

muhamadazmy commented 2 years ago

The idea behind having the power manager send the power-off decision to the node, although the node could just check its own target state, is that this validates that the target node is reachable by the power manager, and hence can be woken up again.

This solution makes it okay to have multiple LANs that join the same farm. A farmer can then have a power manager selected in each LAN with no issue.

DylanVerstraete commented 2 years ago

I think we also need to rework the way deployments are created. I think the user needs to have an agreement with a farm rather than with a node, since a user can only use online nodes in a farm and doesn't really know in advance which nodes these are. If we keep supporting the NodeContract(nodeID), a user can create a node contract for a sleeping node in a farm and never have its workload deployed.

I think the managing node should also act as provisioning manager. The user should be able to create a contract with a Farm and the managing node should see this contract being created and redirect the contract to a node that is able to accept the workload.

Maybe if we keep the NodeContract the chain can actually check if the node is up or down and return an error to the user in case the node is down.

muhamadazmy commented 2 years ago

The user can know the state of the node from the TargetState of the node (Up or Down), so he is free to choose a node that is Up from the start. If a user chooses a node that is Down, the chain can then set the state of the node to Up. Bringing the node up fully will take some time, of course. Hence the user needs to know that he has to wait until the node is up before talking to the node directly.
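A rough sketch of that behaviour, assuming the chain keeps both a target and a current power state per node. The types and the signature below are hypothetical and not the actual tfchain API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PowerTarget {
    Up,
    Down,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PowerState {
    Up,
    Down,
}

/// Simplified node record (hypothetical).
pub struct Node {
    pub target: PowerTarget,
    pub state: PowerState,
}

/// Result of creating a contract against a node (hypothetical).
#[derive(Debug, PartialEq)]
pub enum ContractOutcome {
    /// The node is already up; the user can talk to it right away.
    Ready,
    /// The node has been asked to wake up; the user must wait until it reports Up.
    WaitForNode,
}

/// Creating a contract on a node always sets its target to Up.
pub fn create_node_contract(node: &mut Node) -> ContractOutcome {
    node.target = PowerTarget::Up;
    match node.state {
        PowerState::Up => ContractOutcome::Ready,
        PowerState::Down => ContractOutcome::WaitForNode,
    }
}
```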

DylanVerstraete commented 2 years ago

So per discussion with azmy;

We can extend the code on the chain where the create_node_contract takes into account the following things:

Questions

This brings up the question if we actually need to specify a node id on contract creation or actually a farm ID. If we provide a farm ID and Resources to contract creation the chain can select the node for the user.

This also is in contrast with the proposed solution for capacity planning here: https://github.com/threefoldtech/home/issues/1304#issuecomment-1245225199

LeeSmet commented 2 years ago

The main problem I have with a farmer-elected manager is that it introduces a single point of failure in the design. On the contrary, if the logic to select nodes to power off is idempotent, no single central manager is required. Depending on farm size, multiple nodes can be left operational, which can then use a slot-based leadership system to decide who will handle which events.
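One possible reading of a slot-based scheme, sketched under assumptions: every operational node sees the same list of operational node ids and the same numeric event id, and each derives the handler deterministically. The function name and the modulo-based slot choice are illustrative only and are not taken from the thread.

```rust
/// Picks which operational node handles a given event.
/// Every node computes the same answer from the same inputs,
/// so no central manager has to be elected.
pub fn handler_for_event(operational_nodes: &[u32], event_id: u64) -> Option<u32> {
    if operational_nodes.is_empty() {
        return None;
    }
    // Sort and deduplicate so the result does not depend on input order.
    let mut ids = operational_nodes.to_vec();
    ids.sort_unstable();
    ids.dedup();
    let slot = (event_id % ids.len() as u64) as usize;
    Some(ids[slot])
}
```

Because the function is pure, calling it repeatedly or on different nodes gives the same handler, which is the idempotency property asked for above.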

DylanVerstraete commented 2 years ago

@LeeSmet what about the capacity planning? What are your thoughts on above comment?

delandtj commented 2 years ago

There are two types of events:

Another thing: naming is important: we already have down as not reachable, or not available in any way. Shouldn't we call it 'sleeping' or something like that ?

delandtj commented 2 years ago

Contracts per farm, indeed, that way no-one can generate workloads like Network Resources just to start all nodes in a farm

muhamadazmy commented 2 years ago

@delandtj

If we enforce the rule that a single farm needs to exist on the same LAN, then indeed we can drop most of the complexity: nodes can listen for their own power-off signal, making the grid the sole manager of power management. Bringing a node up should then generate an event that can be picked up by all nodes in the same farm, so they can all generate the magic packet to wake up their sleeping friend.
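For reference, a minimal Rust sketch of the magic packet mentioned above. The Wake-on-LAN payload layout (6 bytes of 0xFF followed by 16 repetitions of the target MAC address) is standard; the function names, the use of a UDP broadcast, port 9, and the broadcast address are assumptions for illustration and not necessarily what zos does.

```rust
use std::io;
use std::net::UdpSocket;

/// Builds the 102-byte Wake-on-LAN payload for the given MAC address.
pub fn magic_packet(mac: [u8; 6]) -> [u8; 102] {
    // Start with all 0xFF; the first 6 bytes stay as the sync stream.
    let mut packet = [0xFFu8; 102];
    // Followed by the MAC address repeated 16 times.
    for i in 0..16 {
        let start = 6 + i * 6;
        packet[start..start + 6].copy_from_slice(&mac);
    }
    packet
}

/// Broadcasts the magic packet on the local network (assumed: UDP port 9).
pub fn send_wake_on_lan(mac: [u8; 6]) -> io::Result<usize> {
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    socket.set_broadcast(true)?;
    socket.send_to(&magic_packet(mac), "255.255.255.255:9")
}
```

Since every awake node in the farm may send the packet, duplicates are harmless: a node that is already up simply ignores it.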

We then still need to discuss how the grid is going to decide which nodes need to go to sleep, and under what conditions it can bring them up again.

Also, regarding having the deployment contract with the farm itself and not the node: the grid then still needs to select a node and assign it to the contract (and possibly bring it up), which means capacity planning has to happen entirely on the chain (which I don't mind if we already have all the data). Once a node is selected and assigned to a contract, the user then needs to "wait" until the node status is fully up before he can contact the node to actually deploy his stuff.

Those changes combined (imho) are a major change to the grid (hence a new major version?)

scottyeager commented 2 years ago

Only nodes that support TPM will be able to be powered off.

What's the thinking behind this requirement @delandtj?

muhamadazmy commented 2 years ago

@delandtj @DylanVerstraete and @LeeSmet we really need to agree on the final approach to be able to create the related (technical) issues. Could you please read my previous comment, and comment if this (technical wise) is good?

DylanVerstraete commented 2 years ago

Looks good, yes. I only think the user experience will get worse with this power management feature. If the user wants to deploy on a farm that needs to boot a node in order to host his workload, then he will possibly have to wait for 5-10 minutes.

delandtj commented 2 years ago

So in a nutshell: what do we agree on here? I mean, we also need to set in stone what the implementation details will be.

delandtj commented 2 years ago

Looks good, yes. I only think the user experience will get worse with this power management feature. If the user wants to deploy on a farm that needs to boot a node in order to host his workload, then he will possibly have to wait for 5-10 minutes.

This can be messaged

muhamadazmy commented 2 years ago

Okay, I will try to write down a dump of all the changes that are required, based on our meeting regarding capacity planning with power management. Since nodes will be sleeping, a user cannot choose a node to deploy on; it's up to the grid to find the most suitable node, with the option of bringing nodes up if needed.

Creating a contract

Deployment

Once the contract node id is set, the user is ready to contact the node to deploy his contract as usual.

Notes

Changes related to capacity management

Changes needed to zos.

Changes needed to chain

Note, those changes are related to capacity planning only and not the entire power management story.

muhamadazmy commented 2 years ago

After a little discussion with @DylanVerstraete we agreed on the following: to improve event processing, we will also keep a map of contracts (per farm) that have been created but still need a node id. This means that if the event stream is interrupted, the node can still check whether the state of the map has changed. Contracts that get their node id are removed from the map.
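A small in-memory sketch of such a map, purely illustrative: on chain this would be a storage item, and the `PendingContracts` type, its method names, and the integer id types are assumptions.

```rust
use std::collections::{BTreeSet, HashMap};

/// Contracts per farm that were created but have no node id yet (hypothetical type).
#[derive(Default)]
pub struct PendingContracts {
    by_farm: HashMap<u32, BTreeSet<u64>>,
}

impl PendingContracts {
    /// Records a newly created contract that still needs a node id.
    pub fn contract_created(&mut self, farm_id: u32, contract_id: u64) {
        self.by_farm.entry(farm_id).or_default().insert(contract_id);
    }

    /// Removes a contract from the map once it has been given a node id.
    pub fn node_assigned(&mut self, farm_id: u32, contract_id: u64) {
        let now_empty = match self.by_farm.get_mut(&farm_id) {
            Some(set) => {
                set.remove(&contract_id);
                set.is_empty()
            }
            None => false,
        };
        // Drop the farm entry when nothing is pending for it any more.
        if now_empty {
            self.by_farm.remove(&farm_id);
        }
    }

    /// Lets a node catch up after an interrupted event stream.
    pub fn pending_for(&self, farm_id: u32) -> Vec<u64> {
        self.by_farm
            .get(&farm_id)
            .map(|set| set.iter().copied().collect())
            .unwrap_or_default()
    }
}
```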

AhmedHanafy725 commented 2 years ago

On node assignment: IMNSHO we will need to be able to deploy on different nodes for something like Kubernetes clusters (the cluster members shouldn't all be deployed on the same node).

muhamadazmy commented 2 years ago

@AhmedHanafy725 yes, you are right. @rkhamis brought this up during the meeting and I forgot to document it here in the issue. My suggestion is to create a special type of contract, which could be called ClusterContract. It is basically a set of contracts plus a policy. Once created, the capacity planning process will know (based on the policy) that those contracts cannot be deployed on the same node, so each sub-contract is assigned a different node.
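To make the idea concrete, a minimal sketch of a set of contracts plus a policy. Only the name ClusterContract comes from the suggestion above; the `Policy` variant, the field names, and the assignment function are hypothetical, and `candidate_nodes` is assumed to contain no duplicate ids.

```rust
/// Placement policy for a cluster contract (hypothetical variant name).
pub enum Policy {
    /// No two sub-contracts may land on the same node.
    DistinctNodes,
}

/// A set of contracts plus a policy.
pub struct ClusterContract {
    pub sub_contracts: Vec<u64>,
    pub policy: Policy,
}

/// Returns (contract_id, node_id) pairs, or None when there are too few nodes.
pub fn assign_nodes(
    cluster: &ClusterContract,
    candidate_nodes: &[u32],
) -> Option<Vec<(u64, u32)>> {
    match cluster.policy {
        Policy::DistinctNodes => {
            if candidate_nodes.len() < cluster.sub_contracts.len() {
                return None;
            }
            // Pair each sub-contract with its own candidate node.
            Some(
                cluster
                    .sub_contracts
                    .iter()
                    .copied()
                    .zip(candidate_nodes.iter().copied())
                    .collect(),
            )
        }
    }
}
```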

The process can go like this

muhamadazmy commented 1 year ago

We had discussions regarding real life use cases (k8s cluster, and separate network workloads): A contract object will have this new attribute

Use case:

On deploying k8s start by:

xmonader commented 1 year ago

The booting time, according to Jan, can be between 2 and 10 minutes, which is... bad. I guess that means the power manager will be the main node to provision resources on, and it needs to automatically boot other nodes when it reaches a specific threshold. But that's also quite cumbersome: e.g. someone wants a node with a GPU and there is no GPU on the power manager, meaning the user may end up waiting 2-10 minutes for the VM to boot.

Also, not all nodes are created equal: some could be specialized for CPU, RAM, storage, or GPU, so some sort of tagging notation might be needed to wake up the right node(s).
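A sketch of what such tagging could look like, assuming each sleeping node carries a set of free-form string tags. The `SleepingNode` type, the tag values, and the function name are all hypothetical.

```rust
use std::collections::BTreeSet;

/// A powered-off node with its capability tags (hypothetical type).
pub struct SleepingNode {
    pub id: u32,
    pub tags: BTreeSet<String>,
}

/// Returns the ids of the sleeping nodes that carry every required tag.
pub fn nodes_to_wake(nodes: &[SleepingNode], required: &[&str]) -> Vec<u32> {
    nodes
        .iter()
        .filter(|n| required.iter().all(|t| n.tags.contains(*t)))
        .map(|n| n.id)
        .collect()
}
```

With tags in place, a request for a GPU node only wakes nodes tagged accordingly instead of booting an arbitrary one.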

muhamadazmy commented 1 year ago

Iteration over the power management

We need to assume that a single farm can span multiple LANs. This is an iteration over this comment.

Node

each node object has

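// Desired power state of the node.
enum PowerTarget {
    Up,
    Down,
}

// Current power state of the node.
enum PowerState {
    Up,
    // Carries the leader id. The u32 type is an assumption;
    // the original only names `leader_id`.
    Down(u32),
}

struct Power {
    target: PowerTarget,
    state: PowerState,
}

struct Node {
    power: Power,
    // ... remaining node fields
}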

General case

Nodes can find out about all their direct neighbors by simply getting information about all nodes in the farm, then trying to reach them over the local zos IP. This relies on an HTTP service that is only available on the local zos interface. The service needs to return a signed response; this way we can guarantee that a node is exactly what it claims to be (to avoid situations where nodes on different segments have the same private IP). In the example above, N7 and N4 could have the same private IP.

This way each node can learn about its immediate neighbors that live on the same segment. For each segment at least one power manager is elected. The election is very simple:

Hence in the example above:

Then:

Now back to the example above. Let's assume this farm is completely free of workloads. The grid will decide that it can power off all nodes except the public node (N4). So let's say it sets the target state of all nodes to Down except N4 (the public node).

If you follow the logic above, we will end up with the following state:

Notes

brandonpille commented 1 year ago

I have a couple of questions:

muhamadazmy commented 1 year ago

@brandonpille

This does not change the contract reservation and billing cycle. It is solely related to the node power cycle. Not much changes in the grid except for the "target" and "current" state, and the function for the node to set its current state.

DylanVerstraete commented 1 year ago

@muhamadazmy I think Brandon asks if the billing should trigger even if the node is still down (if it for some reason could not be brought up)

brandonpille commented 1 year ago
muhamadazmy commented 1 year ago

Billing is related to capacity reservation, which should not exist unless a node's target power is up. If a node's "current" power never gets to the "up" state, it means something is wrong, and billing probably needs to stop, maybe.

muhamadazmy commented 1 year ago

@brandonpille I think yes, the grid should accept creation of a capacity reservation as long as the node target state is up. Normally the current state should follow in a few minutes (once the node has actually booted). Maybe billing should not be done during this time?

DylanVerstraete commented 1 year ago

@muhamadazmy how are these segments defined?

brandonpille commented 1 year ago

Billing is related to capacity reservation, which should not exist unless a node's target power is up. If a node's "current" power never gets to the "up" state, it means something is wrong, and billing probably needs to stop, maybe.

So we only start billing if the power is set to UP?

DylanVerstraete commented 1 year ago

Billing will only trigger 1 hour after creation, so it doesn't matter. If the node is still down by then, something is wrong.
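As a sketch of that reasoning: the one-hour delay comes from the comment above, while the enum, the constant, and the function name are hypothetical, and what to do when the node failed to boot is left open, as it is in this thread.

```rust
use std::time::Duration;

/// Current power state reported by the node.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PowerState {
    Up,
    Down,
}

/// Billing first triggers one hour after contract creation.
pub const BILLING_DELAY: Duration = Duration::from_secs(60 * 60);

/// Outcome of a billing check (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum BillingDecision {
    /// Less than an hour since creation; nothing to bill yet.
    NotYet,
    /// The node is up, so billing proceeds as usual.
    Bill,
    /// The node is still down after the delay: something is wrong.
    NodeFailedToBoot,
}

/// Decides what to do for a contract of the given age on a node in the given state.
pub fn billing_decision(contract_age: Duration, state: PowerState) -> BillingDecision {
    if contract_age < BILLING_DELAY {
        return BillingDecision::NotYet;
    }
    match state {
        PowerState::Up => BillingDecision::Bill,
        PowerState::Down => BillingDecision::NodeFailedToBoot,
    }
}
```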

brandonpille commented 1 year ago

booted

It would only be fair in my opinion to the user to only start billing when the node is actually UP. One more question. What do we do if the trigger UP event never comes? Do we add a timeout on it?

brandonpille commented 1 year ago

@brandonpille I think yes, the grid should accept creation of a capacity reservation as long as the node target state is up. Normally the current state should follow in a few minutes (once the node has actually booted). Maybe billing should not be done during this time?

I was talking about the deployment contract. Do we accept it whenever the capacity reservation is created, no matter the state of the node or when the node got UP?

muhamadazmy commented 1 year ago

I think yes, until there is a good reason not to.

despiegk commented 1 year ago

Only nodes that support TPM will be able to be powered off.

What's the thinking behind this requirement @delandtj?

This was a mistake, TPM has nothing to do with WOL I believe

Nelson361 commented 1 year ago

Those of us with a home datacenter will not be able to tolerate servers randomly starting up throughout the night. Nothing is louder than a server during startup. Please do checkups ONLY during daylight hours. Obviously startups can occur for deployments at any time; that is OK.

xmonader commented 1 year ago

Code on chain should be finished by 23-11; we need a couple more days on zos to integrate it, and will start the client updates as soon as possible.

xmonader commented 1 year ago

Falling behind: requires more reworking, see https://github.com/threefoldtech/tfchain/issues/536

deadline will be updated after the engineering call today

scottyeager commented 1 year ago

Do we have an updated timeline, @xmonader?

xmonader commented 1 year ago

Do we have an updated timeline, @xmonader?

We are aiming for the chain deployment on devnet to happen next Tuesday. Most of the clients are almost code complete, but they need to be tested against real environments.

despiegk commented 1 year ago

Close all linked issues; we need a new power management story.