threefoldtecharchive / jumpscale9_lib

Apache License 2.0

grid capacity: create algorithm to compute percentage of uptime over a month #40

Open rkhamis opened 5 years ago

rkhamis commented 5 years ago

Issue migrated from https://api.github.com/repos/Jumpscale/lib9/issues/298, opened by @zaibon

The capacity database should be able to give the percentage of uptime of a node over a month.

Proposal:

then we can compute the percentage of uptime like:

expected_uptime = t[x] - t[x - 1]  # number of seconds between two uptime updates
if actual_uptime < expected_uptime:
    uptime += expected_uptime - actual_uptime
elif actual_uptime > expected_uptime:
    uptime += expected_uptime  # strange case; not sure this can happen, unless the node tries to fake its uptime
else:
    uptime += actual_uptime

# then, at the end of the month
percentage = uptime / number_of_seconds_in_this_month
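
For illustration, here is a minimal runnable sketch of the monthly computation. It rests on assumptions that are not stated in the thread: each report is a hypothetical `(timestamp, seconds_since_boot)` pair, and each interval is credited with `min(seconds_since_boot, expected_uptime)`. That is only one reading of the proposal (the pseudocode above adds the difference when `actual_uptime < expected_uptime`), so treat the crediting rule and the function names as placeholders.

```python
from calendar import monthrange


def seconds_in_month(year, month):
    """Number of seconds in the given calendar month."""
    days = monthrange(year, month)[1]  # monthrange returns (first weekday, day count)
    return days * 24 * 3600


def uptime_percentage(reports, year, month):
    """Estimate the uptime percentage from periodic reports.

    `reports` is a list of (timestamp, seconds_since_boot) tuples sorted by
    timestamp. Both fields are assumptions made for this sketch.
    """
    uptime = 0
    for (t_prev, _), (t_cur, since_boot) in zip(reports, reports[1:]):
        expected_uptime = t_cur - t_prev
        # Assumed rule: never credit more than the interval itself, and if the
        # node rebooted in between, credit only the time since that reboot.
        uptime += max(0, min(since_boot, expected_uptime))
    return 100.0 * uptime / seconds_in_month(year, month)
```

Capping at `expected_uptime` also covers the "strange case" from the pseudocode, since a node cannot earn more than the elapsed time by inflating its counter.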
rkhamis commented 5 years ago

commented by @zaibon This actually won't work, because it only checks whether the node is up, not whether it is actually reachable over the network. So we need an external monitoring tool that checks that the node is both up and reachable.

rkhamis commented 5 years ago

commented by @muhamadazmy @zaibon I think what we need is more of a heartbeat. I'm not sure, though, how this should be done for millions of machines without a centralized monitor.

rkhamis commented 5 years ago

commented by @muhamadazmy

Suggestion 1

This suggestion defines new terms that are not really part of the system at the moment, as you can see, and which we can discuss further. These include the threefold backbone (or infrastructure nodes) and the local monitoring node(s) for a region.

rkhamis commented 5 years ago

commented by @zaibon I like this idea. I don't think we need to do the aggregation per location just yet. The grid is small enough for all the nodes to send their own heartbeat to the "monitor node".

@muhamadazmy can you elaborate on how you would design the heartbeat and the monitoring node, please?

rkhamis commented 5 years ago

commented by @muhamadazmy If we don't need aggregation per location just yet, we can actually use the same capacity registration endpoint we have now: https://capacity.threefoldtoken.com

My idea is as follows:

@zaibon and I did not agree on whether we should use a push or a pull mechanism to collect the capacity heartbeat, though.

A more serious problem with either approach is trusting the reported capacity. Since the protocol is open, anyone can start registering fake capacity. We can of course leave this issue for later, but it may be a good idea to at least figure out a plan so that it is considered before designing the capacity reporting/heartbeats.
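
To make the heartbeat idea concrete, here is a rough sketch of the bookkeeping a monitor could do once heartbeats arrive, whether they are pushed or pulled. Everything in it is hypothetical: the `HeartbeatLog` class, the 600-second interval, and the 60-second tolerance are illustrative choices, not part of the existing capacity endpoint. It does nothing about the trust problem mentioned above.

```python
from collections import defaultdict


class HeartbeatLog:
    """Hypothetical monitor-side store of heartbeat timestamps per node."""

    def __init__(self, interval=600, tolerance=60):
        self.interval = interval    # assumed seconds between heartbeats
        self.tolerance = tolerance  # assumed slack before a gap counts as downtime
        self._beats = defaultdict(list)

    def record(self, node_id, timestamp):
        """Store one heartbeat received from (or fetched for) a node."""
        self._beats[node_id].append(timestamp)

    def reachable_seconds(self, node_id):
        """Sum the gaps between consecutive heartbeats that arrived on time."""
        beats = sorted(self._beats.get(node_id, []))
        total = 0
        for prev, cur in zip(beats, beats[1:]):
            gap = cur - prev
            # A gap longer than interval + tolerance is treated as unreachable time.
            if gap <= self.interval + self.tolerance:
                total += gap
        return total
```

Dividing `reachable_seconds` by the number of seconds in the month would give a reachability figure comparable to the uptime percentage proposed at the top of the issue.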

rkhamis commented 5 years ago

commented by @zaibon Let's stick with the push approach; we already have all the infrastructure in place to do this. OK, so if we go down that path, here are the tasks that I can already see we'll have to do:

rkhamis commented 5 years ago

commented by @andhartl Just look at my farm Maisaval 2: it has the correct geo-IP location on the node view but no location on the farmer view. The same location should appear on the farmer view. And if it is the wrong one, you can correct it manually on the farmer view.

On 21. Jul 2018, at 06:12, Christophe de Carvalho notifications@github.com wrote:

Well, if it's not shown on the farmer page, that means that the farm has no location set. The nodes are automatically located using geo-IP, but the farmer can overwrite that location if needed.

despiegk commented 5 years ago

I think we are making this too complex. For now, measuring locally is OK: just the uptime measured while the node is up, independent of its connection to the internet. We can fix that later.