FOGProject / fogproject

An open source computer cloning & management system
https://fogproject.org
GNU General Public License v3.0
1.09k stars 221 forks source link

Add the ability to set the max-wait globally for all partitions #521

Closed magelfik closed 1 year ago

magelfik commented 1 year ago

Hi there,

We've had problems with multicast since we deployed fog. I believe I finally found the problem :

https://github.com/Magelfik/fogproject/edit/master/packages/web/lib/service/multicasttask.class.php#L659

The timeout is currently only applied for the first partition and then it defaults to 10 secondes. We have images where 10 seconds is not sufficient (probably because of size), and syncing takes more than 10 secondes. This results in fogged hosts being dropped mid-multicast.

My MR suggests a correction https://github.com/FOGProject/fogproject/pull/520

Please let me know if this seems like a sound solution. My school lab would be very grateful to see it patched into the latest fog release.

Sebastian-Roth commented 1 year ago

@Magelfik I remember times when FOS used to wait the same amount of time at the start and in between each partition. Back then I was fed up with waiting (in cases where one of the machines didn't make it at all) and asked the developers to change it to wait 10 seconds between partitions. Can't find the forum topic anymore.

We should definitely change this value if it's causing problems.

Sebastian-Roth commented 1 year ago

@George1422 @wayneworkman I'd appreciate a comment from your side on this as well. Usually the wait time between partitions does not play a role if all machines catch up with the multicast session. But if one drops out the wait time between partitions will be way longer (10 minutes by default but can be configured in FOG Settings in the web UI).

@Magelfik What about making it $maxwait * 6 to make it default to one minute in between partitions? I know it's kind of a crude logic.

magelfik commented 1 year ago

@George1422 @wayneworkman I'd appreciate a comment from your side on this as well. Usually the wait time between partitions does not play a role if all machines catch up with the multicast session. But if one drops out the wait time between partitions will be way longer (10 minutes by default but can be configured in FOG Settings in the web UI).

What I’ve seen from the source code is that it’s 10 seconds, not minutes. Fog accepts minutes, and then converts them into seconds by a simple 60. However the time between partitions is hardcoded to 10, not minutes but seconds (the 60 is not applied). The default 10minutes is only applied for the first sync, which is before the first partition (when all fogged pcs are booting). Perhaps I misunderstood ?

@Magelfik What about making it $maxwait * 6 to make it default to one minute in between partitions? I know it's kind of a crude logic.

I am up for anything. Right now we’ve removed the ternary operator and multicast works fine. But I understand waiting for all computers to sync up between each partition can be tedious. 60 seconds should be fine for most cases, but perhaps it’s worth removing the currently hard-coded value and adding a config option somewhere for the future. Anyway, I’m more than happy to see it go from 10 to 60 :)

Sebastian-Roth commented 1 year ago

@Magelfik said:

... Perhaps I misunderstood ?

I did a double post. Please re-read my answers and I am sure you understand why I said something about 10 minutes to be the default even between partitions (that was in the past...).

magelfik commented 1 year ago

Ah yes, I see. Sorry about that.

Sebastian-Roth commented 1 year ago

@Magelfik Sorry for the delay. I would really like to get this merged in. Are you fine with using $maxwait * 60 wait time for the first partition and $maxwait * 6 for the remaining partitons? If yes, then please update the pull request and I'll merge it in. As well you might increase the last part of the version number in system.class.php by one.

magelfik commented 1 year ago

No problem.We had trouble with other values than my PR, so I would suggest keeping my changes as they are.I will update the version as well.Le 12 déc. 2022 à 20:40, Sebastian-Roth @.**> a écrit : @Magelfik Sorry for the delay. I would really like to get this merged in. Are you fine with using $maxwait 60 wait time for the first partition and $maxwait * 6 for the remaining partitons? If yes, then please update the pull request and I'll merge it in. As well you might increase the last part of the version number in system.class.php by one.

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: @.***>

Sebastian-Roth commented 1 year ago

@Magelfik I am not sure it's wise to use $maxwait * 60 between partitions as well. If just one client can't join the multicast session the process will be delayed by $maxwait * 60 * ($partition_count-1) (10 minutes per partition by default).

magelfik commented 1 year ago

That sounds a bit overkill. Okay, well, one minute seems too short on a bad day. Perhaps two (or three at most) ? ––––––––––––––––––––– Théo PIRKL For confidential matters, please use the following GPG key : Théo Pirkl < @.***> (157DFF10B677DFC0)

Le lun. 12 déc. 2022 à 21:50, Sebastian-Roth @.***> a écrit :

@Magelfik https://github.com/Magelfik I am not sure it's wise to use $maxwait

  • 60 between partitions as well. If just one client can't join the multicast session the process will be delayed by $maxwait 60 ($partition_count-1) (10 minutes per partition by default).

— Reply to this email directly, view it on GitHub https://github.com/FOGProject/fogproject/issues/521#issuecomment-1347306083, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACYZGW4QFCONOOVNVGKYAVTWM6F3HANCNFSM6AAAAAASOK3QOE . You are receiving this because you were mentioned.Message ID: @.***>

Sebastian-Roth commented 1 year ago

@Magelfik Thanks for your reply. Are you keen to update the PR #520 or should we rather commit a change?

Okay, well, one minute seems too short on a bad day.

Well, in that case you can still increase the wait time through FOG Settings (Multicast Settings section -> UDPCAST MAXWAIT = $maxwait).

I vote for using $maxwait * 6 or $maxwait * 10 at most in between partitions.

magelfik commented 1 year ago

I should be able to update #520 this week.

$maxwait * 6 sounds more than fair.

If this doesn't work, we'll know where this is coming from anyway. ––––––––––––––––––––– Théo PIRKL For confidential matters, please use the following GPG key : Théo Pirkl < @.***> (157DFF10B677DFC0)

Le mer. 14 déc. 2022 à 10:03, Sebastian-Roth @.***> a écrit :

@Magelfik https://github.com/Magelfik Thanks for your reply. Are you keen to update the PR #520 https://github.com/FOGProject/fogproject/pull/520 or should we rather commit a change?

Okay, well, one minute seems too short on a bad day.

Well, in that case you can still increase the wait time through FOG Settings (Multicast Settings section -> UDPCAST MAXWAIT = $maxwait).

I vote for using $maxwait 6 or $maxwait 10 at most in between partitions.

— Reply to this email directly, view it on GitHub https://github.com/FOGProject/fogproject/issues/521#issuecomment-1350675826, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACYZGW7IIVAG2OULZP6HIYLWNGEPTANCNFSM6AAAAAASOK3QOE . You are receiving this because you were mentioned.Message ID: @.***>

magelfik commented 1 year ago

Welp, life got in the way.

I'll try to find some time tomorrow to push those changes. Sorry :( ––––––––––––––––––––– Théo PIRKL For confidential matters, please use the following GPG key : Théo Pirkl < @.***> (157DFF10B677DFC0)

Le mer. 14 déc. 2022 à 12:08, Théo Pirkl @.***> a écrit :

I should be able to update #520 this week.

$maxwait * 6 sounds more than fair.

If this doesn't work, we'll know where this is coming from anyway. ––––––––––––––––––––– Théo PIRKL For confidential matters, please use the following GPG key : Théo Pirkl < @.***> (157DFF10B677DFC0)

Le mer. 14 déc. 2022 à 10:03, Sebastian-Roth @.***> a écrit :

@Magelfik https://github.com/Magelfik Thanks for your reply. Are you keen to update the PR #520 https://github.com/FOGProject/fogproject/pull/520 or should we rather commit a change?

Okay, well, one minute seems too short on a bad day.

Well, in that case you can still increase the wait time through FOG Settings (Multicast Settings section -> UDPCAST MAXWAIT = $maxwait).

I vote for using $maxwait 6 or $maxwait 10 at most in between partitions.

— Reply to this email directly, view it on GitHub https://github.com/FOGProject/fogproject/issues/521#issuecomment-1350675826, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACYZGW7IIVAG2OULZP6HIYLWNGEPTANCNFSM6AAAAAASOK3QOE . You are receiving this because you were mentioned.Message ID: @.***>

magelfik commented 1 year ago

Hi @Sebastian-Roth, I have pushed the changes as discussed.

Sebastian-Roth commented 1 year ago

PR #521 merged.