FOGProject / fogproject

An open source computer cloning & management system
https://fogproject.org
GNU General Public License v3.0

[dev branch] multicast job end does not clean Active Multicast Tasks entry #495

Closed. brnnrc closed this issue 1 year ago.

brnnrc commented 1 year ago

Version 1.5.9.174 (1 server and 4 master storage nodes)

When a multicast job ends, the entry on the Active Multi-cast Tasks web page doesn't go away (it sticks to the "In Progress" status). The job itself seems to complete correctly, and the multicast.log.udpcast.xx log is deleted.

I can remove the task manually, though.

No other side effects noticed. Thanks for your efforts, Enrico

Sebastian-Roth commented 1 year ago

@brnnrc Thanks for using dev-branch and reporting this issue. I could imagine this being an issue with PHP 8.

Please send in some more details. Which Linux OS and version do you use? Also check the Apache and PHP-FPM logs on the server to see if there is an obvious error message pointing us in the right direction.

brnnrc commented 1 year ago

Hi, thanks for your reply. Linux is Debian 11, PHP 7.4.30. No relevant info in the Apache or PHP-FPM logs. I suppose the storage node is not sending the "end of job" update to the DB (?). Let me know how to dig further.
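For anyone digging into the same thing, a minimal sketch of where to look (assuming Debian 11's default Apache and PHP 7.4 FPM log locations; adjust the paths if your install differs):

    # Apache error log (default Debian 11 path)
    sudo tail -n 100 /var/log/apache2/error.log

    # PHP-FPM 7.4 log (default Debian package path)
    sudo tail -n 100 /var/log/php7.4-fpm.log

    # or follow the PHP-FPM service journal instead
    sudo journalctl -u php7.4-fpm -b --no-pager | tail -n 100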

Sebastian-Roth commented 1 year ago

@brnnrc said:

1 server and 4 master storage node

Now that I read this again, I am not sure what you mean by that. Can you please tell us more about your exact setup? Nodes are usually either a normal node (hosting the DB) or a storage node (connecting to the DB on a normal node). Though you could also create storage nodes and make those the master within their own storage group. Probably best if you can post a picture of the storage nodes and storage groups view, as well as information on whether the storage nodes are in different subnets.

Do you use the location plugin as well?

Sebastian-Roth commented 1 year ago

@brnnrc Another question arises. Do you follow the multicast task on the client screens? Right at the end it should print a line "Updating Database ....". Does this succeed or fail?

brnnrc commented 1 year ago

Sorry for the misunderstanding, my setup is pretty simple: 1 normal node and 4 storage nodes (each is the master of its own storage group, and each is alone in its group). All storage nodes are in different subnets, and they can reach the normal node (and the DB too). The same structure works (the multicast task vanishes when completed) on FOG Project 1.5.4... I don't use the location plugin.

To reply to your last question... good question! I don't know, I'm going to try. The individual lines (one for each PC to deploy) in the "Active Tasks" web page list do vanish correctly when the job ends, but the entry in the "Active Multicast Tasks" web page list does not disappear (see screenshot Selection_024).
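For reference, one hedged way to see what that page is reading is to query the session rows directly in the database on the normal node. The table name and the default database name fog below are assumptions about the schema and should be verified first:

    # discover the actual multicast tables before relying on any name (schema is an assumption)
    mysql -u root -p fog -e "SHOW TABLES LIKE '%multicast%';"
    # then dump the session rows to see which state the stuck entry is left in
    mysql -u root -p fog -e "SELECT * FROM multicastSessions \G"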

Sebastian-Roth commented 1 year ago

@brnnrc I can't seem to replicate the issue so far. With a setup as described, when the test clients finish, both the active tasks and the multicast task disappear. Though this is a very small test setup using two VMs. Not sure if there is a timing/race condition causing this in your case.

Please run journalctl -u FOGMulticastManager.service -b on the storage node in question and post the full output here.
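In case it is easier to attach as a file, the same output can be captured like this (plain journalctl usage, nothing FOG-specific assumed):

    journalctl -u FOGMulticastManager.service -b --no-pager > multicastmanager.log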

brnnrc commented 1 year ago

Hi, after your test on VMs (which I should have done myself, sorry), I tried with another group and it works. So I'm going to check the "failing" group. I'll report back next week. Thanks for your time.

Enrico


brnnrc commented 1 year ago

Big shame on me! I forgot to upgrade the "failing" storage node's FOG version from 1.5.9.154 to 1.5.9.174. The normal node and the 2 working storage nodes are on 1.5.9.174... I upgraded the missing node... problem solved. My apologies! Enrico
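For anyone hitting the same version mismatch, a rough sketch of the usual dev-branch update on a storage node (the checkout path is a placeholder; use wherever the fogproject repo was originally cloned):

    cd /path/to/fogproject   # placeholder for the original clone location
    git checkout dev-branch
    git pull
    cd bin
    sudo ./installfog.sh     # re-run the installer to update the node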

brnnrc commented 1 year ago

closed

Sebastian-Roth commented 1 year ago

@brnnrc Thanks for clearing up what was causing the issue. Though I am still wondering what change between 1.5.9.154 and 1.5.9.174 could cause this behavior. Anyway, great that this is solved.
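If anyone wants to narrow that down, one hedged way is to list the dev-branch commits that touched multicast-related files (the pathspec below is a guess at the relevant file names; the exact commit range corresponding to 1.5.9.154..1.5.9.174 still has to be located, e.g. by searching the history of the version string):

    cd /path/to/fogproject              # placeholder for a checkout of the fogproject repo
    git checkout dev-branch
    git log --oneline -- '*ulticast*'   # matches both multicast and Multicast file names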