Open jairbubbles opened 7 years ago
After discussing with a colleague, here is @ffulin's answer:
The workers request 1 additional job in order to parallelize network traffic vs compilation. They only actually start the correct number of jobs at the same time. Without this, one CPU core would always be idle for some portion of the time. On the FASTBuild side we unfortunately don't have enough information to differentiate this extra job from the "normal" ones, and so when we write the monitor log, we just have to write the same info. So everything is actually working correctly under the hood, but the monitor is "wrong".
It will likely require changes to the network protocol to have enough information to differentiate this "extra" job in the monitor UI.
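To make the explanation above concrete, here is a minimal sketch of the counting mismatch. It is not FASTBuild code: the `Worker` class, its fields, and the fixed prefetch of 1 are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical model of a remote worker (not FASTBuild's real types)."""
    cores: int         # CPU cores the worker offers
    prefetch: int = 1  # extra jobs requested to overlap transfer with compilation

    def requested_jobs(self) -> int:
        # What the client (and therefore the monitor log) sees:
        # every request looks like a normal job.
        return self.cores + self.prefetch

    def running_jobs(self) -> int:
        # What actually compiles at the same time: never more than the core count.
        return self.cores


worker = Worker(cores=6)
print(worker.requested_jobs(), worker.running_jobs())
```

Under these assumptions, a worker sharing 6 cores shows up as 7 rows in the monitor even though only 6 jobs compile at once.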
Yup! That's right :) Something to fix in the future. BTW you seem to have some interesting timeouts...
Now I know what orange means :-)
If you hover the cursor on top of it, it should say "Timeout" instead of Success or Fail.
I do still think the way to fix this is a protocol change. The worker could request an "extra" job explicitly (instead of just a normal nth job), and would also need to notify the client when it actually starts the job. Probably not that tricky to do.
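A rough sketch of what that could look like on the client side, purely as an illustration. The message names (`REQUEST_JOB`, `REQUEST_EXTRA_JOB`, `JOB_STARTED`) and the `ClientView` class are hypothetical and do not exist in the FASTBuild protocol.

```python
from enum import Enum, auto


class Msg(Enum):
    """Hypothetical message types; not part of the real protocol."""
    REQUEST_JOB = auto()        # normal request for a job to run now
    REQUEST_EXTRA_JOB = auto()  # explicit request for a job to prefetch
    JOB_STARTED = auto()        # worker tells the client a prefetched job began


class ClientView:
    """Client-side bookkeeping for the jobs handed to one worker."""

    def __init__(self) -> None:
        self.state: dict[int, str] = {}  # job_id -> "running" | "prefetched"

    def on_assign(self, job_id: int, request: Msg) -> None:
        # A job handed out for an explicit extra request is not running yet.
        extra = request is Msg.REQUEST_EXTRA_JOB
        self.state[job_id] = "prefetched" if extra else "running"

    def on_job_started(self, job_id: int) -> None:
        # Promote a prefetched job once the worker reports it has started.
        if self.state.get(job_id) == "prefetched":
            self.state[job_id] = "running"

    def on_finish(self, job_id: int) -> None:
        self.state.pop(job_id, None)

    def running_count(self) -> int:
        # Only these jobs would be drawn as active in the monitor.
        return sum(1 for s in self.state.values() if s == "running")


view = ClientView()
view.on_assign(1, Msg.REQUEST_JOB)
view.on_assign(2, Msg.REQUEST_JOB)
view.on_assign(3, Msg.REQUEST_EXTRA_JOB)  # 2-core worker prefetches a third job
print(view.running_count())
```

With the extra job tagged explicitly, the monitor could count only the jobs in the "running" state, and the time between assignment and `JOB_STARTED` would give a handle on transfer time versus compile time.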
The current dev branch has several fixes that might improve your network timeouts (due to be released in v0.94)
That's what I was thinking too. I deliberately wanted to minimize the changes to FASTBuild to get the first version in :).
If you agree, I'll take a look into it and do a second pass with the protocol changes you mention. This would potentially allow adding more useful info to the viewer, like transfer time in/out vs. real CPU time, etc.
My PC is working for another one and sharing 6 cores:
But on the other PC, the FASTBuild monitor displays 7 cores:
But the total is 18 as expected (12+6).
The weird thing is that the monitor displays 7 cores correctly (each one with its own jobs).