liamkf / Unreal_FASTBuild

Allows UnrealEngine to be built with FASTBuild for VS2015/VS2017 and Windows 10.
MIT License

Unable to find fbuild process #6

Closed BrodyHiggerson closed 6 years ago

BrodyHiggerson commented 8 years ago

Hi there! I realize this is an issue with me, not this repo, but I wasn't sure how else to ask about this.

I've gone through ClxS' changes from https://github.com/ClxS/FASTBuild-UE4 (apart from those to FASTBuild itself, for now) and have used your FASTBuild.cs, but I'm getting the following error when I try to build a project compiling against/with the engine:

Exception launching fbuild process. Is it in your path?System.ComponentModel.Win32Exception (0x80004005): The system cannot find the file specified
3>     at System.Diagnostics.Process.StartWithCreateProcess(ProcessStartInfo startInfo)
3>     at UnrealBuildTool.FASTBuild.ExecuteBffFile(String BffFilePath) in L:\_Programming\Repositories\UnrealEngine\Engine\Source\Programs\UnrealBuildTool\System\FASTBuild.cs:line 823

I made sure to put FBuild.exe in the Engine/Intermediate/Build/ folder, which I'm sure it's checking (this is also where the fbuild.bff file ends up). Is there anything I should try/check?

Thanks for your work on this!

liamkf commented 8 years ago

Hi Brody!

You're almost there! :)

So there are two options. The easier one is to change the line new ProcessStartInfo("fbuild", FBCommandLine); to something like new ProcessStartInfo(@"C:\downloads\fbuild\fbuild.exe", FBCommandLine); with the path pointing to wherever your fbuild.exe is. You could also construct the string as a relative path there if you want to store fbuild.exe somewhere in the Unreal folder structure.

Alternatively, you can add the location (C:\downloads\fbuild) to your PATH environment variable so it can be found.
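
The two options above amount to "look on PATH, otherwise use a known folder". As a rough, language-neutral illustration (this is not code from FASTBuild.cs; the function name and the fallback-folder argument are made up for the example), the lookup could be sketched in Python like this:

```python
import shutil


def resolve_executable(name, extra_dirs=()):
    """Return a full path to `name`, or None if it cannot be found.

    PATH is checked first (the "add it to PATH" option); the folders in
    `extra_dirs` are hypothetical fallbacks (the "hard-code a path" option).
    """
    found = shutil.which(name)
    if found:
        return found
    for directory in extra_dirs:
        # shutil.which accepts an explicit search path instead of PATH.
        candidate = shutil.which(name, path=directory)
        if candidate:
            return candidate
    return None


if __name__ == "__main__":
    # The folder below is the example location from the comment above.
    print(resolve_executable("fbuild", [r"C:\downloads\fbuild"]))
```

If this prints None, the process launch would fail in the same way as the Win32Exception in the original report.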

Hope this helps! :)

BrodyHiggerson commented 8 years ago

Hi Liam,

Thanks for getting back to me so quickly! I ended up going with FBProcess.StartInfo.FileName = BuildConfiguration.BaseIntermediatePath + FBProcess.StartInfo.FileName; before the process starts.

I am currently having trouble discovering any workers, however. Do I just need to run FBuildWorker on machines on the network? I'm getting "No workers available - Distributed compilation disabled" even with FBuildWorker running on the local machine itself. I realize that's more a FASTBuild problem, not yours, but thought I'd ask!

liamkf commented 8 years ago

No problem! :)

For the distribution, I think you're just missing the FASTBUILD_BROKERAGE_PATH environment variable, which needs to be set up on the builder and the workers. It should be a shared network folder everyone can see. It's under the worker discovery section here: http://www.fastbuild.org/docs/features/distribution.html

BrodyHiggerson commented 8 years ago

Ahh I just found that! It might just be the late time over here, but that's not clicking for me just yet - what form does this take? "Workers signal their availability by writing a token to this location. " is a bit vague for me (although I probably am just missing the meaning of 'token' in this context).

Any advice on this last step would be appreciated.

liamkf commented 8 years ago

Ah you don't need to worry about that! It's just a normal Windows environment variable pointing to a network path. The workers and fbuild handle the rest. We used a .bat file to launch all of our workers:

SET FASTBUILD_BROKERAGE_PATH=\\Desktop-Beast\FastBuildShared
CALL "FBuildWorker.exe"

What happens is the FBuildWorkers create files under the \\Desktop-Beast\FastBuildShared path that FBuild looks for.
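
Since both the builder and the workers depend on that variable, a quick sanity check on each machine can save some guessing. The following is a minimal Python sketch, assuming only what is said above (the variable name and that workers create files somewhere under the folder); the function name is hypothetical and it makes no assumption about what the files are called:

```python
import os


def check_brokerage(env=os.environ):
    """Report whether FASTBUILD_BROKERAGE_PATH is set and reachable."""
    path = env.get("FASTBUILD_BROKERAGE_PATH")
    if not path:
        return "FASTBUILD_BROKERAGE_PATH is not set"
    if not os.path.isdir(path):
        return f"{path} is not reachable as a folder"
    # Count every file below the share; workers are expected to add some.
    count = sum(len(names) for _, _, names in os.walk(path))
    return f"{path} is reachable; {count} file(s) found under it"


if __name__ == "__main__":
    print(check_brokerage())
```

Running it on the builder and on a worker should report the same folder; a "not set" result on either side would explain a "No workers available" message.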

BrodyHiggerson commented 8 years ago

Hi Liam,

Thanks for that!

Do you mean to say that this folder to which all build machines point should be a shared network drive/folder? I.e. \\LocalIPOrNameOfSharedPC\Some\Folder?

EDIT: I managed to get the slave machine to write into the shared folder of the primary machine. I just can't get that primary machine to look at that folder. I'm having some other PC issues that I need to fix first, I think. I did find that my slave machine only started writing the token once I manually added the environment variable to it, rather than setting it via the .bat file. Ahh well!

I'll buzz you if I can't get it working on the main machine :) Thanks again!

BrodyHiggerson commented 8 years ago

Hi Liam,

Thanks for all of your help! I've managed to get my primary machine and secondary machine recognized by Fastbuild, but I'm having an issue on the remote machine - it shows "Synchronizing Compiler" for a little while, then I see some errors in the output log. Build fails.

BFF file 'L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\fbuild.bff' has changed (reparsing will occur).
3>  Distributed Compilation : 1 Workers in pool
3>  4> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderCompileWorker\Module.ShaderCompileWorker.cpp.obj
3>  3> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderFormatOpenGL\Module.ShaderFormatOpenGL.cpp.obj
3>  5> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Projects\Module.Projects.cpp.obj
3>  6> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderFormatD3D\Module.ShaderFormatD3D.cpp.obj
3>  1> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\VulkanShaderFormat\Module.VulkanShaderFormat.cpp.obj
3>  8> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\MetalShaderFormat\Module.MetalShaderFormat.cpp.obj
3>  7> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\SandboxFile\Module.SandboxFile.cpp.obj
3>  2> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\TargetPlatform\Module.TargetPlatform.cpp.obj
3>  5> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\DesktopPlatform\Module.DesktopPlatform.cpp.obj
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Projects\Module.Projects.cpp.obj <REMOTE: 192.168.0.4>
3>  7> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ImageWrapper\Module.ImageWrapper.cpp.obj
3>  2> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ImageCore\Module.ImageCore.cpp.obj
3>  6> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Json\Module.Json.cpp.obj
3>  3> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderPreprocessor\Module.ShaderPreprocessor.cpp.obj
3>  8> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderCompilerCommon\Module.ShaderCompilerCommon.cpp.obj
3>  1> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderCore\Module.ShaderCore.cpp.obj
3>  4> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\RenderCore\Module.RenderCore.cpp.obj
3>  6> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\RHI\Module.RHI.cpp.obj
3>  7> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.7_of_8.cpp.obj
3>  2> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.8_of_8.cpp.obj
3>  8> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.6_of_8.cpp.obj
3>  3> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.5_of_8.cpp.obj
3>  5> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.4_of_8.cpp.obj
3>  1> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.3_of_8.cpp.obj
3>  4> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.2_of_8.cpp.obj
3>  6> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.1_of_8.cpp.obj
3>  8> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\SandboxFile\Module.SandboxFile.cpp.obj <LOCAL>
3>  5> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\TargetPlatform\Module.TargetPlatform.cpp.obj <LOCAL>
3>  3> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderFormatD3D\Module.ShaderFormatD3D.cpp.obj <LOCAL>
3>  2> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderFormatOpenGL\Module.ShaderFormatOpenGL.cpp.obj <LOCAL>
3>  1> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\MetalShaderFormat\Module.MetalShaderFormat.cpp.obj <LOCAL>
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\VulkanShaderFormat\Module.VulkanShaderFormat.cpp.obj <REMOTE: 192.168.0.4>
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderCompileWorker\Module.ShaderCompileWorker.cpp.obj <REMOTE: 192.168.0.4>
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Json\Module.Json.cpp.obj <REMOTE: 192.168.0.4>
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ImageCore\Module.ImageCore.cpp.obj <REMOTE: 192.168.0.4>
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ImageWrapper\Module.ImageWrapper.cpp.obj <REMOTE: 192.168.0.4>
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderCompilerCommon\Module.ShaderCompilerCommon.cpp.obj <REMOTE: 192.168.0.4>
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderPreprocessor\Module.ShaderPreprocessor.cpp.obj <REMOTE: 192.168.0.4>
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\DesktopPlatform\Module.DesktopPlatform.cpp.obj <REMOTE: 192.168.0.4>
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderCore\Module.ShaderCore.cpp.obj <REMOTE: 192.168.0.4>
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\RenderCore\Module.RenderCore.cpp.obj <REMOTE: 192.168.0.4>
3>  -> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\RHI\Module.RHI.cpp.obj <REMOTE: 192.168.0.4>
3>  7> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.6_of_8.cpp.obj <LOCAL>
3>  4> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.4_of_8.cpp.obj <LOCAL>
3>  6> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.5_of_8.cpp.obj <LOCAL>
3>  8> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.8_of_8.cpp.obj <LOCAL>
3>  5> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.3_of_8.cpp.obj <LOCAL>
3>  3> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Projects\Module.Projects.cpp.obj <LOCAL>
3>  8> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.7_of_8.cpp.obj <LOCAL>
3>  2> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.2_of_8.cpp.obj <LOCAL>
3>  1> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.1_of_8.cpp.obj <LOCAL>
3>  3> Obj: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\RHI\Module.RHI.cpp.obj <LOCAL RACE>
3>  PROBLEM: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ImageWrapper\Module.ImageWrapper.cpp.obj
3>  Failed to build Object (error 0xc0000135) 'C:\Users\Brody\AppData\Local\Temp\.fbuild.tmp\0x00000000\core_1009\Module.ImageWrapper.cpp.obj'
3>  PROBLEM: L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\RenderCore\Module.RenderCore.cpp.obj
3>  Failed to build Object (error 0xc0000135) 'C:\Users\Brody\AppData\Local\Temp\.fbuild.tmp\0x00000000\core_1006\Module.RenderCore.cpp.obj'
3>EXEC : 8> warning : L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.7_of_8.cpp.obj
3>  Module.Core.7_of_8.cpp
3>L:\_Programming\Repositories\UnrealEngine\Engine\Source\Runtime\Core\Private\Windows\WindowsApplication.cpp(2095): warning C4628: digraphs not supported with -Ze. Character sequence '<:' not interpreted as alternate token for '['
3>L:\_Programming\Repositories\UnrealEngine\Engine\Source\Runtime\Core\Private\Windows\WindowsApplication.cpp(2099): warning C4628: digraphs not supported with -Ze. Character sequence '<:' not interpreted as alternate token for '['
3>  --- Most Expensive ----------------------------------------------
3>  Time (s)  Name:
3>  34.604    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.6_of_8.cpp.obj
3>  28.398    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.5_of_8.cpp.obj
3>  27.709    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.3_of_8.cpp.obj
3>  27.225    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.7_of_8.cpp.obj
3>  25.279    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.4_of_8.cpp.obj
3>  22.984    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\MetalShaderFormat\Module.MetalShaderFormat.cpp.obj
3>  21.556    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderFormatOpenGL\Module.ShaderFormatOpenGL.cpp.obj
3>  21.056    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.2_of_8.cpp.obj
3>  16.574    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.8_of_8.cpp.obj
3>  16.076    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Projects\Module.Projects.cpp.obj
3>  15.325    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\Core\Module.Core.1_of_8.cpp.obj
3>  10.295    L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\RHI\Module.RHI.cpp.obj
3>  8.392     L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderFormatD3D\Module.ShaderFormatD3D.cpp.obj
3>  6.954     L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\TargetPlatform\Module.TargetPlatform.cpp.obj
3>  4.446     L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\SandboxFile\Module.SandboxFile.cpp.obj
3>  1.165     L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ShaderCompileWorker\Module.ShaderCompileWorker.cpp.obj
3>  1.114     L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\VulkanShaderFormat\Module.VulkanShaderFormat.cpp.obj
3>  1.075     L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\DesktopPlatform\Module.DesktopPlatform.cpp.obj
3>  0.812     L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\ImageWrapper\Module.ImageWrapper.cpp.obj
3>  0.779     L:\_Programming\Repositories\UnrealEngine\Engine\Intermediate\Build\Win64\ShaderCompileWorker\Development\RenderCore\Module.RenderCore.cpp.obj
3>
3>  --- Summary -----------------------------------------------------
3>                                   /----- Cache -----\
3>  Build:          Seen    Built   Hit     Miss    Store   CPU
3>   - Copy       : 20      0       -       -       -       0.000s
3>   - File       : 1229    36      -       -       -       0.001s
3>   - Object     : 25      15      -       -       -       4m 55.276s
3>   - Alias      : 1       0       -       -       -       0.000s
3>   - Exe        : 1       0       -       -       -       0.000s
3>   - Compiler   : 1       1       -       -       -       0.103s
3>   - DLL        : 19      0       -       -       -       0.000s
3>   - ObjectList : 25      6       -       -       -       0.000s
3>  Cache:
3>   - Hits       : 0 (0.0 %)
3>   - Misses     : 0
3>   - Stores     : 0
3>  Time:
3>   - Real       : 47.094s
3>   - Local CPU  : 4m 55.380s (6.3:1)
3>   - Remote CPU : 0.000s (0.0:1)
3>  -----------------------------------------------------------------
3>FBuild : error : BUILD FAILED: all
3>  Time: 47.116s
3>ERROR : UBT error : Failed to produce item: L:\_Programming\Repositories\UnrealEngine\Engine\Binaries\Win64\ShaderCompileWorker-Core.pdb
3>  XGE execution time: 51.25 seconds
3>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V140\Microsoft.MakeFile.Targets(37,5): error MSB3075: The command "L:\_Programming\Repositories\UnrealEngine\Engine\Build\BatchFiles\Build.bat ShaderCompileWorker Win64 Development -waitmutex" exited with code 5. Please verify that you have sufficient rights to run this command.

Any insights? Is this a known issue?

BrodyHiggerson commented 8 years ago

Okay, so I've messed with this a bit this morning.

When I have bUsePDBFiles = false; bUsePCHFiles = false; in my BuildConfiguration.cs as per ClxS' advice, even if I build without any remote workers (i.e. just local), I get a whole bunch of errors from different modules, starting with the VulkanShaderFormat folders, etc. When I set both of these to true, the errors go away, so that's something! There must be some modules that depend on the existence of a PCH nowadays.

If I try to build with no remote workers + the above fix, it works. If I add a remote worker (by running FBuildWorker.exe on the remote machine), I get a few of the Failed to build Object (error 0xc0000135) errors, seemingly at random throughout the build process.

Is there anything I should set up on the remote worker ahead of time? It's got no VS2015, no UE4, etc, but FastBuild docs said the compile tools would be synced so this 'should' be okay. During this process, however, the FBuildWorker.exe window on the remote machine shows all cores as 'Idle'. Sounds like something's being rejected, and therefore not even processed, by/on the remote machine, causing problems for the build/primary machine? Thoughts appreciated.

liamkf commented 8 years ago

Yes, I haven't looked at the ClxS setup for a while, but I think not building with PCH/PDB is probably a code path that doesn't get tested much at Epic, so I'm not surprised it doesn't work very well! We have always built with them both set to true, so that should be fine, and compiling without PCHs is very painful regardless.

But having things compile locally is definitely the first step! I think you're on the right track with the error you're getting for remote workers, it sounds like maybe we're missing some files in the compiler setup that we didn't notice for some reason. Unfortunately it's not the easiest to debug.

As a first thing to try, I'd say uncomment the AddText("\t\t'$WindowsSDKBasePath$/Redist/ucrt/DLLs/x64/ucrtbase.dll'\n\n"); line and give it a shot. I believe we commented it out because we were unconvinced it was required, but it's possible all our workers happened to have that SDK installed and it is still required.

If that fails, I would suggest running the compiler that is sent across on the remote machine and seeing what errors it gives you. It should give you a message as to which DLL may be missing and needs to be added. Not ideal, I know!
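
For context on the missing-DLL theory: the 0xc0000135 in the "Failed to build Object" lines is the Windows NTSTATUS value STATUS_DLL_NOT_FOUND, which fits a compiler that cannot load one of its runtime DLLs on the worker. A small Python sketch for decoding such codes from a log line follows; the table is deliberately tiny and the function name is just for illustration:

```python
# A few well-known NTSTATUS values; this is not an exhaustive list.
KNOWN_NTSTATUS = {
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000013A: "STATUS_CONTROL_C_EXIT",
}


def describe_exit_code(text):
    """Map a hex exit code such as '0xc0000135' to its NTSTATUS name."""
    code = int(text, 16)  # int() accepts the '0x' prefix when base is 16
    return KNOWN_NTSTATUS.get(code, "unknown (0x%08X)" % code)


if __name__ == "__main__":
    print(describe_exit_code("0xc0000135"))
```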

BrodyHiggerson commented 8 years ago

Ahh, I may have totally overlooked an important detail! The way I read it was that you had primarily tested on Windows 10, not that it was necessarily required. You mentioning the SDK made me re-think that. If that's the case, I apologize for wasting your time.

Both the primary machine and the slave are running Windows 7 x64. Could this be the problem? I hadn't, at first thought, imagined the need for any Win10-specific functionality, but may be totally off-base.

EDIT: Also, whereabouts should I look for the compiler that's sent across, sorry? I wasn't aware of any persistent storage of that after builds have completed.

liamkf commented 8 years ago

Yeah we're all Windows 10, although I would hope that wouldn't make too much of a difference and it shouldn't be required! I'd be a bit surprised if it is, it's certainly possible it's a source of a problem though! And it'd be nice to fix if it is a problem anyways... :)

BrodyHiggerson commented 8 years ago

I've had a go with uncommenting the AddText lines: image

I then get >EXEC : 1> error : opening file 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist\x64\Microsoft.VC140.CRT\msvcr140.dll' in Compiler ToolManifest when trying to build. Looks like I don't have that .dll, but I do have the 'Visual C++ Redistributable for Visual Studio 2015' installed. Some sources point to downloading the Windows 10 SDK, but obviously that's not an option. I will continue to investigate!

image

liamkf commented 8 years ago

Oh, maybe that's just a mistake on my part then! Or something about the folder layout changed. I think vcruntime140.dll is the one you might want to try.

BrodyHiggerson commented 8 years ago

Ahh I managed to make it work! I left those lines commented out, and re-installed Windows on the slave machine after trying to run the cl.exe it received, as you recommended. Turns out it was missing VS2015 redist (duh), but that wouldn't install right due to other update issues. Nuked it all and built it back up, and now it works well!

I had a few additional questions, if you don't mind. I'll throw one on here, but let me know if there's a better medium.

I've recently watched the video of the FB Visualizer in action (either from yourself or a colleague, as far as I can tell) - https://www.youtube.com/watch?v=saxFpmNq_Vw - and am amazed at how fast the build actually began. I'm wondering if there's any tweaking that's been done on your end for that. I just tried, in a project with only a few files, to change a single function signature that's not referenced anywhere. This is the log:

1>------ Skipped Build: Project: EnvVarsToXML, Configuration: Development Any CPU ------
1>Project not selected to build for this solution configuration
2>------ Skipped Build: Project: DotNETUtilities, Configuration: Development Any CPU ------
2>Project not selected to build for this solution configuration
3>------ Build started: Project: ShaderCompileWorker, Configuration: Development_Program x64 ------
3>  Target is up to date
4>------ Build started: Project: RTGProto, Configuration: DebugGame_Editor x64 ------
4>  Creating makefile for RTGProtoEditor (working set of source files changed)
4>  Building UnrealHeaderTool...
4>  Target is up to date
4>  Parsing headers for RTGProtoEditor
4>    Running UnrealHeaderTool "L:\_Programming\Repositories\RTGProto\RTGProto.uproject" "L:\_Programming\Repositories\RTGProto\Intermediate\Build\Win64\RTGProtoEditor\DebugGame\RTGProtoEditor.uhtmanifest" -LogCmds="loginit warning, logexit warning, logdatabase error" -Unattended -WarningsAsErrors
4>  Reflection code generated for RTGProtoEditor in 6.224401 seconds
4>  Distributed Compilation : 1 Workers in pool
4>  6> Obj: L:\_Programming\Repositories\RTGProto\Intermediate\Build\Win64\UE4Editor\DebugGame\SpaceServer\SpaceServer.generated.cpp.obj
4>  2> Obj: L:\_Programming\Repositories\RTGProto\Intermediate\Build\Win64\UE4Editor\DebugGame\SpaceServer\TestObject.cpp.obj
4>  5> Obj: L:\_Programming\Repositories\RTGProto\Intermediate\Build\Win64\UE4Editor\DebugGame\SpaceServer\TestObject.cpp.obj
4>  7> Obj: L:\_Programming\Repositories\RTGProto\Intermediate\Build\Win64\UE4Editor\DebugGame\SpaceServer\SpaceServer.generated.cpp.obj
4>  4> Copy: L:\_Programming\Repositories\RTGProto\Intermediate\Build\Win64\UE4Editor\DebugGame\UE4Editor-SpaceServer-Win64-DebugGame.dll.response -> L:\_Programming\Repositories\RTGProto\Intermediate\Build\Win64\UE4Editor\DebugGame\UE4Editor-SpaceServer-Win64-DebugGame.dll.response.dummy
4>  1> DLL: L:\_Programming\Repositories\RTGProto\Binaries\Win64\UE4Editor-SpaceServer-Win64-DebugGame.dll
4>  --- Most Expensive ----------------------------------------------
4>  Time (s)  Name:
4>  22.842    L:\_Programming\Repositories\RTGProto\Intermediate\Build\Win64\UE4Editor\DebugGame\SpaceServer\SpaceServer.generated.cpp.obj
4>  22.722    L:\_Programming\Repositories\RTGProto\Intermediate\Build\Win64\UE4Editor\DebugGame\SpaceServer\TestObject.cpp.obj
4>  1.073     L:\_Programming\Repositories\RTGProto\Binaries\Win64\UE4Editor-SpaceServer-Win64-DebugGame.dll
4>
4>  --- Summary -----------------------------------------------------
4>                                   /----- Cache -----\
4>  Build:          Seen    Built   Hit     Miss    Store   CPU
4>   - Copy       : 1       1       -       -       -       0.000s
4>   - File       : 1524    1524    -       -       -       0.000s
4>   - Object     : 2       2       -       -       -       45.564s
4>   - Alias      : 1       1       -       -       -       0.000s
4>   - Compiler   : 1       0       -       -       -       0.000s
4>   - DLL        : 1       1       -       -       -       1.073s
4>   - ObjectList : 2       2       -       -       -       0.000s
4>  Cache:
4>   - Hits       : 0 (0.0 %)
4>   - Misses     : 0
4>   - Stores     : 0
4>  Time:
4>   - Real       : 23.939s
4>   - Local CPU  : 46.637s (1.9:1)
4>   - Remote CPU : 0.000s (0.0:1)
4>  -----------------------------------------------------------------
4>  FBuild: OK: all
4>  Time: 23.965s
4>  XGE execution time: 47.34 seconds
========== Build: 2 succeeded, 0 failed, 1 up-to-date, 2 skipped ==========

I have to wait between 6 and 10 seconds on average for the reflection code to be generated, then a fair while on the ModuleName.generated.cpp.obj type files during the build. How are you getting your build starting so fast? :D Main machine is an i7-3770k and the slave is a Xeon X5670 6-core.

Anyways, thanks for your help so far! Now I'm going to scrounge up some spare parts and build some more slaves!

EDIT: Whoops! Sorry about the close/reopen there.

Anyways, thanks also to you guys for the awesome visualizer plugin!

image

liamkf commented 8 years ago

Hmm what you're seeing is about on par with our results. I think in the video that Yassine posted Unreal did not have to do the reflection or rebuild the Unreal Header Tool, which takes the ~10 seconds you're seeing. Smaller changes I think should be able to skip some of those steps, as the Unreal Build Tool should know it doesn't have to regenerate as much. Do you see the same startup times when changing more localized headers?

I would also definitely recommend setting up and turning on the cache, as at that point you'll only be bound by how fast your main machine can preprocess the files which is nice! And changing common headers becomes much less of a headache.

The files Unreal generates are also rather enormous, so you spend a lot of time transferring things across the network (and a slow network/wifi really hurts things). I notice in your screenshot you've got a lot of timeouts in orange on the remote worker. We ended up boosting the timeouts in our FASTBuild fork quite a lot to avoid them, which could also be worth investigating for you! :)

BrodyHiggerson commented 8 years ago

I'll have to try re. the headers. I was changing a header that had an empty UObject with a single non-UFUNCTION and which wasn't included anywhere.

I've set up the cache and can see what you mean! Great stuff.

After doing some tests, I found that on my main PC without the slave, a full rebuild is way faster. Here's a comparison. With:

5>  --- Summary -----------------------------------------------------
5>                                   /----- Cache -----\
5>  Build:          Seen    Built   Hit     Miss    Store   CPU
5>   - Copy       : 324     324     -       -       -       1.140s
5>   - File       : 16374   913     -       -       -       0.901s
5>   - Library    : 25      25      -       -       -       1m 16.219s
5>   - Object     : 1230    1230    0       585     585     9h:23m 13.484s
5>   - Alias      : 1       1       -       -       -       0.000s
5>   - Exe        : 2       2       -       -       -       4.020s
5>   - Compiler   : 2       2       -       -       -       0.017s
5>   - DLL        : 322     322     -       -       -       18m 47.396s
5>   - ObjectList : 1221    1221    -       -       -       0.000s
5>  Cache:
5>   - Hits       : 0 (0.0 %)
5>   - Misses     : 585
5>   - Stores     : 585
5>  Time:
5>   - Real       : 1h:14m 10.930s
5>   - Local CPU  : 9h:43m 23.180s (7.9:1)
5>   - Remote CPU : 1h:27.022s (0.8:1)
5>  -----------------------------------------------------------------
5>  FBuild: OK: all
5>  Time: 74m 11.269s
5>  XGE execution time: 4577.78 seconds
========== Build: 3 succeeded, 0 failed, 0 up-to-date, 2 skipped ==========

Without:

5>  Total build time: 1871.11 seconds
========== Build: 3 succeeded, 0 failed, 0 up-to-date, 2 skipped ==========

I am definitely getting a lot of timeouts on the slave machine, but this is a pretty crazy difference in time. I realize it might not be worth it with a small build + few machines, but this is getting an extra 12 threads in on an entire engine build! Should be at least on-par, if not faster. Any thoughts? Maybe the timeouts are responsible, but I'm not sure I understand why they happen.

Did your fork's changes simply allow building of files to take longer before being considered 'timed out'? What kind of bottlenecks create such a thing? The Xeon machine is running with an SSD and plenty of RAM - not sure why it'd be having such troubles.

Thanks again, Liam.

liamkf commented 8 years ago

Hmmmm! That doesn't seem great! I notice the first build has everything not in the cache, was the second non-dist build also with an empty cache? I don't suppose pictures of the two builds in the visualizer are available? (that 4600 second compile hurts... so no worries if not) @yass007 might have some ideas as well!

There are some definite differences when using distributed in that it will not use the pch for remote files but still has to preprocess all of them, so those files will compile slower than a normal local build and there's some overhead. It should still be faster (bonus cores!) as long as the preprocessing takes less time than the compilation, and any files that are cache hits should be much faster and most files should be cache hits after your first build.

That being said, a few timeouts can really penalize the build times. We were usually seeing the timeouts AFTER the file had compiled and was starting to transfer the results back, so sometimes a minute or two was wasted per file, which the builder would then have to restart locally. Our computers were also fairly decent; it really seemed to be Windows struggling to transfer 8-12 fairly huge files back and forth. We traced it to the socket call just not finishing before FASTBuild would give up on it. As mentioned here it's on our list of things to investigate better when we get some time.
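As a rough illustration of the trade-off described above, here is a toy host-side cost model. It is only a sketch: the struct, the function names, and every timing are hypothetical, and it assumes a timed-out file is rebuilt locally at the normal PCH-assisted cost.

```cpp
#include <cassert>

// Hypothetical per-file timings in seconds; none of these are measured values.
struct FileCost
{
    double localCompileWithPch; // host compiles the file itself, using the PCH
    double preprocess;          // host preprocesses the file so it can be sent out
    double timeoutWait;         // time lost before a timed-out remote job is abandoned
};

// Host-side seconds spent on one file that is sent to a remote worker.
// On success the host only pays for preprocessing. On a timeout it also pays
// the wasted wait and then (by assumption) rebuilds the file locally.
double HostCostWhenDistributed( const FileCost & c, bool timedOut )
{
    assert( c.localCompileWithPch >= 0.0 && c.preprocess >= 0.0 && c.timeoutWait >= 0.0 );
    if ( !timedOut )
    {
        return c.preprocess;
    }
    return c.preprocess + c.timeoutWait + c.localCompileWithPch;
}

// Distributing a file only helps the host if it costs less than compiling locally.
bool DistributionPaysOff( const FileCost & c, bool timedOut )
{
    return HostCostWhenDistributed( c, timedOut ) < c.localCompileWithPch;
}
```

With made-up numbers such as a 10 s local compile, 3 s of preprocessing, and a 90 s timeout, a successful remote job costs the host 3 s, while a timed-out one costs 103 s, which is why a handful of timeouts can erase the gain from the extra cores.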

yass007 commented 8 years ago

Yeah Liam you said it all.. This is not surprising. Timeouts cost a lot on the Host side because of: 1) added preprocessing 2) not benefiting from pch(s) when executing jobs from the remote job queue

So if that cost is paid and, on top of that, nothing gets compiled on the remote worker (timeouts), this can be a net loss. As mentioned by Liam, we ended up fixing the timeouts and preventing the host from uselessly preprocessing jobs once the remote job queue fills above a certain threshold, so that it compiles them locally with PCHs instead (faster). After doing this we were able to restore the expected gains.

We'll try to document the timeouts and the remote job queue optimization and open a ticket for these.

BrodyHiggerson commented 8 years ago

Hi Yassine, thanks for joining in!

I grabbed one of your organization's forks of FASTBuild, but had to revert the change in https://github.com/liamkf/fastbuild/commit/bc4ca0f6fb5f327ab16cf67b876467fa71528db9 where you moved to GetSystemTimePreciseAsFileTime, since it's not supported on Windows 7. Otherwise, does this repo contain all of the changes you're talking about? It's definitely seemed to help.

This has fixed the timeouts, but a full rebuild (clean -> build) without cache has still taken longer than a non-FASTBuild 8-core build. It's now down to 2826.74 seconds, though. Please find the build log here. @liamkf when I was referring to the non-distributed build taking just over 30 minutes, I meant without using FASTBuild. Not as fair a comparison I know, but still concerning.

I've also noticed that my build machine's network is getting maxed out since my crappy router is 100 Mbps only, as per the following.

image

I'll be upgrading that too, esp. considering that I'll be adding more machines in soon. I wonder if that's playing a big part in this.
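To get a feel for how much the link speed could matter, here is a back-of-the-envelope sketch. The function name and the 100 MB payload are hypothetical (nothing in this thread measures how much data FASTBuild actually moves per build), and it ignores protocol overhead and latency.

```cpp
#include <cassert>

// Seconds needed to move `megabytes` of data over a link rated at
// `megabitsPerSecond`, treating 1 MB as 8 megabits and ignoring protocol
// overhead and latency, so real transfers will be slower than this.
double TransferSeconds( double megabytes, double megabitsPerSecond )
{
    assert( megabytes >= 0.0 && megabitsPerSecond > 0.0 );
    const double megabits = megabytes * 8.0;
    return megabits / megabitsPerSecond;
}
```

Under these assumptions, 100 MB of preprocessed source and object files takes at least 8 s on a 100 Mbps link and about 0.8 s on gigabit, so a saturated 100 Mbps router is a plausible contributor to slow remote jobs.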

yass007 commented 8 years ago

Cool stuff!

GetSystemTimePreciseAsFileTime has been changed back to the more portable version in the official FASTBuild integration.

The only thing you are still missing is the hacky optimization that stops the host from preprocessing more files if we detect that too many are queued already. This has not been submitted yet.

You can add this in ObjectNode.cpp, in /*virtual*/ Node::BuildResult ObjectNode::DoBuild( Job * job ):

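// Stop preprocessing files for distribution once more than 128 MB of
// distributable jobs are already queued; this file is then built locally.
if (usePreProcessor && JobQueue::Get().GetDistributableJobsMemUsage() > 128 * 1024 * 1024)
{
    FLOG_BUILD("Skipping Dist Job! ( %d MB)\n", JobQueue::Get().GetDistributableJobsMemUsage() / MEGABYTE);

    usePreProcessor = false;
}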

Right before this line:

if ( usePreProcessor )
{
    return DoBuildWithPreProcessor( job, useDeoptimization, useCache );
}

Let us know if the timeout fix + this optimization resolves the perf issue.
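For anyone who wants to see the idea behind that check in isolation, here is a minimal standalone sketch. ToyJobQueue, kMegabyte, kMaxQueuedBytes, and ShouldPreprocessForDistribution are all hypothetical names for illustration; this is not FASTBuild's real JobQueue, and only the 128 MB threshold is taken from the snippet above.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for the job queue; NOT FASTBuild's real JobQueue class.
class ToyJobQueue
{
public:
    // Record the memory held by one preprocessed, not-yet-distributed job.
    void AddDistributableJob( std::size_t preprocessedBytes ) { m_MemUsage += preprocessedBytes; }
    std::size_t GetDistributableJobsMemUsage() const { return m_MemUsage; }

private:
    std::size_t m_MemUsage = 0;
};

constexpr std::size_t kMegabyte = 1024 * 1024;
constexpr std::size_t kMaxQueuedBytes = 128 * kMegabyte; // threshold from the snippet above

// Returns true if the next file should still be preprocessed for distribution.
// Once the queued jobs hold more than the threshold, further files fall back
// to a normal local build, which can use the PCH.
bool ShouldPreprocessForDistribution( const ToyJobQueue & queue, bool usePreProcessor )
{
    if ( usePreProcessor && queue.GetDistributableJobsMemUsage() > kMaxQueuedBytes )
    {
        return false;
    }
    return usePreProcessor;
}
```

The comparison is strictly greater-than, so a queue holding exactly 128 MB still accepts one more distributable job; only after that does the host stop preprocessing until the queue drains.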

BrodyHiggerson commented 8 years ago

Hi Yassine,

Thanks for that! I added that in. When testing, I made sure to clear the cache and do a full clean -> build test. With the cache fully disabled, I got 1713s, and with it enabled but empty, I got 1616s. Probably unrelated to the cache being enabled or not - they're pretty similar. Build log here.

Does this stack up against the results you've seen? That's with a main 3770k + 16gb RAM with a x5670 Xeon (6-core with HT) + 24gb RAM. What kind of numbers are you guys getting, and with what kind of setup?

I only ask because shaving 4-5 minutes off of a ~31m build time (the build time without FASTBuild) after adding 12 'cores' seems less dramatic than I expected. I'll be adding another computer soon to see what happens!

Thanks for all your help, and for sharing the optimizations! I'll try a run with Incredibuild to see if the 'benchmark' for distributed builds is similar, in which case I'm just being unfair!

liamkf commented 8 years ago

Hmm, I can't recall what our overall build times were, and our available setup has changed. Yassine might remember better, but I think they were a bit faster than what you're seeing, in the 15 minute range for the game (the editor takes a lot longer... :/), we had three machines with 4 core (with HT) and one 6 core (with HT) machine, all with SSDs, on a gigabit network, but with two of the machines on wifi which tended to hurt quite a bit.

Can we get a picture of the visualizer for one of your builds? The balance between preprocessing (blue) in order to allow distribution vs. the compile times on the remote machines is a bit delicate, as the local compilation using the pch files is quite a bit faster than the remote machines, so it will show benefits if there's always enough work for the local machine to do and it's not preprocessing for nothing...

BrodyHiggerson commented 8 years ago

With 32GB RAM upgrade in main PC, full build of DebugGame Editor (with cleared cache) resulted in:

5> --- Summary -----------------------------------------------------
5>                                  /----- Cache -----\
5> Build:          Seen    Built   Hit     Miss    Store   CPU
5>  - Copy       : 324     324     -       -       -       1.026s
5>  - File       : 16374   913     -       -       -       0.000s
5>  - Library    : 25      25      -       -       -       18.596s
5>  - Object     : 1230    1230    0       321     321     2h:46m 6.927s
5>  - Alias      : 1       1       -       -       -       0.000s
5>  - Exe        : 2       2       -       -       -       2.323s
5>  - Compiler   : 2       2       -       -       -       0.010s
5>  - DLL        : 322     322     -       -       -       8m 29.506s
5>  - ObjectList : 1221    1221    -       -       -       0.000s
5> Cache:
5>  - Hits       : 0 (0.0 %)
5>  - Misses     : 321
5>  - Stores     : 321
5> Time:
5>  - Real       : 23m 15.328s
5>  - Local CPU  : 2h:54m 58.388s (7.5:1)
5>  - Remote CPU : 3h:31m 56.427s (9.1:1)
5> -----------------------------------------------------------------
5> FBuild: OK: all
5> Time: 23m 15.666s
5> XGE execution time: 1513.00 seconds
========== Build: 3 succeeded, 0 failed, 0 up-to-date, 2 skipped ==========

Shots of Visualizer for this build:

image

image

Build log here.

Thoughts? Definitely better than before. Putting together an i7-2700k machine today, hopefully. Will be interesting to see the results of that.

(As an aside, what's the difference between 'Time: 23m 15.666s' and 'XGE execution time: 1513.00 seconds' in the above? The last figure equates to 25m 13s. What do the two figures represent? FASTBuild's part of the build compared to FB + the UE4 side (make files, etc)?)
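For reference, the conversion between the two figures can be checked with a small helper. This is just arithmetic on the numbers from the log above; MinSec, ToMinutesSeconds, and ToSeconds are hypothetical names.

```cpp
#include <cassert>

// A duration split into whole minutes plus leftover seconds.
struct MinSec
{
    long   minutes;
    double seconds;
};

// Split a non-negative duration in seconds into minutes and seconds.
MinSec ToMinutesSeconds( double totalSeconds )
{
    assert( totalSeconds >= 0.0 );
    const long minutes = static_cast<long>( totalSeconds ) / 60;
    return { minutes, totalSeconds - 60.0 * minutes };
}

// Combine minutes and seconds back into a duration in seconds.
double ToSeconds( long minutes, double seconds )
{
    return 60.0 * minutes + seconds;
}
```

ToMinutesSeconds( 1513.0 ) gives 25 minutes and 13 seconds, and ToSeconds( 23, 15.666 ) gives roughly 1395.67 s, so the 'XGE execution time' is about 117 seconds longer than the 'Time' that FASTBuild itself reports.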

liamkf commented 8 years ago

Sorry about the delay! Yeah, from the visualizer that looks about like what I'd expect: remote nodes taking about twice as long without the help from the PCH, and not too much preprocessing on the local builder. We think that if we do a pass which collapses the nodes using the same PCH, we might be able to improve the performance a bit.

CAptainMXD commented 2 years ago

When I use FASTBuild, change my .cpp file, and click Build, FASTBuild doesn't rebuild anything; it just reports "up to date".