frerich / clcache

A compiler cache for MSVC, much like ccache for gcc
Other
325 stars 83 forks source link

Add msbuild tracker support #319

Open izmmisha opened 6 years ago

izmmisha commented 6 years ago

How msbuild+tracker works: Tracker injecting it's dll to cl process and intercepting file/process/thread related functions, if child process created it also would be prepared for tracking. Here is example of output when TRACKER_TRACE=1 environ set:

1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attach started
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AssignMinWinRealFunctions: Attach started
1>  [8696] GetRealProcAddress is using the export table to get the location of (CreateDirectoryA) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (CreateDirectoryW) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (CreateFileA) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (CreateFileW) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (CreateProcessA) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (CreateProcessW) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (CreateThread) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (DeleteFileA) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (DeleteFileW) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (GetFileAttributesA) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (GetFileAttributesW) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (GetFileAttributesExA) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (GetFileAttributesExW) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (RemoveDirectoryA) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (RemoveDirectoryW) from KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (TerminateProcess) from KernelBase.dll.
1>  [8696] GetRealProcAddress found function (ExitProcess) using GetProcAddress when searching KernelBase.dll.
1>  [8696] GetRealProcAddress is using the export table to get the location of (DisableThreadLibraryCalls) from KernelBase.dll.
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AssignMinWinRealFunctions: Attach complete
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CreateFileA (0x76f4d2b0) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CreateFileW (0x76f2f6b0) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CopyFileA (0x73eedd40) in C:\WINDOWS\System32\KERNEL32.DLL
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CopyFileW (0x73f0fe60) in C:\WINDOWS\System32\KERNEL32.DLL
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CopyFileExA (0x73eeded0) in C:\WINDOWS\System32\KERNEL32.DLL
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CopyFileExW (0x73eb4060) in C:\WINDOWS\System32\KERNEL32.DLL
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CreateHardLinkA (0x73eb4200) in C:\WINDOWS\System32\KERNEL32.DLL
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CreateHardLinkW (0x73eb4220) in C:\WINDOWS\System32\KERNEL32.DLL
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CreateProcessA (0x76fd1770) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CreateProcessW (0x76f1b530) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CreateThread (0x76fce430) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CreateDirectoryA (0x76f27820) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_CreateDirectoryW (0x76f3a680) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_RemoveDirectoryA (0x76fd0710) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_RemoveDirectoryW (0x76fd0750) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_GetFileAttributesExA (0x76f4e4f0) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_GetFileAttributesExW (0x76f2f410) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_GetFileAttributesA (0x76f4d790) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_GetFileAttributesW (0x76f2dca0) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_DeleteFileA (0x76f4ec40) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_DeleteFileW (0x76f2dfb0) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_TerminateProcess (0x76f27ef0) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_ExitProcess (0x7731fc90) in C:\WINDOWS\SYSTEM32\ntdll.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attaching to Real_DisableThreadLibraryCalls (0x76f4a060) in C:\WINDOWS\System32\KERNELBASE.dll
1>  [8696,C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO 12.0\VC\BIN\X86_AMD64\CL.EXE] AttachDetours: Attach complete

After process exits it dump collected info into tlog files, output file names depends on process name, thread. So parent process (CL) would have CL.{read,write}.{1,2,3,4,...}.tlog, it's child would have different name, it depends on process name, if it is CL then tracker will add PID to filename CL.{CHILDPID}.{read,write}.{1,2,3,4,...}.tlog, but if process name is differs from parent (PYTHON for example) then it would create filename CL-PYTHON.{read,write}.{1,2,3,4,...}.tlog. This is important for next step. msbuild after tracker stop inspecting output files and create only two files cl.read.1.tlog and cl.write.1.tlog, it looking only into cl.{read,write}.1.tlog (for single instance CL) and cl.*.{read,write}.1.tlog (for multiple CL when /MP used).

To make tracker compatible clcache it is necessary that clcache process name and it's childs must have name CL. PyInstaller generated executable works fine (CL.{PID}.{PID}.read.1.tlog would be generated), but clcache.exe from python/scripts/ folder isn't compatible, because it's child process name is PYTHON (CL-PYTHON-CL.read.1.tlog would be generated) and this file ignored by msbuild.

Another thing is /MP, currently clcache using thread pool, and it produce cl.*.read.{2,3,4,5,6..}.tlog files, it is necessary to use process pool.

How to avoid STATS.TXT tracking: As mentioned before msbuild inspecting only cl..read.1.tlog, so it is expecting reads from main thread, and doesn't looking into cl..read.2.tlog. So to avoid STATS.TXT reading to be tracked we can move it to dedicated thread and cl.*.read.2.tlog file.

Also manifest reading should be untrackable because if header files changed then new record would be added to same manifest file.

The changes contains detecting of tracker by TRACKER_ENABLED environ variable, if it present then it will use process pool as executor and will read stats.txt and manifest files from dedicated thread. But if tracker is not detected then clcache will work as before.

codecov-io commented 6 years ago

Codecov Report

Merging #319 into master will decrease coverage by 0.85%. The diff coverage is 68.51%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #319      +/-   ##
==========================================
- Coverage   88.93%   88.07%   -0.86%     
==========================================
  Files           4        4              
  Lines        1301     1350      +49     
  Branches      195      207      +12     
==========================================
+ Hits         1157     1189      +32     
- Misses        106      117      +11     
- Partials       38       44       +6
Flag Coverage Δ
#integrationtests_memcached 66.54% <37.03%> (-1.08%) :arrow_down:
#unittests 85.11% <68.51%> (-0.75%) :arrow_down:
Impacted Files Coverage Δ
clcache/__main__.py 89.81% <68.51%> (-1.07%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update cae73d8...87f41b5. Read the comment docs.

izmmisha commented 6 years ago

How to avoid STATS.TXT tracking: As mentioned before msbuild inspecting only cl..read.1.tlog, so it is expecting reads from main thread, and doesn't looking into cl..read.2.tlog. So to avoid STATS.TXT reading to be tracked we can move it to dedicated thread and cl.*.read.2.tlog file.

There is better way to avoid tracking clcache internal stuff - use FileTracker.dll API SuspendTracking/ResumeTracking. This solution will works with v140 toolchain which inspecting all threads instead main thread like v120.

pthom commented 6 years ago

Hi, I tested this PR on my company code (2000 cpp files, 173 projects) and it seems very promising ! I did have some failures when (timeout, may be because of the multiprocessor pool), but a simple suibsquent build will always succeed.

Many thanks !

PS : you can look at the PR https://github.com/frerich/clcache/pull/330 where I propose a script in order to ease the integration with Visual Studio and msbuild.

frerich commented 5 years ago

Your work sounds very promising! I don't use msbuild myself, but given that it seems to help, I'd be happy to merge your work into the main clcache source tree.

izmmisha commented 5 years ago

@frerich thank you.

I will continue work on this PR, need to fix usage of ProcessPoolExecutor only with VS2013, otherwise ThreadPoolExecutor is better.

Cascades commented 5 years ago

Following on from the conversation here I would say the cl.exe version would be the best way to grab the toolset version using the mapping here.

From what I can tell on this SO thread the best ways would be to either run cl without any args and parse the output, or to run a cpp file with only _MSC_VER in to cl /nologo /EP to end up with a file containing the cl version... I can't tell what's the nicer option... but I'd lean towards running cl with no arguments to keep things within python.

Cascades commented 5 years ago

@izmmisha Any suggestions for codecov? I'm not too familiar with it tbh.

izmmisha commented 5 years ago

@Cascades lines 1758-1763 can be covered by unit tests lines 160,164 require to change initialization order of coverity and clcache modules (clcache.pth) lines 132-135 can be covered by changing matrix of testing environment

everything of this, except unit tests, IMO may be ignored :)

Cascades commented 5 years ago

@Cascades lines 1758-1763 can be covered by unit tests lines 160,164 require to change initialization order of coverity and clcache modules (clcache.pth) lines 132-135 can be covered by changing matrix of testing environment

everything of this, except unit tests, IMO may be ignored :)

Not sure how to "ignore things"... but unit tests in!

Cascades commented 5 years ago

@izmmisha we think this about done then? :)

izmmisha commented 5 years ago

@Cascades if you are able to check that it will help with https://github.com/frerich/clcache/issues/357 then we can say is it done or not.

Cascades commented 5 years ago

@izmmisha I am tresting it now.. so with these changes I wont need TrackFileAccess = false anymore and I should see that after hitting build a second time, CMake will not see anything as out of date?

izmmisha commented 5 years ago

@Cascades also you don't need VS_GLOBAL_NoDependency $ENV{CLCACHE_DIR}/stats.txt

Cascades commented 5 years ago

@izmmisha Yes! I did remove that too.. currently building so I'll get back to you

Cascades commented 5 years ago

@izmmisha @frerich Yes! This works a dream!! My output on VS2019 with CMake is now it saying it doesn't need to update anything because it's already been built, as expected.

izmmisha commented 5 years ago

@Cascades glad to hear that it works

Cascades commented 5 years ago

@frerich sorry to bother, but these changes are incredibly useful for MSBuild users, our build times upon second build go from 3 mins with clcache to 2 secs (which is what it would be without clcache). Do you know when these changes will be merged in?

danix800 commented 5 years ago

This happily working useful feature is dying, which means this project is really dying...

frerich commented 5 years ago

@Cascades @danix800 I never actually used MSBuild myself - in fact, I stopped using clcache many years ago since I'm not working witht Visual Studio and C++ as much as I used to. However, I'd feel a bit bad about blindly merging functionality without having at least tried i myself - which is why this promising PR did not get merged yet.

I believe many users of clcache would appreciate if this project would find a new maintainier - let me know if anyone is interested!