Optimized reading compiler output in invokeRealCompiler()

frerich / clcache

A compiler cache for MSVC, much like ccache for gcc

Other

324 stars 83 forks source link

Optimized reading compiler output in invokeRealCompiler() #230

Closed frerich closed 8 years ago

frerich commented 8 years ago

As it turns out, using two PIPEs with Popen() makes it fork two threads internally for reading stdout/stderr, and synchronising those threads seems to take very long. Some Google research suggests that the threads can be avoided by writing to temporary files.

codecov-io commented 8 years ago

Current coverage is 88.90% (diff: 100%)

Merging #230 into master will increase coverage by 0.05%

@@             master       #230   diff @@
==========================================
  Files             1          1          
  Lines           986        991     +5   
  Methods           0          0          
  Messages          0          0          
  Branches        156        156          
==========================================
+ Hits            876        881     +5   
  Misses           83         83          
  Partials         27         27

Powered by Codecov. Last update 76332b3...2ce37f5

siu commented 8 years ago

Just some feedback: these changes don't change the timings on my side.

webmaster128 commented 8 years ago

Just some feedback: these changes don't change the timings on my side.

Do you have compiler output (like /showIncludes)?

When cl only prints the source file name on one output stream stream, it is unlikely that this helps.

frerich commented 8 years ago

For the record, this PR was meant to address #229 but didn't help in that case either -- however, the original profile data (I summed things up in https://github.com/frerich/clcache/issues/229#issuecomment-249355242) changed considerably. Waiting for mutexes was no longer high up on the list.

siu commented 8 years ago

@webmaster128 I tried with cold cache in direct mode which should add "/showIncludes". Apart from that the compiler output is very minimal in my test case, so it is maybe not relevant for benchmarking.

I guess it can be merged if it helps in #229 with a different workload, at least it didn't hurt in my use case.