spotify / XCRemoteCache


In my integration setup, incremental RC builds with expected cache misses take longer than clean builds without RC #133

Closed cezarsignori closed 2 years ago

cezarsignori commented 2 years ago

My integration setup

- XCRemoteCache: 0.3.11 or c39b2862 (support for irrelevant_dependencies_paths)
- HTTP cache server: local demo Docker server
- Xcode: 13.2.1 (13C100)
- Integration: manual, via Buck

Problem & Expectation

In my current setup, Xcode takes longer to finish an incremental build with RC than it takes to clean build without RC.

My expectation is that the same incremental build with RC takes at most the same amount of time as without it.

My definition of an incremental build for this thread: a build of a new commit run after a clean build of a previous commit, without deleting DerivedData.

Minimal reproduction of the problem with instructions

  1. Check out a commit already built by an RC producer (e.g. 0886f8e6).
  2. Delete DerivedData, run xcprepare, and build the app using RC as a consumer. The app builds in 100s with a 100% cache hit rate.
  3. Check out a commit not yet built by an RC producer (1 or 2 commits later is typically enough, e.g. 4b186ef2).
  4. Run xcprepare and build the app using RC as a consumer. The app builds in 1000s with the expected cache misses.

Investigation

As I watched Xcode incrementally build 4b186ef2 with RC, I noticed the following.

I used xclogparser to compare the .xcactivitylog files for two scenarios: incrementally building 4b186ef2 with RC after a clean build of 0886f8e6 with RC, vs. incrementally building 4b186ef2 without RC after a clean build of 0886f8e6 without RC. What I found is that the RC build steps resolve very quickly, but the modules that do need to compile take much longer.

So I picked a random module (TripsService).

In the RC incremental build I saw:

And in the no-RC incremental build:

I verified that TripsService is a mixed module and all Objective-C files do belong to the module. I ran a git diff between both commits and saw that only Swift files were modified in the TripsService module.

The RC logs look normal and show what I expect them to show.


polac24 commented 2 years ago

My expectation is that the same incremental build with RC takes at most the same amount of time as without it.

Unfortunately, that is not always true at the target level. In a non-RC build, xcodebuild can build incrementally (e.g. with a hot DerivedData/ModuleCache), while if a cache miss forces RC to build a target locally, clang and swiftc have to build everything from scratch. However, at the project level there should still be a benefit, unless your cache hit rate is really low.

After the RC prebuild step (0.120s), it took 5 seconds for Swift files to start compiling

These 5s are probably caused by "backfilling" the ObjC compilation: as you said, your target is a hybrid target, so on a cache miss we have to compile both Swift and ObjC. You could inspect which clang invocations are called at $(TARGET_TEMP_DIR)/history.compile. That is a binary file, but its format is really simple: \0 is used as a delimiter. Can you ensure that there are no duplications?
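
For anyone who wants to automate that check, below is a minimal sketch (my illustration, not part of XCRemoteCache; the path is a placeholder for your target's $(TARGET_TEMP_DIR)). It splits history.compile on the \0 delimiter described above and reports duplicated entries:

```swift
import Foundation

// Placeholder path: substitute your target's $(TARGET_TEMP_DIR).
let url = URL(fileURLWithPath: "/path/to/TARGET_TEMP_DIR/history.compile")
let data = try Data(contentsOf: url)

// Entries are \0-delimited; decode each one as UTF-8.
let entries = data
    .split(separator: 0)
    .compactMap { String(data: $0, encoding: .utf8) }

// Report any invocation that appears more than once.
let duplicated = Dictionary(grouping: entries, by: { $0 }).filter { $0.value.count > 1 }
if duplicated.isEmpty {
    print("No duplicated invocations among \(entries.count) entries.")
} else {
    for (invocation, occurrences) in duplicated {
        print("Duplicated \(occurrences.count)x: \(invocation)")
    }
}
```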

It seems to compile the target serially (same impression as without RC)

Oops, that is a really bad sign. Achieving good performance with XCRemoteCache for serialized target compilation will be hard; it was optimized for projects with good parallelization. The good news is that there are some optimizations we can do (e.g. run the above-mentioned "ObjC backfilling" in parallel, or outsource the ObjC backfilling to Xcode). Is the serialized nature of the build intentional?

Run xcprepare and build the app using RC as consumer. The app builds in 1000s with expected cache misses.

Out of curiosity, what hit rate do you have?

Simply re-building the same incremental build (e.g. 8794cfe1) results in a 13s build.

Hmm, that is still a lot, don't you think? Which steps are invoked by Xcode? I would expect no steps at all, so the build should be in the ballpark of a couple of seconds. Is it similar for a non-RC no-op build?

cezarsignori commented 2 years ago

To answer some of your questions I will have to re-run the test scenarios, so I will address some of them now, share a few additional learnings, and come back to you on the rest later.

Is the serialized nature of a build intentional?

No. There is a good level of parallelism in clean builds, but in incremental builds there seems to be much less. Looking at the xclogparser HTML report, I see two-thirds of the app building serially; looking at the Xcode build logs with "Errors Only" selected, it seems like even more. To some extent this might be normal, as the cache misses are either the changed module or its reverse dependencies.

Out of curiosity, what hit rate you have?

354 cache hits and 25 cache misses (a ~93% hit rate).

New Learnings

cezarsignori commented 2 years ago

These 5s are probably caused by "backfilling" the ObjC compilation: as you said, your target is a hybrid target, so on a cache miss we have to compile both Swift and ObjC.

Fortunately, one of the modules that is built locally is pure Swift (FlightsUIComponents). For that module, xclogparser shows that Xcode without RC takes a total of 0.8s, with all Swift files in the target built in parallel within 2.1s. With RC it shows a total of 1.4s, with Swift file compilation starting serially (but then running in parallel) and taking 0.6s to 0.9s depending on the file. The delay before Compile Swift Sources starts does not seem to be present here, so that delay seems to be related to module size.

You could inspect which clang invocations are called at $(TARGET_TEMP_DIR)/history.compile. That is a binary file, but its format is really simple, \0 is used as a delimiter. Can you ensure that there are no duplications?

The clean build produces a history.compile file but the incremental build does not. The rc.enabled file is also deleted by the incremental build. The history.compile file generated by the clean build contains no \0.

I found that the history.compile file is not present for any module that missed the cache in incremental builds.

Shouldn't those targets also have that file in incremental builds?

The good news is that there are some optimizations we can do (e.g. run the above-mentioned "ObjC backfilling" in parallel, or outsource the ObjC backfilling to Xcode).

How?

cezarsignori commented 2 years ago

Simply re-building the same incremental build (e.g. 8794cfe1) results in a 13s build. Hmm, that is still a lot, don't you think? Which steps are invoked by Xcode? I would expect no steps at all, so the build should be in the ballpark of a couple of seconds. Is it similar for a non-RC no-op build?

11 seconds are spent creating the build description and the rest is spent running the custom Swift lint and warning-generation scripts we have. The behavior in non-RC builds is the same.

ahmednafei commented 2 years ago

Hi Bartosz,

I am attaching a screenshot of how our app compiles in a clean build without RC, to show the level of parallelism (please see the image below). Usually a clean build without RC finishes in ~600s. A clean build with RC has the same level of parallelism, and the app builds in ~150s with a 100% cache hit.

[Screenshot: clean-build timeline]

Now, after doing a clean build either with or without RC, when we run an incremental build where only an empty new line is added to a single ObjC file inside the "Core" module (which has 200+ reverse dependencies), we see:

Without RC case:

Xcode is smart enough to ONLY compile the single touched file in the "Core" module, generate the static lib for the module, and not compile any other files (only traversing all the reverse dependencies). The traversal is serial, but the targets are processed very fast, as the introduced change does not really affect any files in the reverse dependencies. The compilation finishes in 75 seconds; about 30 of those are spent packaging the app binary, generating dSYM files, and running custom build scripts.

RC case:

RC invalidates the whole "Core" module build and rebuilds it from scratch (all files), and the same happens for all 200+ reverse dependencies. The targets are also traversed serially, but we think that is how Xcode deals with incremental builds (trying to be smart). On top of this, we also noted that mixed modules whose cache is disabled build in double or triple the time it normally takes to clean-build the same module without RC.

So now we see two points:

  1. How can we make RC behave similarly to the smart Xcode behavior, i.e. not fully invalidate the touched module nor its reverse dependencies, and only compile the impacted source files?
  2. It is not clear to us why fully compiling mixed modules with RC takes double or triple the time it takes to cleanly build the same module without RC. We notice a freeze (on the parent "Compile Swift Sources" task) before the Swift files actually start compiling, and also longer per-file Swift compilation times.
polac24 commented 2 years ago

How?

FYI: I have a rough proof-of-concept branch but still have to test it (probably early next week): https://github.com/polac24/XCRemoteCache/commit/96ce18bc3127f4dfc1d5c3403230cabd38d215f8

How can we make RC behave similarly to the smart Xcode behavior, i.e. not fully invalidate the touched module nor its reverse dependencies, and only compile the impacted source files?

XCRemoteCache doesn't work at the file level, only per target. That was a design decision, and the rationale is described in the blog post. It means that locally modifying a very root target like Core invalidates all of its reverse dependencies (a.k.a. dependents).
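
To make the target-level granularity concrete, here is a simplified sketch of fingerprinting (my illustration, not XCRemoteCache's actual code; a real fingerprint also covers build settings and other inputs). Because a target's fingerprint digests all of its input files plus its dependencies' fingerprints, touching a single file in Core changes Core's fingerprint and, transitively, the fingerprint of every dependent target:

```swift
import CryptoKit
import Foundation

// Simplified illustration of target-level fingerprinting.
func targetFingerprint(inputFiles: [URL], dependencyFingerprints: [String]) throws -> String {
    var hasher = SHA256()
    // Any changed source file changes the digest for the whole target...
    for file in inputFiles.sorted(by: { $0.path < $1.path }) {
        hasher.update(data: try Data(contentsOf: file))
    }
    // ...and since every target mixes in its dependencies' fingerprints,
    // a change in Core propagates to all of its reverse dependencies.
    for fingerprint in dependencyFingerprints.sorted() {
        hasher.update(data: Data(fingerprint.utf8))
    }
    return hasher.finalize().map { String(format: "%02x", $0) }.joined()
}
```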

It is not clear to us why fully compiling mixed modules with RC takes double or triple the time taken to cleanly build the same module without RC.

The numbers provided by you, @ahmednafei and @cezarsignori, are suspicious. Compiling Core locally shouldn't add any delay (like the mentioned 5s), nor have significantly worse performance than a clean Core build. I write "significantly" because there is a shared module cache, DerivedData/ModuleCache, that locally has to be rebuilt once. To compare the time needed to regenerate that cache, you could delete that directory before the non-RC incremental build and observe how it affects the Core target's build time. Maybe the fix mentioned above could help with the freeze issue, but I am still not sure why the Swift files compile 2-3 times slower. Is it possible that regenerating ModuleCache is that expensive?

From the attached image, the clean build has nice parallelization, but you mentioned that an incremental build doesn't behave like that even without RC, so it is in the nature of your project. I would investigate why that is happening; I recall a similar problem in my project, which I fixed by narrowing the shell-script build phases' output files.

cezarsignori commented 2 years ago

FYI: I have a rough proof-of-concept branch but still have to test it (probably early next week): polac24@96ce18b

Thanks @polac24. I imagined you did :P

XCRemoteCache doesn't work at the file level, only per target. That was a design decision, and the rationale is described in the blog post.

Do you mean this part?

"The Xcode build system knows which files are the actual input files for the compilation, but that list is generated as compiler’s output (.d files) and is not available ahead of a compilation."

If yes: since the list is available in the meta file after the producer build, couldn't there be a file-level SHA that is checked once the target-level SHA check fails? RC could then move only the files with matching SHAs to the expected location, causing only the remaining files to compile (a rough sketch of the idea follows below).

Assuming that works, did you test it, or do you have concerns about payload size?

I am looking to understand the problem better and would appreciate it if you could tell me more.
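
A rough sketch of that idea (purely illustrative: the types, the meta-file layout, and the helper names are hypothetical, and this is not something XCRemoteCache implements today):

```swift
import Foundation

// Hypothetical per-file record as it might appear in the producer's meta file.
struct MetaFileEntry {
    let sourcePath: String   // source file compiled by the producer
    let sourceSHA: String    // content hash of that source at producer time
    let objectPath: String   // cached .o produced from it
}

// On a target-level fingerprint miss, reuse objects whose sources are
// unchanged and return only the files that still need local compilation.
func partialRestore(meta: [MetaFileEntry],
                    localSHA: (String) -> String,
                    restoreObject: (MetaFileEntry) -> Void) -> [String] {
    var needsCompilation: [String] = []
    for entry in meta {
        if localSHA(entry.sourcePath) == entry.sourceSHA {
            restoreObject(entry)               // move the cached .o into place
        } else {
            needsCompilation.append(entry.sourcePath)
        }
    }
    return needsCompilation
}
```

In practice, per-file source hashes alone would not guarantee correctness (headers, compiler flags, and cross-file Swift dependencies also affect the outputs), which hints at the complexity behind the answer below.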

The numbers provided by you, @ahmednafei and @cezarsignori, are suspicious. Compiling Core locally shouldn't add any delay (like the mentioned 5s), nor have significantly worse performance than a clean Core build.

This issue seems to be reproducible strictly with mixed modules. We disabled the RC integration for those and noticed more reasonable incremental build times across different scenarios (15s, 19s, 65s, 100s). As expected, clean build times increased (from 100s-120s to 338s-435s with the localhost demo server). More testing needs to be done to confirm.

Is it possible that regenerating ModuleCache is that expensive?

No. I clean-built without RC, then deleted ~/Library/Developer/Xcode/DerivedData/ModuleCache.noindex, made the same change to the same "Core" module, and Xcode incrementally built the app in 38s (vs 1664s with RC).

I also did the test where I clean-built without RC, deleted DerivedData/ModuleCache.noindex, and checked out another commit (the same commits as in the previous tests), and Xcode incrementally built the app in 179s (vs 1000s with RC).

From the attached image, the clean build has nice parallelization, but you mentioned that an incremental build doesn't behave like that even without RC, so it is in the nature of your project. I would investigate why that is happening; I recall a similar problem in my project, which I fixed by narrowing the shell-script build phases' output files.

TBH, it has been a while since I last worked on another Xcode project full of source targets. Do you mean that in other projects this Xcode behavior does not show up in incremental builds? I will clone some random big project to compare.

What do you mean by "narrowing the shell-script build phases' output files"?

polac24 commented 2 years ago

Do you mean this part? If yes: since the list is available in the meta file after the producer build, couldn't there be a file-level SHA that is checked once the target-level SHA check fails? RC could then move only the files with matching SHAs to the expected location, causing only the remaining files to compile?

I meant: "Caching units that are too granular, where every single piece of the compilation step is cacheable, may lead to extensive network traffic overhead, which can offset CPU savings." The suggested approach (file-level caching) is technically possible but introduces huge complexity, so the design decision was to go with target-level caching only.

What do you mean by "narrowing the shell-script build phases' output files"?

Note: that was just a hint that worked for me, and it was not related to XCRemoteCache at all. I meant making sure that no unrelated files are defined here:

[Screenshot: the Output Files list of a Run Script build phase in Xcode]

I think that #135 might fix the 5s delay you observe. With this fix, the expectation is that locally modifying Core should lead to a first build of roughly 0.8x a clean build (~500s).

polac24 commented 2 years ago

^ GitHub auto-closed this issue. Reopening.

@cezarsignori, please close it if #135 actually fixes the "delay".

cezarsignori commented 2 years ago

Hi @polac24 !

Thanks for the changes. Yes, I can confirm the delay issue is fixed.

Now incremental builds and clean builds with RC are comparable (when there aren't too many cache misses). For example, in the test case where the cache is generated for commit 0886f8e6 and the build is run for 4b186ef2, clean builds (with cache misses) and incremental builds (clean-build 0886f8e6, then incrementally build 4b186ef2) take the same amount of time: ~320s. (Xcode without RC builds this use case in half that time when incremental, and in almost twice that time when building cleanly.)

I also verified that making a change to a module triggers a clean build of that module and all of its reverse dependencies (as expected), but only the first time; subsequent incremental builds take the same time Xcode would take without RC. For example, I changed the "Core" module (200+ reverse dependencies) after cleanly building 0886f8e6 with a 100% cache-hit rate, and it took 640s (compared to the previous 1664s with RC and 78s without RC); I then made a different change to the same module and verified a 16s build. Making changes to other modules causes those modules and their reverse dependencies to build cleanly, except for the ones already incrementally built due to the earlier changes. It looks like RC lets Xcode apply its optimizations to code paths that have already been built incrementally.

And lastly, I found that incrementally building our app when a module with 70 reverse dependencies is modified makes the incremental build time comparable to a non-RC clean build.

As it stands now, it is unclear whether our app build times would improve in general with RC. Typically, developers pull from master or rebase, build without deleting DerivedData, and then re-build the app with their changes in (multiple times). It is very likely they will experience the incremental-build scenario once or twice before Xcode's incremental build times kick in. Depending on where the cache miss is (which module CI has yet to rebuild, or which module the developer changed locally), RC might not make a difference, or it might even be better for them to disable RC. On the other hand, once RC is out, developers are likely to have the code paths they usually work on already built by Xcode, while taking advantage of RC for modules changed by other developers. We will need to run a test with the iOS community to find out.

Did/does Spotify face scenarios like that? Do you have recommendations for handling such scenarios?

polac24 commented 2 years ago

That sounds like expected behavior: modifying a "God" target almost degenerates into a clean build, but that happens really seldom.

At Spotify, we modularized our codebase into targets with build time in mind, which was beneficial even for the pure Xcode build because we don't invalidate previously built targets (back then, only the ones compiled locally).

The -70% build times reported in the blog post were taken from a real-world experiment: developers were working with the codebase normally, with local changes and without always having ready artifacts from CI. For a 100% cache hit rate we achieved more than -80%, so the difference was not huge.

cezarsignori commented 2 years ago

Thanks for the answer! We are also modularizing our codebase into targets with build time in mind (e.g. developer apps, interface modules, etc.), but that is a separate effort (from the build-system improvements) that takes a long time. Over time, RC should give increasing gains, but I am hoping to get some already.

I was thinking about the fact that the first incremental build is slower than the second. I think it is because RC disables itself after a cache miss, allowing Xcode to reuse already-built file artifacts. Is that correct? I ask because, since RC works by moving pre-built artifacts into the expected location when there is a cache hit, the first incremental build after a clean build should also be able to leverage that Xcode optimization. For that to happen, RC would have to disable itself on cache hits too.

My current hypothesis is that developers will only need to download remote artifacts when there are new commits, and in our case that means they re-generate the project via Buck (which is also when we call xcprepare).

What if we added an option to RC (either via an ENV variable or .rcinfo) that tells it to self-disable on cache hits? Then we could re-enable RC (e.g. delete the "disable file") whenever xcprepare is called (and perhaps cleverly, only when the arc.rc file changes).

I believe that would mean that:

  1. Developers who clean-build master get optimal cache hits (current behavior);
  2. Developers who rebase their feature branch onto master get the current behavior (a clean build of the changed target and its rdeps);
  3. Developers who, after building, make a change and re-build leverage Xcode's incremental-build optimizations (much faster builds).

Thoughts?

polac24 commented 2 years ago

I think it is because RC disables itself after a cache miss, allowing Xcode to reuse already-built file artifacts. Is that correct?

Yes, every time XCRemoteCache notices a fingerprint mismatch, it switches itself off until the next merge/rebase (to be precise, until arc.rc changes).
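
A minimal sketch of that switch-off behavior as I understand it from this thread (the names and the marker mechanism are illustrative, not XCRemoteCache's actual API):

```swift
import Foundation

// Illustrative per-target gate: disabled after a fingerprint mismatch,
// re-enabled once arc.rc changes (e.g. after a merge/rebase).
struct RemoteCacheGate {
    let markerURL: URL             // e.g. a "disabled" marker in TARGET_TEMP_DIR
    let currentArcRcDigest: String // digest of the current arc.rc content

    // RC is off only if a miss was recorded for the current arc.rc state.
    func isEnabled() -> Bool {
        guard let recorded = try? String(contentsOf: markerURL, encoding: .utf8) else {
            return true                        // no recorded miss: RC stays on
        }
        return recorded != currentArcRcDigest  // arc.rc changed: re-enable
    }

    // On a fingerprint mismatch, remember the arc.rc state we missed under.
    func recordMiss() throws {
        try currentArcRcDigest.write(to: markerURL, atomically: true, encoding: .utf8)
    }
}
```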

it means the first incremental build after a clean build should also be able to leverage that Xcode optimization.

The Xcode optimization (omitting steps) is based on the build products of the very last local incremental build, so the very first incremental build will take the full local compilation time.

Developers who rebase their feature branch onto master get the current behavior (a clean build of the changed target and its rdeps);

That is actually what we have right now.

Developers who, after building, make a change and re-build leverage Xcode's incremental-build optimizations (much faster builds);

It will not necessarily leverage an incremental build. Xcode will have to build from the previous DerivedData state: either a clean build (when a target has never been compiled locally) or an incremental build from the most recent local build.

In essence: there would be no difference if we prematurely disabled RC after a first build. In fact, we already do that per target, and the only cost is the fingerprint calculation (negligible).

ahmednafei commented 2 years ago

Hi Bartosz,

Yesterday @cezarsignori validated that the delay issue is fixed and that the first-incremental-build time noticeably improved with Xcode 13.2.1, by cherry-picking the required commits (below) against RC 11 and using the test-case scenarios described in the previous post.

Cherry-picked commits:

c573ced4f4e3b0a5359cec1436029778c0d16d5a
96ce18bc3127f4dfc1d5c3403230cabd38d215f8
ba58c1c21e2c4b10b64987d84bd938c1bb073521
59c1d999b1f9d74d1abebc26a06cd1791ec360c4

However, today, when building the app with the latest RC master and using Xcode 13.3.1, we verified that the first incremental build with RC takes twice as long as it does with Xcode 13.2.1. The "delay" issue, though, is still NOT there.

So:

Subsequent incremental builds take the same time Xcode would take without RC. For example, I changed the "Core" module (200+ reverse dependencies) after cleanly building 0886f8e6 with a 100% cache-hit rate, and it took 640s (compared to the previous 1664s with RC and 78s without RC)

Now this scenario (the first incremental build after a change to the Core module) takes 1500s with Xcode 13.3.1 vs 640s with Xcode 13.2.1.

Worth mentioning: our app build time without RC did not change between Xcode 13.2.1 and 13.3.1, and averages ~560s.

Also, perhaps worth mentioning as a side note: we verified that RC does not work with the new build-system mode introduced by Xcode 13.3, which can be opted into with defaults write com.apple.dt.XCBuild EnableSwiftBuildSystemIntegration 1.

polac24 commented 2 years ago

EnableSwiftBuildSystemIntegration is indeed not supported - that was already verified in #47.

As you are experiencing issues with Xcode 13.3.1, that seems like a separate issue: can you close this one and create a new issue with the context? We use 13.3 and haven't observed any performance degradation, so more details would be helpful to identify the cause.