bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0

Execution log + remote cache = very long build time #23319

Closed OparinE closed 3 weeks ago

OparinE commented 4 weeks ago

Description of the bug:

The official way to debug cache hits (cache-remote) is based on gathering execution logs. However, if we build with execution logs ON and a remote cache ON, build time increases dramatically (2x-5x).

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I have a target of 30 build actions (cpp code + packaging). Average build time (sec):

- no options: 28
- disk cache is ON: 13
- execution json log is ON: 28
- execution json log is ON, disk cache is ON: 59

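The build with only the disk cache ON (13 sec) mostly shows repository fetching time. Bearing this in mind and subtracting it from both figures, (execution json log is ON + disk cache is ON) / (execution json log is ON) = (59 - 13) / (28 - 13) = 46 / 15 ≈ 3 times.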
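
The arithmetic above can be reproduced with a short sketch. It assumes, as one reading of the report, that the 13 sec disk-cache-only build is fixed repository-fetching overhead that should be subtracted from both measurements. The `slowdown` helper and the dictionary keys are illustrative names, not anything from Bazel.

```python
# Average build times in seconds, copied from the report above.
timings = {
    "no options": 28,
    "disk cache": 13,
    "json log": 28,
    "json log + disk cache": 59,
}

# Assumption: the disk-cache-only build (13 s) is treated as fixed
# repository-fetching overhead shared by both measurements.
fetch_overhead = timings["disk cache"]


def slowdown(slow: float, fast: float, overhead: float = 0.0) -> float:
    """Ratio of two build times after removing a shared fixed overhead."""
    return (slow - overhead) / (fast - overhead)


raw = slowdown(timings["json log + disk cache"], timings["json log"])
adjusted = slowdown(
    timings["json log + disk cache"], timings["json log"], fetch_overhead
)

print(f"raw slowdown:      {raw:.2f}x")       # 59 / 28
print(f"adjusted slowdown: {adjusted:.2f}x")  # 46 / 15
```

Without the subtraction the ratio is about 2.1x; with it, about 3.1x, which matches the "3 times" figure quoted above.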

Which operating system are you running Bazel on?

Windows

What is the output of bazel info release?

7.0.2

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

tjgq commented 4 weeks ago

Consider using --experimental_execution_log_compact_file instead (see https://github.com/bazelbuild/bazel/issues/18643#issuecomment-1945078491). The --execution_log_json_file and --execution_log_binary_file formats are fundamentally inefficient; it's unlikely that they can be improved.

OparinE commented 4 weeks ago

@tjgq, I tried --experimental_execution_log_compact_file on Bazel 7.2.0. The general behavior is the same. Also, in my opinion, --execution_log_json_file works faster than --experimental_execution_log_compact_file, because there is no zipping operation.

tjgq commented 4 weeks ago

Could you share two trace profiles (--profile) for your build, with and without the compact execution log enabled, so I could better understand where the time is going? Ideally for a build where actions are all hitting the disk cache, which is the worst case for the execution log (very little work happening elsewhere).

Also, how big is the execution log (in compact format)?
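
A minimal sketch of how the two requested builds could be set up, assuming a hypothetical target `//my:target`, hypothetical output paths, and a local `--disk_cache` directory. The flags `--profile`, `--disk_cache` and `--experimental_execution_log_compact_file` are the Bazel options discussed in this thread; the helper `bazel_build_cmd` is illustrative only.

```python
from typing import List, Optional


def bazel_build_cmd(
    target: str,
    profile_path: str,
    disk_cache: Optional[str] = None,
    compact_log: Optional[str] = None,
) -> List[str]:
    """Build the argv for one profiled `bazel build` invocation."""
    cmd = ["bazel", "build", target, f"--profile={profile_path}"]
    if disk_cache:
        cmd.append(f"--disk_cache={disk_cache}")
    if compact_log:
        cmd.append(f"--experimental_execution_log_compact_file={compact_log}")
    return cmd


# Hypothetical target and paths; adjust them to the real workspace.
TARGET = "//my:target"
CACHE_DIR = "bazel-disk-cache"

variants = {
    "cache_only": bazel_build_cmd(
        TARGET, "cache_only.profile.gz", disk_cache=CACHE_DIR
    ),
    "cache_and_compact_log": bazel_build_cmd(
        TARGET,
        "cache_and_compact_log.profile.gz",
        disk_cache=CACHE_DIR,
        compact_log="exec_log.compact",
    ),
}

for name, cmd in variants.items():
    print(f"{name}: {' '.join(cmd)}")
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. To get the all-cache-hit case, the disk cache would need to be warmed by one earlier build, with `bazel clean` run before each measured build.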

OparinE commented 3 weeks ago

profile_with_cache_with_exe_logs.tar.gz profile_without_cache_with_exe_logs.tar.gz profile_without_exe_logs_with_cache.tar.gz

@tjgq , profiles are added.
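
One rough way to compare profiles like these is to total the recorded event durations per category. The sketch below assumes each profile has been extracted from its archive and is a JSON trace in the Chrome trace event format (a top-level `traceEvents` list whose complete events have `"ph": "X"` and a `dur` in microseconds). The category names in the sample data are made up for illustration.

```python
import gzip
import json
from collections import defaultdict
from typing import Dict


def load_profile(path: str) -> dict:
    """Load a JSON trace profile, transparently handling .gz files."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def time_by_category(profile: dict) -> Dict[str, float]:
    """Sum the durations (in seconds) of complete events per category.

    Nested or parallel events are counted independently, so the totals
    are a rough guide to where time goes, not wall-clock time.
    """
    totals: Dict[str, float] = defaultdict(float)
    for event in profile.get("traceEvents", []):
        if event.get("ph") == "X":  # complete events carry a duration
            totals[event.get("cat", "unknown")] += event.get("dur", 0) / 1e6
    return dict(totals)


# Synthetic sample with hypothetical category names, for illustration only.
sample = {
    "traceEvents": [
        {"ph": "X", "cat": "cache check", "dur": 2_000_000},
        {"ph": "X", "cat": "action processing", "dur": 500_000},
        {"ph": "M", "name": "thread_name"},  # metadata event, ignored
        {"ph": "X", "cat": "action processing", "dur": 1_500_000},
    ]
}

for category, seconds in sorted(time_by_category(sample).items()):
    print(f"{category}: {seconds:.1f} s")
```

For a real comparison, `time_by_category(load_profile(path))` would be run on each extracted profile and the per-category totals compared side by side.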

tjgq commented 3 weeks ago

It looks like these are for the json format. Can you also capture a profile for the compact format under the same conditions?

> in my opinion, --execution_log_json_file works faster than --experimental_execution_log_compact_file, because there is no zipping operation.

This shouldn't be the case for a typical build. The zstd compression is not the only difference between the two formats; the compact format avoids redundant work in other ways. (It's possible that your build is atypical, which is why I'd like to see both profiles).

OparinE commented 3 weeks ago

@tjgq, performance with --experimental_execution_log_compact_file= looks much better. There is no big build time difference between (exe_logs is ON + cache is ON) and (exe_logs is ON), as there was with json execution logs. It seems that supporting the json and pure binary logs is not a critical task from Bazel's perspective, am I right? If so, please close the issue. I'll switch to the next Bazel version.

tjgq commented 3 weeks ago

Thanks for checking that the compact format works for you!

I will close this issue, as I don't think it's possible to improve performance for the json/binary formats. The way forward is to point people to the compact format and deprecate the json/binary formats.