Open c0d3h4x0r opened 2 years ago
@janakdr since you reviewed the change that introduced the flag: can you comment on potential correctness concerns of the flag?
I think the correctness is pretty good, since it uses inodes. However, does Windows have inodes? Also, I'm confused by this report, since the flag is not intended to help the case of a "no-op incremental build": in that case, Bazel should have all metadata in memory. The lower latency is supposed to happen after a flag flip, when we discard our in-memory actions, and it's supposed to apply to the output tree: source nodes don't get cleared after a flag flip. So I think a more detailed description of the Bazel commands run would be helpful (not targets, just flags).
Bazel should have all metadata in memory
That's what I thought at first as well but then remembered that windows doesn't have watchfs (see https://github.com/bazelbuild/bazel/issues/1931) and assumed that this is the reason why we don't trust the metadata. Is that assumption correct?
Ah, wasn't thinking. Yes, that could be it. But I think Windows doesn't have a node id. So we're down to mtime and size, which have been historically buggy. For instance, you have two files, header1.h and header2.h. They have the same size and mtime, but different contents. Now you copy header2.h onto header1.h. The node id would change, if we had it, but we don't.
Just learned here about --watchfs
. It's disabled by default. I think there is also a problem on Mac. I see lots of scanning just for simple "no-op" commands. It doesn't look like anything is cached by default.
This might be what I am seeing as well: https://groups.google.com/g/bazel-discuss/c/E_0F1t4FITE
@guw fyi:
The Windows filesystem watcher is still experimental as of Bazel 5.1.1 (the version we're now on). We encountered other problems when we tried to enable it, so it is not a viable substitute for using this undocumented performance option.
So what's the right path forward here? Right now, our business is taking a necessary dependency on this undocumented option, which isn't a position we want to remain in.
I think it would be helpful to know the exact sequence of Bazel commands (feel free to leave out targets) and also what modifications you did in between. And perhaps the Bazel output on the second command.
In the original repro report, it was just "bazel build ..." followed by another "bazel build ...", with no changes of any kind in between.
But the way we kick off our builds has changed, so the command is now different -- plus we've been using this undocumented flag for a long while now. I'd have to run a clean build from scratch (takes over 40 minutes the first time) using our latest bits, with the flag explicitly unset, to get a fresh repro log for you. Let me know if you still need that, and I'll suffer through it.
Sent via Outlook for iOShttps://aka.ms/o0ukef
From: Tobias Werth @.> Sent: Monday, April 25, 2022 12:48:09 AM To: bazelbuild/bazel @.> Cc: Keith F. Kelly @.>; Author @.> Subject: Re: [bazelbuild/bazel] cache_computed_file_digests is a useful performance option that should be documented (Issue #14401)
I think it would be helpful to know the exact sequence of Bazel commands (feel free to leave out targets) and also what modifications you did in between. And perhaps the Bazel output on the second command.
— Reply to this email directly, view it on GitHubhttps://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fbazelbuild%2Fbazel%2Fissues%2F14401%23issuecomment-1108199735&data=05%7C01%7C%7C8874c09fe67c493c852e08da268ff3db%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637864696962732020%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=utjJy2izmKUX6Ni%2FxbjT1AThhkAJ3xzqZMQP26zNz0E%3D&reserved=0, or unsubscribehttps://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAF2EYJHNHZ7OVGO2KO2VAL3VGZE3TANCNFSM5JXU6FDQ&data=05%7C01%7C%7C8874c09fe67c493c852e08da268ff3db%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637864696962732020%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=A0EaqDcyfUwuxkK59s2aFHNo3qoKFXbDJN1xzDKkmJg%3D&reserved=0. You are receiving this because you authored the thread.Message ID: @.***>
Before this gets flagged as stale, this is still a valid issue. This flag should be documented, and potentially the default increased as well.
cc @coeuvre
Doing some back-of-the-napkin math, a digest cache with 50k entries (current default) should consume around 11MB, assuming compressed oops, average pathname length of 100 bytes, and hash length of 32 bytes. So increasing the default is unlikely to cause problems. @meisterT, WDYT about making it, say, 100k?
It would also be nice to provide some sort of indication that the cache size is too small to be effective (e.g., if can't fit the inputs+outputs of the largest action in a build, or perhaps even an entire build's worth of them).
Description of the problem / feature request:
cache_computed_file_digests
is a useful performance option that should be documented.Feature requests: what underlying problem are you trying to solve with this feature?
This option is useful for improving incremental build speed on large codebases.
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Immediately following a full build, do a no-op incremental build on our codebase containing ~400,000 input files. It takes about 15 minutes.
Immediately following a full build, do a no-op incremental build with
--cache_computed_file_digests=400000
. It takes about 15 seconds.What operating system are you running Bazel on?
Windows 10
What's the output of
bazel info release
?If
bazel info release
returns "development version" or "(@non-git)", tell us how you built Bazel.n/a
What's the output of
git remote get-url origin ; git rev-parse master ; git rev-parse HEAD
?fatal: not a git repository (or any of the parent directories): .git
Our codebase is stored in a company-internal Perforce server, not in Git.
Have you found anything relevant by searching the web?
No.
Any other information, logs, or outputs that you want to share?
n/a