bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
23.3k stars 4.09k forks source link

Case insensitive labels poisons cache #8009

Open ozio85 opened 5 years ago

ozio85 commented 5 years ago

Description of the problem / feature request:

When first building with a correct tag on Windows e.g. bazel build //MyLabel/Test:MyName

Then building with an incorrect tag (but will still work): bazel build //mylabel/Test:MyName

Will in some cases cause the cache to be poisoned (and will in all cases make c-file includes to generated files to be invalid)

Feature requests: what underlying problem are you trying to solve with this feature?

Require all platforms (not only linux) to have exact casing, otherwise throw an error.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

See problem description above

What operating system are you running Bazel on?

Windows

What's the output of bazel info release?

0.23.2

Have you found anything relevant by searching the web?

-GitHub issues:

6401

7627

Any other information, logs, or outputs that you want to share?

Nope :) Great project otherwise! :)

meisterT commented 5 years ago

cc @laszlocsomor since this seems to be a Windows issue.

laszlocsomor commented 5 years ago

This is not a Windows bug, it just shows up on Windows where the filesystem is case-preserving but case-insensitive.

The cause is complex: Label creation and Package loading use Path/PathFragment objects, I assume they check file existence (e.g. "does foo/BUILD exist? if so, //foo is a package") which ignores casing on Windows.

The naive fix is to make the WindowsFileSystem case-sensitive, but I think that's too aggressive. Bazel should accept incorrectly-cased paths in some cases (e.g. if they come from a flags such as --output_user_root), but reject them in others (if a Label is mis-cased).

I think the correct fix is to validate casing just in Label creation and Package lookup. Target names are already case-sensitive because Bazel matches them against strings read from a BUILD file.

laszlocsomor commented 5 years ago

More thinking later, I realize fixing this in the package loading logic would be impractical.

Package loading involves package lookup, which involves checking if a BUILD file exists under some path. Skyframe (Bazel's "engine") evaluates the PackageLookupFunction which requests a FileValue (= stat) for the BUILD file. Special-casing these (but not other) FileValue lookups to be case-sensitive could have consequences I cannot foresee; I'm afraid Skyframe could draw an inconsistent image of the filesystem and crash.

Instead, we should probably fix this in the filesystem layer, i.e. in WindowsFileSystem.java and JavaIoFileSystem.java. I have a prototype change that makes Bazel case-sensitive on Windows, but I want to test it more before considering merging it.

ozio85 commented 5 years ago

I just noticed that if you write no root (on Windows) e.g.: bazel build ...

If a label is mixed-case, the label-root gets wrongly converted to lower case, which causes all generated files to fail.

Also all custom python tools gets permanently corrupted since the runfiles directory will contain the lower case path (and no files can be found). To fix this, the only way is to bazel clean --expunge and restart the build from scratch. Bazel does not sense that the casing is wrong, and therefore the error persists.

laszlocsomor commented 5 years ago

I just noticed that if you write no root (on Windows) e.g.: bazel build ...

If a label is mixed-case, the label-root gets wrongly converted to lower case, which causes all generated files to fail.

Could you give an example of that? What exactly fails?

ozio85 commented 5 years ago

I have created a repro: repro.zip

So first compiling: bazel.exe build ///Mymixedcaseroot/...

Contents of manifest file:

__main__/Mymixedcaseroot/__init__.py 
__main__/Mymixedcaseroot/MyTools/__init__.py 
__main__/Mymixedcaseroot/MyTools/my_tool.exe D:/build/f34gr6sy/execroot/__main__/bazel-out/host/bin/Mymixedcaseroot/MyTools/my_tool.exe
__main__/Mymixedcaseroot/MyTools/my_tool.py D:/repro/Mymixedcaseroot/MyTools/my_tool.py
__main__/Mymixedcaseroot/MyTools/my_tool.zip D:/build/f34gr6sy/execroot/__main__/bazel-out/host/bin/Mymixedcaseroot/MyTools/my_tool.zip

bazel.exe build ///MyMixedCaseRoot/...

Contents of manifest file remains the same, (and fetching runfiles will fail).

Note that "bazel clean" is not enough to remove the faulty tool, you require "bazel clean --expunge"

blasnier commented 5 years ago

Hello,

We had a similar issue in May. In our case it also impacted the content of generated code with protoc. In the deps of one of our target, we referred to a C++ gRPC library as //Foo/bar/v0:bar_cpp_grpc instead of //Foo/Bar/v0:bar_cpp_grpc.

When building the target, the file bar.proto was moved to a folder /Foo/bar/v0. This path was then used by protoc to generate code. For example, the C++ header bar.pb.h contained:

#define PROTOBUF_INTERNAL_EXPORT_Foo_2fbar_2fv0_2fbar_2eproto
//                                     ^-- The case impacted the #define here.

Same issue with class/struct naming. In our case bar.proto imports other ProtoBuf files and since the names/defines did not match, the build failed. It took some time to find the cause of the failure. Once the reference in deps was fixed, we had to run bazel clean --expunge and then the build passed.

laszlocsomor commented 5 years ago

@ozio85 : I see, thanks! So the runfiles manifest is wrong.

@baslas : Thanks for describing your use case. That looks similar to @ozio85's repro -- a generated file is affected by the wrong casing.

meisterT commented 4 years ago

Is this still an issue?

ozio85 commented 4 years ago

It seems that "experimental_check_label_casing" is not merged, so i guess it is.

We have implemented a check on windows to avoid this problem. The check was implemented quite recently. The problem is also affecting generated files, which gets uploaded to the remote cache with the wrong casing, which poisons the cache.

laszlocsomor commented 4 years ago

Yes it is still an issue. https://github.com/bazelbuild/bazel/pull/10345 fixes but I never managed to merge it.

sgowroji commented 1 year ago

Hi there! We're doing a clean up of old issues and will be closing this one. Please reopen if you’d like to discuss anything further. We’ll respond as soon as we have the bandwidth/resources to do so.

crt-31 commented 9 months ago

This is still an issue. Please add this flag.

Integrating this flag will be a good workaround for Bazel's current bugs regarding label casing. (add into Bazel 6 as well please).

fmeum commented 9 months ago

@sgowroji Could you reopen? I lack the permission.

carpenterjc commented 2 months ago

This is still an issue, we have work around, but still problems are getting through. We set our file system to case-senstive on windows, but still see issues. When we investigate we see people mix up the casing in labels.

We sometimes see the message:

Directory 'C:/workspace/foo' could not be processed: Inconsistent filesystem operations. File [C:/workspace]/[foo/BUILD] found in directory, but stat failed The results of the build are not guaranteed to be correct. You should probably run 'bazel clean' and investigate the filesystem inconsistency (likely due to filesystem updates concurrent with the build)