google / bazel-common

Common functionality for Google's open-source libraries that are built with bazel.
Apache License 2.0

jarjar_runner.sh is unreliable on case-insensitive filesystems (Mac) #136

Open thirtyseven opened 3 years ago

thirtyseven commented 3 years ago

Issue

When jarjar_runner.sh processes a jar containing multiple files whose names differ only in case, it incorrectly flags them as duplicates when running on a case-insensitive filesystem such as macOS's default APFS root volume.
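To check whether a given jar is affected, a rough sketch like the following lists entries whose names collide only after case folding. It reads the archive with Python's stdlib zipfile module, so it never touches the host filesystem. The function names `case_only_collisions` and `make_example_jar` are made up for illustration and are not part of bazel-common; the example archive just mirrors the test.zip from the reproduction.

```python
import io
import zipfile
from collections import defaultdict


def case_only_collisions(jar_file):
    """Return groups of distinct entry names that differ only by letter case."""
    groups = defaultdict(set)
    with zipfile.ZipFile(jar_file) as jar:
        for name in jar.namelist():
            # Entries that fold to the same key would land on the same path
            # when extracted onto a case-insensitive filesystem.
            groups[name.casefold()].add(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


def make_example_jar():
    """Build an in-memory archive with three empty files: test, TEST, build-data.properties."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for name in ("test", "TEST", "build-data.properties"):
            jar.writestr(name, b"")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(case_only_collisions(make_example_jar()))
```

An empty result means the jar has no case-only collisions, so extracting it on a case-insensitive volume should not hit this particular problem.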

BUILD.bazel:

jarjar_library(
    name = "test_shaded",
    rules = [],
    jars = ["test.jar"],
)

Steps to reproduce:

Download test.zip (just a zip with three empty files: test, TEST, and build-data.properties)

% mv test.zip test.jar
% bazel build :test_shaded
INFO: Analyzed target //:test_shaded (0 packages loaded, 0 targets configured).
INFO: Found 1 target...

ERROR: /Users/*snip*/BUILD.bazel:11:15: Action test_shaded.jar failed: (Exit 1): bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
Error: duplicate files in merged jar: test~
Target //:test_shaded failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.191s, Critical Path: 0.09s
INFO: 2 processes: 2 internal.
FAILED: Build did NOT complete successfully

Possible solutions

One solution I have tested is to modify jarjar_runner.sh to extract the jars into a temporary sparse disk image that is formatted with a case-sensitive filesystem and mounted at a tmp directory. However, I've found that this approach is "leaky": if you Ctrl-C the bazel build before it finishes, the volume stays mounted and the user has to unmount it manually. I tried to solve this by adding a trap command to jarjar_runner.sh, but it never seemed to get triggered, probably due to some subtlety of being called from process-wrapper.

Another solution would be to replace jarjar_runner.sh with a Python script that uses the stdlib zipfile library to interact with the jar files to avoid interacting with the native FS.
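A minimal sketch of what the zipfile-based merge could look like is below. This is only an assumption about the shape of such a script: `merge_jars` is a hypothetical name, it does not run jarjar or apply any rules, and it does not reproduce whatever duplicate exemptions jarjar_runner.sh currently has. It only shows that duplicate detection can be done on the entry names themselves, compared case-sensitively, without extracting anything to disk.

```python
import io
import zipfile


def merge_jars(inputs, output):
    """Copy every file entry from the input jars into a single output jar.

    Entry names are compared case-sensitively, so "test" and "TEST" are
    treated as two different files. Raises ValueError if the exact same
    name appears more than once. Returns the sorted list of merged names.
    """
    seen = set()
    duplicates = set()
    with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as out:
        for path in inputs:
            with zipfile.ZipFile(path) as jar:
                for info in jar.infolist():
                    if info.is_dir():
                        continue  # directory entries may repeat across jars
                    if info.filename in seen:
                        duplicates.add(info.filename)
                        continue
                    seen.add(info.filename)
                    # Writing by name keeps the sketch simple; timestamps and
                    # other per-entry metadata are not preserved.
                    out.writestr(info.filename, jar.read(info))
    if duplicates:
        raise ValueError(
            "duplicate files in merged jar: " + ", ".join(sorted(duplicates))
        )
    return sorted(seen)
```

Both `inputs` and `output` can be paths or file-like objects, since zipfile.ZipFile accepts either.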

ronshapiro commented 3 years ago

I'm not sure what makes the most sense here without adding in a lot of complexity.

Could we create a separate subdirectory for each jar so that there won't be any conflicts?

thirtyseven commented 3 years ago

> Could we create a separate subdirectory for each jar so that there won't be any conflicts?

I'm not sure that would work: the case I illustrated above only has one jar, and the conflicting files (test and TEST) are both inside it.