Closed by meteorcloudy 4 months ago
@bazel-io fork 7.0.0
I can confirm this still happens even if upgrading commons-compress to the latest version (1.25.0)
/cc @tjgq @Wyverald
The error is from https://github.com/search?q=repo%3Aapache%2Fcommons-compress+%22Truncated+TAR+archive%22&type=code, could there be an actual problem with the tar file?
I can confirm this is an issue, but having spent a fair chunk of time trying to understand the TAR format, I can only deduce that the issue stems from somewhere within the Apache Commons Compress library. In any case, this wouldn't be a 7.0.0 regression; I'm pretty sure that we never supported sparse TARs. So I'm inclined to treat this as a "soft blocker" -- that is, if all non-soft blockers are resolved, we should release 7.0.0 and look to maybe resolve this in a patch release.
> could there be an actual problem with the tar file?
GNU tar extracts the file just fine, so I'd say this is some feature disparity in the Java library.
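One quick way to triage an archive like this is to look for sparse markers before handing it to a Java-based extractor. The sketch below is only a heuristic and assumes the archive uses PAX-style GNU sparse headers, whose extended-header keys start with `GNU.sparse.`; the function name `has_pax_sparse_marker` is hypothetical, and old-style GNU sparse entries (typeflag `S`) would not be detected.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical helper: succeeds if the uncompressed tar file contains the
# string "GNU.sparse.", which PAX-style sparse entries carry in their
# extended headers. Heuristic only: ordinary file contents that happen to
# contain the same string will also match.
has_pax_sparse_marker() {
  local tar_file="$1"
  # LC_ALL=C avoids multibyte decoding problems on binary data;
  # -a treats the binary file as text, -q suppresses output.
  LC_ALL=C grep -a -q 'GNU\.sparse\.' "$tar_file"
}
```

For example, `has_pax_sparse_marker ruff-aarch64-apple-darwin.tar && echo "sparse markers found"` could be run after the `gunzip` step of a repro.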
@FrancoisPoinsot since the root cause lies in commons-compress, there is little we can do in Bazel without an upstream fix. I'll have to downgrade this to P2 and remove it as a release blocker for 7.0
@meteorcloudy is there an issue filed on commons-compress for this? Do you need community help to file that issue with a minimal repro? I'd really like to see the upstream maintainers' response to this.
As this was bumped from Bazel 7 I'm now going to be forced to add repository rules to call BSD tar to replace Bazel's extract logic, which will be some sad, long-lived tech debt :(
> is there an issue filed on commons-compress for this?
I tried, but didn't find any relevant issue.
> Do you need community help to file that issue with a minimal repro? I'd really like to see the upstream maintainers' response to this.
Yes, that would be very helpful! I'm currently stressed by some CI issues, unfortunately.
> @FrancoisPoinsot since the root cause lies in commons-compress, there is little we can do in Bazel without an upstream fix. I'll have to downgrade this to P2 and remove it as a release blocker for 7.0
As far as I know, the problem is not new to 7.0.0. I can confirm it was also present in 6.x.
My current workaround is to extract the file using the tar command and reference the extracted file using an http_file rule.
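A minimal sketch of the extraction half of this workaround, assuming a system tar (GNU tar or bsdtar) is on the PATH. The helper name `extract_member` and its arguments are hypothetical; it pulls one member out of the archive and prints its SHA-256, which is the value an http_file rule would need once the extracted file is hosted somewhere. Where the file gets hosted is not covered here.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical helper: extract a single member from an archive with the
# system tar and print the SHA-256 of the extracted file.
extract_member() {
  local archive="$1" member="$2" out_dir="$3"
  mkdir -p "$out_dir"
  # Both GNU tar and bsdtar detect gzip compression automatically on read.
  tar -xf "$archive" -C "$out_dir" "$member"
  # sha256sum is common on Linux; shasum ships with macOS.
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$out_dir/$member" | cut -d ' ' -f 1
  else
    shasum -a 256 "$out_dir/$member" | cut -d ' ' -f 1
  fi
}
```

Usage would look like `extract_member ruff-aarch64-apple-darwin.tar.gz ruff extracted`, which leaves the binary at `extracted/ruff` and prints its checksum.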
Repro is trivial:
#!/usr/bin/env bash
set -o errexit -o nounset
echo "Downloading commons-compress"
wget https://repo1.maven.org/maven2/org/apache/commons/commons-compress/1.25.0/commons-compress-1.25.0.jar
echo "Downloading sample sparse archive"
wget https://github.com/astral-sh/ruff/releases/download/v0.1.6/ruff-aarch64-apple-darwin.tar.gz
gunzip ruff-aarch64-apple-darwin.tar.gz
echo "Testing with system tar"
tar -tf ruff-aarch64-apple-darwin.tar
echo "Testing with commons-compress"
java -jar commons-compress-1.25.0.jar ruff-aarch64-apple-darwin.tar
->
Testing with system tar
ruff
Testing with commons-compress
Analysing ruff-aarch64-apple-darwin.tar
Created org.apache.commons.compress.archivers.tar.TarArchiveInputStream@17f052a3
ruff
Exception in thread "main" java.io.IOException: Truncated TAR archive
    at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.read(TarArchiveInputStream.java:694)
    at org.apache.commons.compress.utils.IOUtils.readFully(IOUtils.java:244)
    at org.apache.commons.compress.utils.IOUtils.skip(IOUtils.java:355)
    at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:451)
    at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextEntry(TarArchiveInputStream.java:426)
    at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextEntry(TarArchiveInputStream.java:50)
    at org.apache.commons.compress.archivers.Lister.listStream(Lister.java:79)
    at org.apache.commons.compress.archivers.Lister.main(Lister.java:133)
The hard part is getting into my Jira account on the Apache foundation to file it. @tjgq do you have an account there to file it at https://issues.apache.org/jira/projects/COMPRESS/issues/COMPRESS-598?filter=allopenissues ? You're probably the better reporter as you've been doing the coding.
https://issues.apache.org/jira/browse/COMPRESS-124 seems relevant
> https://issues.apache.org/jira/browse/COMPRESS-124 seems relevant
This seems to be about the originally missing support for sparse tarballs altogether. Our issue is more about the newly added support potentially having bugs.
I tried to sign up for a Jira account, which apparently requires human review and could take a few days. In the meantime, I sent an email to the mailing list (user@commons.apache.org); let's see if anyone picks it up.
As a workaround you can do:
http_file(
    name = "ruff_macos",
    sha256 = "263d8ec3fd317b47dfefeae84d96e1894f87526f788394df59a0c6b013dac5d7",
    url = "https://github.com/astral-sh/ruff/releases/download/v0.1.8/ruff-0.1.8-x86_64-apple-darwin.tar.gz",
)
and then:
genrule(
    name = "ruff_bin",
    srcs = ["@ruff_macos//file"],
    outs = ["ruff-bin"],
    cmd = "tar -xvf $< && mv ruff $@",
)
since macOS tar handles this fine
Thanks Keith, I should have commented here that I worked around it in rules_lint in that way: https://github.com/aspect-build/rules_lint/pull/66/files#diff-88872655967d360b7907682cbc2461f815c86c2940469330183be99e6f1b3ec2R129-R137
A fix for this issue has been included in Bazel 7.2.0 RC1. Please test out the release candidate and report any issues as soon as possible. If you're using Bazelisk, you can point to the latest RC by setting USE_BAZEL_VERSION=7.2.0rc1. Thanks!
Description of the bug:
Context: https://github.com/bazelbuild/bazel/issues/20090#issuecomment-1819279500
Which category does this issue belong to?
External Dependency
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Can be reproduced on macOS with the same repo as https://github.com/bazelbuild/bazel/issues/20090#issue-1982707352
Which operating system are you running Bazel on?
macOS
What is the output of bazel info release?
No response
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
No response
Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response