Homebrew / homebrew-cask

🍻 A CLI workflow for the administration of macOS applications distributed as binaries
https://brew.sh
BSD 2-Clause "Simplified" License

cloning is broken #150323

Closed ilia-shipitsin closed 1 year ago

ilia-shipitsin commented 1 year ago

Verification

Description of issue

~# git clone https://github.com/homebrew/homebrew-cask
Cloning into 'homebrew-cask'...
remote: Enumerating objects: 751118, done.
remote: Counting objects: 100% (39001/39001), done.
remote: fatal: object 8b16853610463e75881196394236c34e02d21c52 cannot be read
remote: aborting due to possible repository corruption on the remote side.
fatal: protocol error: bad pack header
~#

Command that failed

git clone https://github.com/homebrew/homebrew-cask

Output of command with --verbose --debug

no

Output of brew doctor and brew config

no doctor

Output of brew tap

no

grzegorzkrukowski commented 1 year ago

brew update is also not working as it relies on cloning

SMillerDev commented 1 year ago

Yes it is, this is a GitHub issue which Homebrew has no control over.

brew update is also not working as it relies on cloning

It has not relied on that for users since February when 4.0.0 came out.

grzegorzkrukowski commented 1 year ago

@SMillerDev interesting, it is constantly failing on our CI starting last week with exactly this error :/

xiongchiamiov commented 1 year ago

By "a Github issue", do you mean something broken with fetches? I ask because there are no known incidents and it's been a few days since the last one with git operations.

If that is the case, perhaps this issue should be about error handling in homebrew that notifies the user why there's a problem, where to look on Github for confirmation of it, and who to contact if the problem continues.

In case it is helpful, here is a record of the brew update output:

[$]> brew update
remote: fatal: object 6bd3d16e36afd7f114fa56a8fb7364c6142817be cannot be read
remote: aborting due to possible repository corruption on the remote side.
fatal: protocol error: bad pack header
Error: Fetching /opt/homebrew/Library/Taps/homebrew/homebrew-cask failed!
Updated 2 taps (homebrew/core and homebrew/cask).
==> New Formulae
bbot                             erlang@25                        trzsz-ssh
==> New Casks
whisky
==> Outdated Formulae
libpaper                                          pygments

You have 2 outdated formulae installed.
You can upgrade them with brew upgrade
or list them with brew outdated.

The object id appears to change every time.

a6patch commented 1 year ago

We're having a similar issue in CI/CD using CircleCI:

==> Tapping homebrew/cask
Cloning into '/usr/local/Homebrew/Library/Taps/homebrew/homebrew-cask'...
remote: Enumerating objects: 751213, done.
remote: Counting objects: 100% (751213/751213), done.
remote: fatal: object dff4f2098bf27c69022ff2408bbd3b520cb5d02c cannot be read
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: fetch-pack: invalid index-pack output
Error: Failure while executing; `git clone https://github.com/Homebrew/homebrew-cask /usr/local/Homebrew/Library/Taps/homebrew/homebrew-cask --origin=origin --template=` exited with 128.

Exited with code exit status 1

SMillerDev commented 1 year ago

By "a Github issue", do you mean something broken with fetches? I ask because there are no known incidents and it's been a few days since the last one with git operations.

It means our repo is hosted on github and we don't have control apart from pushing and pulling to it. If git says the repo is corrupt it's not something Homebrew can fix or affect in any way.

If that is the case, perhaps this issue should be about error handling in homebrew that notifies the user

That sounds like it's a feature request so it should be a pull request then.

why there's a problem,

We don't really know why our hosted repo is broken. Nor can we really get that information from git.

where to look on Github for confirmation of it

The issue tracker, but that is defined in https://docs.brew.sh/Troubleshooting already

and who to contact if the problem continues.

Nobody, we're all volunteers and nobody specifically can be contacted about Homebrew issues.

xiongchiamiov commented 1 year ago

(FWIW, this is now working for me.)

I'm not sure I quite communicated this well. Here's an example of what I might expect the software to do:

[$]> brew update
Error: Fetching /opt/homebrew/Library/Taps/homebrew/homebrew-cask failed!  It looks like something is broken with GitHub.
1. Check https://www.githubstatus.com/ to see if there's an ongoing incident.
2. If no incident, wait a few minutes and try again.
3. If the problem persists, contact GitHub support via https://support.github.com/.  Here's the relevant git output for them:

remote: fatal: object 6bd3d16e36afd7f114fa56a8fb7364c6142817be cannot be read
remote: aborting due to possible repository corruption on the remote side.
fatal: protocol error: bad pack header

That is, right now here seems to be the place to post about the problem, yet the response is "it's a github problem and we can't do anything about it". So if we can preemptively tell users that, we wouldn't need to have Homebrew issues created every time Github has a service outage.
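A minimal shell sketch of the check proposed above. This is not Homebrew's actual implementation: the function name `explain_git_failure` is hypothetical, and it assumes the remote-side failure can be recognized by the "possible repository corruption on the remote side" line that appears in the logs in this thread.

```shell
#!/bin/sh
# Hypothetical helper, not part of Homebrew: reads captured git output on
# stdin and prints guidance when it looks like a remote-side failure.
explain_git_failure() {
  output=$(cat)
  case "$output" in
    *"possible repository corruption on the remote side"*)
      echo "It looks like something is broken with GitHub."
      echo "1. Check https://www.githubstatus.com/ to see if there's an ongoing incident."
      echo "2. If no incident, wait a few minutes and try again."
      echo "3. If the problem persists, contact GitHub support via https://support.github.com/."
      echo "Relevant git output:"
      printf '%s\n' "$output"
      return 0
      ;;
    *)
      # Not recognized as a remote-side failure; print nothing.
      return 1
      ;;
  esac
}

# Example (commented out so that sourcing this file has no side effects):
# if ! out=$(git fetch origin 2>&1); then
#   printf '%s\n' "$out" | explain_git_failure || printf '%s\n' "$out" >&2
# fi
```

Matching on the message text is brittle, since git may word it differently in other versions, so a real implementation would probably want a fallback that prints the raw output, as the commented example does.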

sjorge commented 1 year ago

git fsck
git repack -adf --window=200 --depth=200
git pull

Running this on my local copy seems to fix it for me?

basilisk487 commented 1 year ago

Where it breaks seems to be non-deterministic, and occasionally it just works:

% git clone https://github.com/Homebrew/homebrew-cask
Cloning into 'homebrew-cask'...
remote: Enumerating objects: 751234, done.
remote: Counting objects: 100% (751234/751234), done.
remote: fatal: object dff4f2098bf27c69022ff2408bbd3b520cb5d02c cannot be read
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: fetch-pack: invalid index-pack output

% git clone https://github.com/Homebrew/homebrew-cask
Cloning into 'homebrew-cask'...
remote: Enumerating objects: 751234, done.
remote: Counting objects: 100% (38147/38147), done.
remote: Compressing objects: 100% (295/295), done.
remote: Total 751234 (delta 37904), reused 37983 (delta 37852), pack-reused 713087
Receiving objects: 100% (751234/751234), 337.32 MiB | 5.46 MiB/s, done.
Resolving deltas: 100% (536714/536714), done.

% rm -rf homebrew-cask
% git clone https://github.com/Homebrew/homebrew-cask
Cloning into 'homebrew-cask'...
remote: Enumerating objects: 751234, done.
remote: Counting objects: 100% (39117/39117), done.
remote: fatal: object 8b16853610463e75881196394236c34e02d21c52 cannot be read
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: fetch-pack: invalid index-pack output

% git clone https://github.com/Homebrew/homebrew-cask
Cloning into 'homebrew-cask'...
remote: Enumerating objects: 751234, done.
remote: Counting objects: 100% (46448/46448), done.
remote: fatal: object e4f47fc0e72077fbd017e1d5193e97740017fbc4 cannot be read
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: fetch-pack: invalid index-pack output

% git clone https://github.com/Homebrew/homebrew-cask
Cloning into 'homebrew-cask'...
remote: Enumerating objects: 96050, done.
remote: Counting objects: 100% (17822/17822), done.
remote: Compressing objects: 100% (9433/9433), done.
remote: Total 96050 (delta 11926), reused 8551 (delta 8389), pack-reused 78228
Receiving objects: 100% (96050/96050), 24.48 MiB | 8.05 MiB/s, done.
Resolving deltas: 100% (47935/47935), done.
fatal: did not receive expected object 0700f4889a0b855d63143cb7c1fdf4aeb944636c
fatal: fetch-pack: invalid index-pack output

Definitely a github and not a Homebrew issue

Groxx commented 1 year ago

git fsck
git pull # or brew update

worked for me too... but tbh I don't know why 🤔. And cloning from scratch still fails. (update: a coworker fscked and did not get unblocked, so this may have just been luck)

I'm under the vague impression that git fsck runs entirely locally (no checks of objects with your remotes), but if that's correct, how did a bunch of people's local git repos get corrupted simultaneously? Or is this a symptom of a strange change to the remote repo (removed tag/branch? something weirder? I have no idea how to cause this)?

I cloned and got the same error on a case-sensitive filesystem too (a Debian machine, running git 2.39.1), so I don't think it's a case collision of some kind. Hopefully it's just github and not the repository somehow.

a6patch commented 1 year ago

I don’t even have a local repo. Just doing a git clone gives the error before any local repo gets created.

On Mon, Jul 3, 2023 at 6:07 PM Steven L @.***> wrote:

git fsck -> git pull (or brew update) worked for me too... but tbh I don't know why 🤔

I'm under the vague impression that git fsck runs entirely locally (no checks of objects with your remotes), but if that's correct, how did a bunch of people's local git repos get corrupted simultaneously? Or is this a symptom of a strange change to the remote repo (removed tag/branch? something weirder? I have no idea how to cause this)?


basilisk487 commented 1 year ago

@Groxx I don't think git fsck is relevant at all here. If widespread corruption of local copies was actually the issue, cloning from scratch would always resolve the issue and never fail.
These cloning errors seem to come and go in waves; occasionally there is a brief time window when all the clone/pull operations succeed. My guess is that this is when traffic gets routed to a non-corrupted replica on GitHub's side. I don't think git fsck does anything besides adding a delay, which increases the chances of the next git operation landing on a different replica.
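If the guess above is right and a later attempt may land on a healthy replica, a plain retry with a delay would have the same effect as the git fsck pause. A minimal sketch, assuming nothing about GitHub's routing; the `retry` function and its arguments are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between failed attempts. Returns 0 on the first success, 1 if all fail.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $n of $attempts failed: $*" >&2
    # No need to wait after the final attempt.
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    n=$((n + 1))
  done
  return 1
}

# Example (commented out so that sourcing this file has no side effects).
# The rm -rf is a precaution in case an earlier attempt left the directory:
# retry 5 30 sh -c 'rm -rf homebrew-cask && git clone https://github.com/Homebrew/homebrew-cask'
```

This only works around the symptom; the attempt count and delay are arbitrary and would need tuning for a real CI job.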

Bo98 commented 1 year ago

We're talking to GitHub about this issue and progress is being made.

Just to clarify things here, given some things have changed since Homebrew 4.0.0:

basilisk487 commented 1 year ago

@Bo98 : thank you! I temporarily unblocked our release pipeline by pinning homebrew/actions version Homebrew/actions/setup-homebrew@6fef698e5a2d6da69b9b7c76ad7e9a268ae59192, but I definitely wouldn't recommend that approach to anyone unless they know what they are doing

Bo98 commented 1 year ago

@Bo98 : thank you! I temporarily unblocked our release pipeline by pinning homebrew/actions version Homebrew/actions/setup-homebrew@6fef698e5a2d6da69b9b7c76ad7e9a268ae59192, but I definitely wouldn't recommend that approach to anyone unless they know what they are doing

Would something like https://github.com/Homebrew/actions/pull/391 help your particular case?

basilisk487 commented 1 year ago

Would something like Homebrew/actions#391 help your particular case?

Perfect - this is exactly what I was looking for!

tri75 commented 1 year ago

Currently it looks like git clone works well, but it still fails from CircleCI.

Bo98 commented 1 year ago

Homebrew 4.0.27 contains a change for non-developers that skips homebrew-cask fetching if you don't have HOMEBREW_NO_INSTALL_FROM_API set. So most users running brew update on their local machine should see the error go away after one more brew update (which will update the update script for subsequent runs). Again, remember that you don't need to do brew tap homebrew/cask to install casks anymore under the default configuration.

This change does not apply to GitHub Actions as GitHub sets HOMEBREW_NO_INSTALL_FROM_API by default. The recommendation there for now continues to be to use Homebrew/actions/setup-homebrew@master (which now has a cask: false option if you don't use casks), or unset the env and brew untap homebrew/core homebrew/cask (note however for this approach that the env may be set again in subsequent steps, which could cause your workflow to become significantly slower if you use brew in subsequent steps).
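The second option above (unset the env, then untap) could look roughly like this in a CI shell step. A sketch only: the function name `use_homebrew_api` is hypothetical, and the check against the `brew tap` listing is an added precaution that is not part of the recommendation itself.

```shell
#!/bin/sh
# Sketch of the "unset the env and untap" option for a CI step.
# The function name is hypothetical; the brew commands are the ones
# named in the comment above.
use_homebrew_api() {
  # Stop forcing brew to install from local tap clones.
  unset HOMEBREW_NO_INSTALL_FROM_API
  # Untap only what is currently tapped (brew tap with no arguments
  # lists the installed taps, one per line).
  for tap in homebrew/core homebrew/cask; do
    if brew tap | grep -qx "$tap"; then
      brew untap "$tap"
    fi
  done
}

# Example (commented out so that sourcing this file has no side effects):
# use_homebrew_api
# brew install <formula-or-cask>
```

As noted above, the unset only affects the current shell, so a later step may see the variable set again.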

For other CI providers, like CircleCI, I'd like to hear more about your setup. What base image are you using? What commands are you running?

In terms of fixing the root cause of the underlying git issue, progress has been made with GitHub and hopefully there will be more to share soon.

Bo98 commented 1 year ago

GitHub has completed a rollout of a fix, and results so far indicate that the errors have stopped. I'll continue to monitor over the weekend.

Thanks to the git systems team at GitHub who have worked all week on tracking down and fixing the root cause.

We've also asked the GitHub Actions team to make some changes to their images that should eliminate the impact if this issue were to ever happen again. These changes will hopefully be rolled out soon.