libgit2 / libgit2sharp

Git + .NET = ❤
http://libgit2.github.com
MIT License

Git clone causes an "SSL error: syscall failure:" with a specific repo #2116

Open chaoscode opened 1 month ago

chaoscode commented 1 month ago

We are going to have to drop this library because of this issue and move to using the CLI instead. I figured I'd report it so that maybe someone can get around to fixing it in the future. The issue has been reported before and has been open for over 4 years.

With a specific private repo, this error happens only on Alpine Linux in a container running as an ACA Job in Azure. Authentication uses a Personal Access Token with the username "default".

"[Message] SSL error: syscall failure: "

No other logging or error message.

Reproduction steps

This is extremely difficult to reproduce. I don't even know what it is about this specific repo that causes the problem, as other repos owned by the same company, using the same token, clone fine.

If I run it locally in Docker Desktop, with everything else being equal, it does not crash.

I see others have reported the same issue and the bug has been open for years.

https://github.com/libgit2/libgit2sharp/issues/1262

Running the same user/token and URL from the cli git client works without issues.

git clone https://{user}:{token}@github.com/path/to/repo

I'm doing some digging to see if I can get more info about why it's happening.

I was able to get a detailed stacktrace from the service.

        [Exception]: [07/31/2024 23:56:31]: (/_/LibGit2Sharp/Core/Ensure.cs):(154)
        [Exception]: [07/31/2024 23:56:31]: (/_/LibGit2Sharp/Core/Ensure.cs):(172)
        [Exception]: [07/31/2024 23:56:31]: (/_/LibGit2Sharp/Core/Proxy.cs):(278)
        [Exception]: [07/31/2024 23:56:31]: (/_/LibGit2Sharp/Repository.cs):(824)

at LibGit2Sharp.Core.Proxy.git_clone(String url, String workdir, GitCloneOptions& opts) in /_/LibGit2Sharp/Core/Proxy.cs:line 278
at LibGit2Sharp.Repository.Clone(String sourceUrl, String workdirPath, CloneOptions options) in /_/LibGit2Sharp/Repository.cs:line 824

Here is some feedback. The logging isn't very good in the function that checks the response code from the code repository. If we could see the code that is being returned (and it doesn't seem to be HTTP OK) then we would have a better understanding of what the issue is.

This morning (8/1/2024) I implemented a cli version of git and it had no issues cloning from the ACA Job on the specific repo.
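The CLI workaround described above could look roughly like the following. This is a minimal sketch, not the reporter's actual implementation: the function names and the `GIT_USER` / `GIT_TOKEN` environment variables are hypothetical, and it assumes the token contains no characters that need URL-encoding.

```shell
#!/bin/sh
# Build the authenticated HTTPS URL in the same shape as the command in this
# report. All three arguments are placeholders supplied by the caller.
build_clone_url() {
    user=$1
    token=$2
    repo_path=$3
    printf 'https://%s:%s@github.com/%s\n' "$user" "$token" "$repo_path"
}

# Clone with the git CLI instead of LibGit2Sharp.
# GIT_USER and GIT_TOKEN are hypothetical variable names for this sketch.
# Note that the token ends up in the git process arguments.
clone_with_cli() {
    repo_path=$1
    dest=$2
    git clone "$(build_clone_url "$GIT_USER" "$GIT_TOKEN" "$repo_path")" "$dest"
}
```

With the two variables exported, `clone_with_cli path/to/repo /tmp/work` would run the same clone the reporter ran by hand.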

Expected behavior

The repo clones

Actual behavior

The library crashes

Version of LibGit2Sharp (release number or SHA1)

0.30

Operating system(s) tested; .NET runtime tested

Alpine Linux .NET 8.0

ethomson commented 1 month ago

> This is extremely difficult to reproduce. I don't even know what it is about this specific repo that causes the problem, as other repos owned by the same company, using the same token, clone fine.

This is in the same environment? To make sure that I understand: you have some app running as an Azure Container App in an Alpine container, and it generally works and can clone most repositories from dev.azure.com, but there's one repository that doesn't work. In addition, this app can clone all repositories locally, including the problematic one. Is that an accurate description?

> If we could see the code that is being returned (and it doesn't seem to be HTTP OK) then we would have a better understanding of what the issue is.

This isn't an HTTP protocol-level problem. It may be a TLS-level problem, though I suspect not; "syscall failure" is a message coming from OpenSSL. It's some interaction between OpenSSL and Alpine's libc or kernel.

What version of Alpine are you using? What version of OpenSSL? Can you share your Dockerfile?
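One way to gather the details requested here from inside the container is sketched below. It assumes a POSIX shell; the function name `collect_env_info` is made up for this example, and the `openssl` CLI may be absent from a slim image, so each probe falls back to "unknown" rather than failing.

```shell
#!/bin/sh
# Print the Alpine release, OpenSSL version, TLS-related packages, and kernel.
# Each probe degrades gracefully so the script also runs on non-Alpine hosts.
collect_env_info() {
    # Alpine records its release number in /etc/alpine-release.
    if [ -r /etc/alpine-release ]; then
        echo "alpine-release: $(cat /etc/alpine-release)"
    else
        echo "alpine-release: unknown"
    fi

    # The openssl command-line tool is optional in many images.
    if command -v openssl >/dev/null 2>&1; then
        echo "openssl: $(openssl version 2>/dev/null)"
    else
        echo "openssl: unknown (CLI not installed)"
    fi

    # When apk is available, list installed TLS-related packages with versions.
    if command -v apk >/dev/null 2>&1; then
        apk info -v 2>/dev/null \
            | grep -i -E 'openssl|libssl|libcrypto' \
            | sed 's/^/apk: /' || true
    fi

    # The kernel is the host's, which may differ between ACA and a local machine.
    echo "kernel: $(uname -r)"
}

collect_env_info
```

Running it both in the failing ACA Job and in the working local container would allow a side-by-side comparison of the two environments.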

MattGal commented 2 weeks ago

I have seen this problem reproduce across various Linux machines when calling libgit2 via Rust's https://crates.io/crates/git2 wrapper. It doesn't fail 100% of the time, but it was bad enough to make us stop running on Linux.

The exact same code works as expected on Windows. If there's anything to try to work around this, I'd love to hear it.
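Nothing in this thread confirms a fix. For failures that are intermittent, as described here, one generic stopgap is to retry the failing operation a few times. The sketch below wraps an arbitrary command (for example a CLI `git clone`); the `retry` helper is hypothetical and does not address the underlying OpenSSL error.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# Returns 0 on success, otherwise the exit status of the last attempt.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        status=0
        "$@" || status=$?
        if [ "$status" -eq 0 ]; then
            return 0
        fi
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts (exit $status)" >&2
            return "$status"
        fi
        echo "attempt $attempt failed (exit $status); retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

A call such as `retry 3 5 git clone "$url" "$dest"` would make up to three attempts, five seconds apart. A partially created destination directory may need to be removed between attempts.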

ethomson commented 2 weeks ago

Hi @MattGal - do you have any more details offhand?

Is it always the same repo or different repos? Has it ever happened with a public repo or always private? GitHub or somewhere else? Authenticated or anonymous?

MattGal commented 2 weeks ago

> Is it always the same repo or different repos?

I hit this while maintaining an app that pulls from and pushes to ~30 distinct repos across GitHub and Azure DevOps. There isn't any obvious correlation between specific repos hitting it, but it does seem like, maybe, more changes being pushed leads to the problem happening more often.

> Has it ever happened with a public repo or always private? GitHub or somewhere else? Authenticated or anonymous?