microsoft / git

A fork of Git containing Microsoft-specific patches.
http://git-scm.com/

scalar: make GVFS Protocol a forced choice #648

Closed derrickstolee closed 4 months ago

derrickstolee commented 4 months ago

In the Office monorepo, we've recently had an uptick in issues with scalar clone. These issues didn't make sense at first: it looked as if the users weren't running microsoft/git but the upstream version's scalar clone. Instead of using GVFS cache servers, their clones were attempting to use the Git protocol's partial clone (which times out).

It turns out that what's actually happening is that some network issue is causing the connection with Azure DevOps to error out during the /gvfs/config request. In the Git traces, we see the following error during this request:

(curl:56) Failure when receiving data from the peer [transient]

This doesn't happen 100% of the time, but the failure rate has increased enough to cause problems for a variety of users.

The solution proposed in this pull request is to remove the fallback mechanism and instead make using the GVFS protocol an explicit choice. To avoid significant disruption to Azure DevOps customers (who, based on my understanding, make up the vast majority of microsoft/git users running scalar clone), I added logic that infers a default value from the clone URL.
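As a rough illustration of inferring a default from the clone URL, here is a minimal Python sketch. It is not the code from this pull request (which lives in Git's C sources). The host list and the function names `infer_gvfs_default` and `use_gvfs_protocol` are assumptions made for the example.

```python
from urllib.parse import urlparse

# Assumption: hosts treated as Azure DevOps for this sketch.
# The actual patch may use a different check.
ADO_HOST_SUFFIXES = ("dev.azure.com", "visualstudio.com")


def infer_gvfs_default(clone_url):
    """Guess whether the GVFS protocol should be the default for this URL.

    Only URLs with a scheme (such as https://) are handled; anything else
    yields False.
    """
    host = urlparse(clone_url).hostname or ""
    return any(
        host == suffix or host.endswith("." + suffix)
        for suffix in ADO_HOST_SUFFIXES
    )


def use_gvfs_protocol(clone_url, explicit=None):
    """An explicit choice always wins; otherwise infer from the URL."""
    if explicit is not None:
        return explicit
    return infer_gvfs_default(clone_url)
```

With this shape, an Azure DevOps URL defaults to the GVFS protocol, any other host defaults to not using it, and an explicit choice overrides the inference in either direction.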

This fallback mechanism was first implemented in the C# version of Scalar in microsoft/scalar#339. This was an attempt to make the Scalar client interesting to non-Azure DevOps customers, especially as GitHub was about to launch the availability of partial clones. Now that the scalar client is available upstream, users don't need the GVFS-enabled version to get these benefits.

In addition, this will resolve #384 since those requests won't happen against non-ADO URLs unless requested.

derrickstolee commented 4 months ago

This is a draft because I'm not sure how to feel about the change of behavior. My commit message is lacking, as well.

I was trying very hard to get this issue diagnosed and fixed before the 2.45.0 release, but I didn't have the necessary information until it reproduced on machines my team owns.

derrickstolee commented 4 months ago

While pairing with one of my engineers, we were able to isolate the reason different machines were having issues with the /gvfs/config endpoint: the http.sslBackend setting was different. Something in the Azure DevOps network stack changed recently such that the /gvfs/config endpoint stopped working with OpenSSL, while switching to schannel works. Users who already had schannel in their system config were fine, while the others had OpenSSL.
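To check which backend a given machine is using, the effective http.sslBackend value can be read with `git config --get`. The Python helper below is a small sketch of that check. The function name `ssl_backend` and its injectable `run` parameter are assumptions made so the example can be exercised without a Git installation.

```python
import subprocess


def ssl_backend(run=subprocess.run):
    """Return the effective http.sslBackend value, or None if it is unset.

    `git config --get` prints nothing when the key is not set, so an empty
    result is reported as None.
    """
    result = run(
        ["git", "config", "--get", "http.sslBackend"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip() or None
```

On a machine affected by the issue described above, this would be expected to report `openssl` (or nothing, if the build default applies) rather than `schannel`.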

This change is now less of an emergency, because the recent increase in failures is understood. It would still be good to merge this because:

  1. Users who expect the GVFS protocol should not fall back to partial clone.
  2. Users who want to use partial clone against ADO should be able to choose that option.
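
The two points above amount to treating the protocol as a forced choice rather than a best-effort attempt. The Python sketch below shows one way that decision could look; it is an illustration only, not the C implementation in this pull request. `pick_clone_mode`, `GvfsConfigError`, and the `fetch_gvfs_config` callback are hypothetical names.

```python
class GvfsConfigError(RuntimeError):
    """Raised when the GVFS protocol was selected but its config request failed."""


def pick_clone_mode(use_gvfs, fetch_gvfs_config):
    """Return "gvfs" or "partial" without ever silently switching modes.

    `fetch_gvfs_config` is a hypothetical callable standing in for the
    /gvfs/config request; it raises OSError on a network failure.
    """
    if not use_gvfs:
        # The user (or the inferred default) chose partial clone.
        return "partial"
    try:
        fetch_gvfs_config()
    except OSError as err:
        # No fallback: surface the failure instead of trying partial clone.
        raise GvfsConfigError("GVFS protocol requested but /gvfs/config failed") from err
    return "gvfs"
```

Under this scheme a transient failure during the config request produces a visible error that the user can retry, instead of a partial clone that later times out.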