gitgitgadget / git

GitGitGadget's Git fork. Open Pull Requests here to submit them to the Git mailing list
https://gitgitgadget.github.io/
Other
208 stars 135 forks source link

scalar: avoid segfault in reconfigure --all #1724

Closed derrickstolee closed 3 months ago

derrickstolee commented 4 months ago

Update 'scalar reconfigure --all' to be more careful with the_repository pointer, avoiding sudden halts in some scenarios.


I noticed this while validating the v2.45.0 release (specifically the microsoft/git fork, but this applies to the core project).

Thanks, Patrick, for digging in and finding the critical reasons why this issue can happen. Patrick ACKed the sign-off in v2.

v3 includes an update to the commit message. Sorry that I missed this paragraph when updating for v2.

-Stolee

cc: Patrick Steinhardt ps@pks.im

derrickstolee commented 4 months ago

/submit

gitgitgadget[bot] commented 4 months ago

Submitted as pull.1724.git.1714496333004.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1724/derrickstolee/scalar-reconfigure-repo-handle-v1

To fetch this version to local tag pr-1724/derrickstolee/scalar-reconfigure-repo-handle-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1724/derrickstolee/scalar-reconfigure-repo-handle-v1
gitgitgadget[bot] commented 4 months ago

This branch is now known as ds/scalar-reconfigure-all-fix.

gitgitgadget[bot] commented 4 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/de933952430c2e652e504b8c4a683c171a867775.

gitgitgadget[bot] commented 4 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/85f6920d50d995b6214aa88e5038775d78026fc5.

gitgitgadget[bot] commented 4 months ago

There was a status update in the "New Topics" section about the branch ds/scalar-reconfigure-all-fix on the Git mailing list:

Scalar fix.

Comments?
source: <pull.1724.git.1714496333004.gitgitgadget@gmail.com>
teketemdn commented 4 months ago

This patch series was integrated into seen via git@85f6920.

``

Esta serie de parches se integró en visto a través de git@ de93395 .

gitgitgadget[bot] commented 4 months ago

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Apr 30, 2024 at 04:58:52PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <stolee@gmail.com>
> 
> During the latest v2.45.0 update, 'scalar reconfigure --all' started to
> segfault on my machine. Breaking it down via the debugger, it was
> faulting on a NULL reference to the_hash_algo, which is a macro pointing
> to the_repository->hash_algo.
> 
> This NULL reference appears to be due to the way the loop is abusing the
> the_repository pointer, pointing it to a local repository struct after
> discovering that the current directory is a valid Git repository. This
> repo-swapping bit was in the original implementation from 4582676075
> (scalar: teach 'reconfigure' to optionally handle all registered
> enlistments, 2021-12-03), but only recently started segfaulting while
> trying to parse the HEAD reference. This also only happens on the
> _second_ repository in the list, so does not reproduce if there is only
> one registered repo.

Interesting. This also has some overlap with my patch series that aims
to drop the default-SHA1 fallback that we have in place for
`the_repository` [1].

> Add a test to t9210-scalar.sh to test 'scalar reconfigure --all' with
> multiple registered repos, as a precaution. Unfortunately, I was unable
> to reproduce the segfault using this test, so there is some coverage
> left to be desired. What exactly causes my setup to hit this bug but not
> this test structure? Unclear.

One way to trigger _a_ BUG is to use an "includeIf.onbranch:foobar.path"
config. This will cause Git to try and look up the "HEAD" reference, but
because we have a partially-configured repository, only, that will crash
with:

    BUG: refs.c:2123: reference backend is unknown

Whether that bug is the one that you have seen I cannot tell. It at
least does not sound like it.

    test_expect_success 'scalar reconfigure --all with includeIf.onbranch' '
        repos="two three four" &&
        for num in $repos
        do
            git init $num/src &&
            scalar register $num/src &&
            git -C $num/src config includeif."onbranch:foo".path something &&
            git -C $num/src config core.preloadIndex false || return 1
        done &&

        scalar reconfigure --all &&

        for num in $repos
        do
            test true = "$(git -C $num/src config core.preloadIndex)" || return 1
        done
    '

Another issue, which is likely the one you report here, is if you have a
repository with detached "HEAD". In that case we use `get_oid_hex()` to
parse "HEAD", which will implicitly use `the_hash_algo`. But because you
unset it in the second round this will fail with a segfault when you try
to access it:

    ./test-lib.sh: line 1069: 583995 Segmentation fault      (core dumped) scalar reconfigure --all

This is the following testcase:

    test_expect_success 'scalar reconfigure --all with detached HEADs' '
        repos="two three four" &&
        for num in $repos
        do
            rm -rf $num/src &&
            git init $num/src &&
            scalar register $num/src &&
            git -C $num/src config core.preloadIndex false &&
            test_commit -C $num/src initial &&
            git -C $num/src switch --detach HEAD || return 1
        done &&

        scalar reconfigure --all &&

        for num in $repos
        do
            test true = "$(git -C $num/src config core.preloadIndex)" || return 1
        done
    '

This issue should be fixed by my patch series in [1] because we start to
use `get_oid_hex_any()` to parse detached HEADs.

Anyway, your fix is indeed effective because with `repo_init()` we now
properly configure the repository.

[1]: https://lore.kernel.org/git/cover.1714371422.git.ps@pks.im/

Patrick
gitgitgadget[bot] commented 4 months ago

User Patrick Steinhardt <ps@pks.im> has been added to the cc: list.

gitgitgadget[bot] commented 4 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/3a6be86bde3ca80b3d175498c6d8beaf9e2207f2.

gitgitgadget[bot] commented 4 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/7885ec712fd3a3942682bf4e87874019202ef3f1.

gitgitgadget[bot] commented 4 months ago

There was a status update in the "Cooking" section about the branch ds/scalar-reconfigure-all-fix on the Git mailing list:

Scalar fix.

Comments?
source: <pull.1724.git.1714496333004.gitgitgadget@gmail.com>
gitgitgadget[bot] commented 4 months ago

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 5/2/24 2:46 AM, Patrick Steinhardt wrote:
> On Tue, Apr 30, 2024 at 04:58:52PM +0000, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <stolee@gmail.com>
>>
>> During the latest v2.45.0 update, 'scalar reconfigure --all' started to
>> segfault on my machine. Breaking it down via the debugger, it was
>> faulting on a NULL reference to the_hash_algo, which is a macro pointing
>> to the_repository->hash_algo.
>>
>> This NULL reference appears to be due to the way the loop is abusing the
>> the_repository pointer, pointing it to a local repository struct after
>> discovering that the current directory is a valid Git repository. This
>> repo-swapping bit was in the original implementation from 4582676075
>> (scalar: teach 'reconfigure' to optionally handle all registered
>> enlistments, 2021-12-03), but only recently started segfaulting while
>> trying to parse the HEAD reference. This also only happens on the
>> _second_ repository in the list, so does not reproduce if there is only
>> one registered repo.
> Interesting. This also has some overlap with my patch series that aims
> to drop the default-SHA1 fallback that we have in place for
> `the_repository` [1].

Thanks for this pointer. It indeed will help.

>> Add a test to t9210-scalar.sh to test 'scalar reconfigure --all' with
>> multiple registered repos, as a precaution. Unfortunately, I was unable
>> to reproduce the segfault using this test, so there is some coverage
>> left to be desired. What exactly causes my setup to hit this bug but not
>> this test structure? Unclear.
> One way to trigger _a_ BUG is to use an "includeIf.onbranch:foobar.path"
> config. This will cause Git to try and look up the "HEAD" reference, but
> because we have a partially-configured repository, only, that will crash
> with:
>
>      BUG: refs.c:2123: reference backend is unknown

This is a good extra find. After your explanation for the second

test, I'm confident that I was hitting the detached HEAD case on

my machine.

I will shamelessly steal your tests in my v2.

> This issue should be fixed by my patch series in [1] because we start to
> use `get_oid_hex_any()` to parse detached HEADs.
>
> Anyway, your fix is indeed effective because with `repo_init()` we now
> properly configure the repository.

I appreciate that your series will fix this in a ref-focused way. I think

this change could prevent other bad interactions with the_repository in

the future.

Thanks,

-Stolee
derrickstolee commented 4 months ago

/submit

gitgitgadget[bot] commented 4 months ago

Submitted as pull.1724.v2.git.1714874298859.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1724/derrickstolee/scalar-reconfigure-repo-handle-v2

To fetch this version to local tag pr-1724/derrickstolee/scalar-reconfigure-repo-handle-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1724/derrickstolee/scalar-reconfigure-repo-handle-v2
gitgitgadget[bot] commented 4 months ago

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Sun, May 05, 2024 at 01:58:18AM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <stolee@gmail.com>
> 
> During the latest v2.45.0 update, 'scalar reconfigure --all' started to
> segfault on my machine. Breaking it down via the debugger, it was
> faulting on a NULL reference to the_hash_algo, which is a macro pointing
> to the_repository->hash_algo.
> 
> This NULL reference appears to be due to the way the loop is abusing the
> the_repository pointer, pointing it to a local repository struct after
> discovering that the current directory is a valid Git repository. This
> repo-swapping bit was in the original implementation from 4582676075
> (scalar: teach 'reconfigure' to optionally handle all registered
> enlistments, 2021-12-03), but only recently started segfaulting while
> trying to parse the HEAD reference. This also only happens on the
> _second_ repository in the list, so does not reproduce if there is only
> one registered repo.

I think this explanation could use an update now that we have figured
out the root cause likely being a detached HEAD, where `get_oid_hex()`
will then access a NULL pointer.

> My first inclination was to try to refactor cmd_reconfigure() to execute
> 'git for-each-repo' instead of this loop. In addition to the difficulty
> of executing 'scalar reconfigure' within 'git for-each-repo', it would
> be difficult to perform the clean-up logic for non-existent repos if we
> relied on that child process.
> 
> Instead, I chose to move the temporary repo to be within the loop and
> reinstate the_repository to its old value after we are done performing
> logic on the current array item.
> 
> Add tests to t9210-scalar.sh to test 'scalar reconfigure --all' with
> multiple registered repos. There are two different ways that the old
> use of the_repository could trigger bugs. These issues are being solved
> independently to be more careful about the_repository being
> uninitialized, but the change in this patch around the use of
> the_repository is still a good safety precaution.
> 
> Co-authored-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: Derrick Stolee <stolee@gmail.com>
> ---
>     scalar: avoid segfault in reconfigure --all
>     
>     Update 'scalar reconfigure --all' to be more careful with the_repository
>     pointer, avoiding sudden halts in some scenarios.
>     
>     ------------------------------------------------------------------------
>     
>     I noticed this while validating the v2.45.0 release (specifically the
>     microsoft/git fork, but this applies to the core project).
>     
>     Thanks, Patrick, for digging in and finding the critical reasons why
>     this issue can happen. I've included Patrick's test cases and given him
>     co-authorship. I forged his sign-off, so could you please ACK that
>     sign-off, Patrick?
>     
>     -Stolee

Other than the above nit regarding the root cause analysis this patch
looks good to me, and I'm fine with the forged SOB. Thanks!

Patrick
gitgitgadget[bot] commented 4 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/2754234e89949b6c7a8d62fa055490a470d0ac44.

gitgitgadget[bot] commented 4 months ago

There was a status update in the "Cooking" section about the branch ds/scalar-reconfigure-all-fix on the Git mailing list:

Scalar fix.

Expecting a final (and hopefully small) reroll.
cf. <Zjhufiq2DmBlVpIx@tanuki>
source: <pull.1724.v2.git.1714874298859.gitgitgadget@gmail.com>
gitgitgadget[bot] commented 4 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/7eebfa61e1c19b2777c763e3c58b1e285e85ac75.

gitgitgadget[bot] commented 4 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/21e3564715c4d906cb88980d6a75baf4537e376f.

derrickstolee commented 4 months ago

/submit

gitgitgadget[bot] commented 4 months ago

Submitted as pull.1724.v3.git.1715126749391.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1724/derrickstolee/scalar-reconfigure-repo-handle-v3

To fetch this version to local tag pr-1724/derrickstolee/scalar-reconfigure-repo-handle-v3:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1724/derrickstolee/scalar-reconfigure-repo-handle-v3
gitgitgadget[bot] commented 4 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/8ad0227e93eaac885bfcc94b78fb71007bd3c519.

gitgitgadget[bot] commented 4 months ago

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, May 08, 2024 at 12:05:49AM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <stolee@gmail.com>
> 
> During the latest v2.45.0 update, 'scalar reconfigure --all' started to
> segfault on my machine. Breaking it down via the debugger, it was
> faulting on a NULL reference to the_hash_algo, which is a macro pointing
> to the_repository->hash_algo.
> 
> In my case, this is due to one of my repositories having a detached HEAD,
> which requires get_oid_hex() to parse that the HEAD reference is valid.
> Another way to cause a failure is to use the "includeIf.onbranch" config
> key, which will lead to a BUG() statement.
> 
> My first inclination was to try to refactor cmd_reconfigure() to execute
> 'git for-each-repo' instead of this loop. In addition to the difficulty
> of executing 'scalar reconfigure' within 'git for-each-repo', it would
> be difficult to perform the clean-up logic for non-existent repos if we
> relied on that child process.
> 
> Instead, I chose to move the temporary repo to be within the loop and
> reinstate the_repository to its old value after we are done performing
> logic on the current array item.
> 
> Add tests to t9210-scalar.sh to test 'scalar reconfigure --all' with
> multiple registered repos. There are two different ways that the old
> use of the_repository could trigger bugs. These issues are being solved
> independently to be more careful about the_repository being
> uninitialized, but the change in this patch around the use of
> the_repository is still a good safety precaution.
> 
> Co-authored-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: Derrick Stolee <stolee@gmail.com>

Thanks, this version looks good to me!

Patrick
gitgitgadget[bot] commented 4 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/b368d899640f493f8a72a4c07cbb1e1adc7a1739.

gitgitgadget[bot] commented 4 months ago

This patch series was integrated into next via https://github.com/git/git/commit/eca398f4a5d449c40ec949574750443b15a6d254.

gitgitgadget[bot] commented 4 months ago

There was a status update in the "Cooking" section about the branch ds/scalar-reconfigure-all-fix on the Git mailing list:

Scalar fix.

Will merge to 'master'.
source: <pull.1724.v3.git.1715126749391.gitgitgadget@gmail.com>
gitgitgadget[bot] commented 4 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/7b44f63c7920d41a286be3a38178eebfd26bdd73.

gitgitgadget[bot] commented 4 months ago

There was a status update in the "Cooking" section about the branch ds/scalar-reconfigure-all-fix on the Git mailing list:

Scalar fix.

Will merge to 'master'.
source: <pull.1724.v3.git.1715126749391.gitgitgadget@gmail.com>
gitgitgadget[bot] commented 3 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/1838ac75ef3c649f4f551829d4d3385cb1666d68.

gitgitgadget[bot] commented 3 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/260a89195b2673e5a898b7b8eb045178a9008a6e.

gitgitgadget[bot] commented 3 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/0cff3423c467614398de00369a50bf0eefd82a33.

gitgitgadget[bot] commented 3 months ago

There was a status update in the "Cooking" section about the branch ds/scalar-reconfigure-all-fix on the Git mailing list:

Scalar fix.

Will merge to 'master'.
source: <pull.1724.v3.git.1715126749391.gitgitgadget@gmail.com>
gitgitgadget[bot] commented 3 months ago

This patch series was integrated into seen via https://github.com/git/git/commit/1e00d22ec514fdb1056314da4e8a4e2ce2d53ddd.

gitgitgadget[bot] commented 3 months ago

This patch series was integrated into master via https://github.com/git/git/commit/1e00d22ec514fdb1056314da4e8a4e2ce2d53ddd.

gitgitgadget[bot] commented 3 months ago

This patch series was integrated into next via https://github.com/git/git/commit/1e00d22ec514fdb1056314da4e8a4e2ce2d53ddd.

gitgitgadget[bot] commented 3 months ago

Closed via 1e00d22ec514fdb1056314da4e8a4e2ce2d53ddd.