ggshield.core.filter.leak_dictionary_by_ignore_sha() takes a list of policy breaks and returns a dict of sha => list[policy_breaks]. To do so it sorts the policy breaks, and the matches within the policy breaks. Since it sorts in-place, it modifies the input policy breaks, which is surprising and actually not necessary. Especially when the function is called in places where the caller only cares about the number of different shas.
This PR removes all sorting.
What has been done
First commit renames the function to group_policy_breaks_by_ignore_sha().
Second commit removes the sorting, and adjusts the tests so that they pass. This uncovered some holes in the tests, such as test_json_output() only checking indices of the first secret, and actually only the first match of said secret.
Validation
All tests pass.
PR check list
[x] As much as possible, the changes include tests (unit and/or functional)
[x] If the changes affect the end user (new feature, behavior change, bug fix) then the PR has a changelog entry (see doc/dev/getting-started.md). If the changes do not affect the end user, then the skip-changelog label has been added to the PR.
Context
ggshield.core.filter.leak_dictionary_by_ignore_sha()
takes a list of policy breaks and returns a dict of sha => list[policy_breaks]. To do so it sorts the policy breaks, and the matches within the policy breaks. Since it sorts in-place, it modifies the input policy breaks, which is surprising and actually not necessary. Especially when the function is called in places where the caller only cares about the number of different shas.This PR removes all sorting.
What has been done
First commit renames the function to
group_policy_breaks_by_ignore_sha()
.Second commit removes the sorting, and adjusts the tests so that they pass. This uncovered some holes in the tests, such as
test_json_output()
only checking indices of the first secret, and actually only the first match of said secret.Validation
All tests pass.
PR check list
skip-changelog
label has been added to the PR.