kopia / kopia

Cross-platform backup tool for Windows, macOS & Linux with fast, incremental backups, client-side end-to-end encryption, compression and data deduplication. CLI and GUI included.
https://kopia.io
Apache License 2.0

policy set --add-ignore does not maintain ordering, breaking some ignore rules #3814

Open Hakkin opened 5 months ago

Hakkin commented 5 months ago

When adding ignore rules to a policy via policy set, the order of the added rules is not maintained, and instead they are re-sorted alphabetically.

This breaks some ignore rules, because ignore rules are inherently order-dependent. Take for example: --add-ignore "/*" --add-ignore "!/data"

These rules should exclude all files and directories under the base path, and then negate the exclusion for /data (so everything but /data is ignored). This pattern works if included in a .kopiaignore file, but fails when added as an ignore policy. When used as an ignore policy, the rules are re-sorted alphabetically as:

!/data # Has no effect since there are no rules to negate yet
/*     # Excludes everything

If we instead do --add-ignore " /*" --add-ignore "!/data" (notice the space character before the first /), we can trick kopia into sorting this policy correctly, because the space character (0x20) sorts before ! (0x21), and indeed this does work.

 /*    # Excludes everything
!/data # Negates /data from the previous exclude
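
The two orderings above can be reproduced with a plain byte-wise string sort. This is only a minimal sketch of the sorting effect, not kopia's code; the helper name sortedCopy is made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedCopy is a hypothetical helper: it returns a byte-wise sorted copy
// of rules, which is the reordering effect described in this issue.
func sortedCopy(rules []string) []string {
	out := append([]string(nil), rules...)
	sort.Strings(out)
	return out
}

func main() {
	// '!' (0x21) sorts before '/' (0x2F), so the negation moves to the front.
	fmt.Printf("%q\n", sortedCopy([]string{"/*", "!/data"}))

	// ' ' (0x20) sorts before '!' (0x21), so the leading space keeps the
	// exclude rule ahead of the negation.
	fmt.Printf("%q\n", sortedCopy([]string{" /*", "!/data"}))
}
```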

This seems to be caused by how kopia applies policies: https://github.com/kopia/kopia/blob/a1ad8ce4422e52b95e17729f826ec5fa4fc0635e/cli/command_policy_set.go#L150-L153

The policies are first loaded into a map, so they are inherently unordered.

After applying the added and removed policies, they are then appended back to a slice and sorted: https://github.com/kopia/kopia/blob/a1ad8ce4422e52b95e17729f826ec5fa4fc0635e/cli/command_policy_set.go#L170-L175

This explains the alphabetic sorting.
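
As a rough illustration of the map-then-sort flow described above, here is a simplified sketch. It is not the actual kopia source: the function name applyStringListViaMap and its signature are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// applyStringListViaMap is a hypothetical simplification of the flow
// described above: load the current entries into a map, apply additions
// and removals, then rebuild a slice and sort it.
func applyStringListViaMap(current, add, remove []string) []string {
	entries := map[string]bool{}
	for _, v := range current {
		entries[v] = true
	}
	for _, v := range add {
		entries[v] = true
	}
	for _, v := range remove {
		delete(entries, v)
	}

	// Map iteration order is unspecified in Go, so the original order is
	// already lost here; sorting only makes the result deterministic.
	result := make([]string, 0, len(entries))
	for k := range entries {
		result = append(result, k)
	}
	sort.Strings(result)
	return result
}

func main() {
	// The rules were added as "/*" then "!/data", but come back reversed.
	fmt.Printf("%q\n", applyStringListViaMap(nil, []string{"/*", "!/data"}, nil))
}
```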

It is possible to correct this by running kopia policy edit and reordering the ignore array manually, but the order will revert to alphabetic sorting if you ever modify the policy again.

I'm not sure if this would be considered a bug, but it seems like very unintuitive behavior. It's not exactly clear how it could be fixed nicely either. Obviously the applyPolicyStringList function could be updated to do the entire processing over a slice to maintain order. The performance of this would be worse than with a map, but unless people have millions of policy rules that is likely irrelevant. This would fix the issue, but it creates a new one: there would be no easy way to re-order the rules, since all policies would be append-only. You could manually edit the policy file, or remove and re-add all of the policies manually, but neither of those is a great option.
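
A minimal sketch of what an order-preserving variant might look like is below. It is an assumption-laden illustration, not a proposed patch: the name applyStringListOrdered and its signature are hypothetical, and it still uses small maps for duplicate and removal lookups while keeping the slice as the source of ordering.

```go
package main

import "fmt"

// applyStringListOrdered is a hypothetical order-preserving variant:
// existing entries keep their position, new entries are appended in the
// order given, and removed entries are dropped.
func applyStringListOrdered(current, add, remove []string) []string {
	removed := make(map[string]bool, len(remove))
	for _, r := range remove {
		removed[r] = true
	}

	seen := make(map[string]bool, len(current)+len(add))
	result := make([]string, 0, len(current)+len(add))

	for _, list := range [][]string{current, add} {
		for _, v := range list {
			if removed[v] || seen[v] {
				continue // skip removed entries and duplicates
			}
			seen[v] = true
			result = append(result, v)
		}
	}
	return result
}

func main() {
	rules := applyStringListOrdered(nil, []string{"/*", "!/data"}, nil)
	fmt.Printf("%q\n", rules)

	// Re-adding an existing rule does not move it, which shows the
	// append-only limitation mentioned above.
	rules = applyStringListOrdered(rules, []string{"/*", "!/logs"}, nil)
	fmt.Printf("%q\n", rules)
}
```

With this approach, the only way to move a rule is to remove it and add it again in a separate step, which is the re-ordering drawback described above.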

It seems like the current individual policy rules don't work well for bulk order-dependent settings like ignore rules. It would be nice to have a way to attach something like a .kopiaignore file to a policy, where the rules are stored and processed as a group in order, without actually having to have the physical .kopiaignore file in the directory.

NiklausHofer commented 1 week ago

I can confirm this bug. We are having the exact same issue as @Hakkin :(