publicsuffix / list

The Public Suffix List
https://publicsuffix.org/
Mozilla Public License 2.0

tools/internal/parser: add diff support #2071

Closed danderson closed 3 months ago

danderson commented 3 months ago

List.SetBaseVersion takes an older base List, and makes Block.Changed() reflect whether or not each block has changed compared to that base.

2 commits: one is the refactor to introduce the blockInfo struct, the other is diff.go and tests.

This is a building block that sets change information on a parsed list, which validations can then use to ignore unchanged elements. Generalized tree diff is an open academic research problem, so instead of doing that, this implementation takes some shortcuts by exploiting some special properties of the PSL, and by accepting some false positives (we mark as changed some things that didn't change, or that changed in a way that doesn't matter for validation) in exchange for simpler code and no false negatives.
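To illustrate the trade-off described above, here is a minimal sketch of "mark anything we can't prove unchanged as changed". It is not the code from this PR: `toyBlock`, `toyList`, and the `Key`/`Text` fields are hypothetical stand-ins for the real parser types, and only the method names `SetBaseVersion` and `Changed` are borrowed from the description. The sketch assumes each block has a stable identity key, which is the kind of PSL-specific shortcut that avoids a general tree diff.

```go
package main

import "fmt"

// toyBlock is a hypothetical, simplified stand-in for a parsed PSL block.
// The real parser types in tools/internal/parser are richer than this.
type toyBlock struct {
	Key     string // assumed stable identity, e.g. a suffix or section name
	Text    string // raw source text of the block
	changed bool
}

// Changed reports whether the block differs from the base version.
// It returns false until SetBaseVersion has been called.
func (b *toyBlock) Changed() bool { return b.changed }

// toyList is a hypothetical flat list of blocks.
type toyList struct {
	Blocks []*toyBlock
}

// SetBaseVersion marks every block in l as changed unless a block with the
// same key and identical text exists in base. A purely cosmetic edit is
// therefore reported as changed (a false positive), but a real edit is never
// reported as unchanged (no false negatives).
func (l *toyList) SetBaseVersion(base *toyList) {
	old := make(map[string]string, len(base.Blocks))
	for _, b := range base.Blocks {
		old[b.Key] = b.Text
	}
	for _, b := range l.Blocks {
		prev, ok := old[b.Key]
		b.changed = !ok || prev != b.Text
	}
}

func main() {
	base := &toyList{Blocks: []*toyBlock{
		{Key: "example", Text: "example"},
		{Key: "co.example", Text: "co.example"},
	}}
	cur := &toyList{Blocks: []*toyBlock{
		{Key: "example", Text: "example"},
		{Key: "co.example", Text: "co.example // comment edited"},
		{Key: "new.example", Text: "new.example"},
	}}
	cur.SetBaseVersion(base)

	// A validation pass can now skip blocks that did not change.
	for _, b := range cur.Blocks {
		if !b.Changed() {
			continue
		}
		fmt.Println("needs validation:", b.Key)
	}
}
```

In this sketch the comment-only edit to `co.example` is flagged even though it may not matter for validation, which is the kind of false positive the description accepts in exchange for simpler code.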

I ran this logic against the last 50 PSL changes; here are the git diffs vs. changed parse tree nodes. Some noteworthy illustrations of the semantics:

simon-friedberger commented 3 months ago

I merged your other PR. What is this conflict about, between the versions with and without the toolchain directive?

danderson commented 3 months ago

That's an odd conflict, it seems like git should be able to resolve that without help. :shrug: I rebased this PR onto master and the conflict disappeared.

But actually, looking at it more closely, I'm not sure why the toolchain directive got added. The code requires Go 1.22 for a few stdlib features, but forcing a minimum patch version of the toolchain is unnecessary. I suspect my emacs/LSP machinery added it at some point during an automated edit. I added a commit to this PR to remove the toolchain directive and rely only on the go 1.22 directive.
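For readers unfamiliar with the two directives, a go.mod along these lines is roughly what the change amounts to. This is an illustrative fragment, not the file from this PR: the module path is hypothetical, and the patch version in the commented-out toolchain line is a placeholder, since the thread does not say which version had been pinned.

```text
// Hypothetical module path, for illustration only.
module example.com/psl-tools

// Minimum Go version required by the code (needed for a few stdlib features).
go 1.22

// Removed: a line of this form pinned a minimum toolchain patch release.
// The "x" is a placeholder; the actual version is not given in the thread.
// toolchain go1.22.x
```

With only the `go` directive present, any Go 1.22 or newer toolchain can build the module without being asked for a specific patch release.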