AngleSharp / AngleSharp.Diffing

A library that makes it possible to compare two AngleSharp node lists and get a list of differences between them.
MIT License
37 stars 6 forks source link

More diffs. #36

Closed SebastianStehle closed 1 year ago

SebastianStehle commented 1 year ago

As discussed in #35

SebastianStehle commented 1 year ago

Oh, damn. I hate the merge conflicts from file scoped using :D

SebastianStehle commented 1 year ago

I have kept the names to reduce the number of changes. Unfortunately the Diff result is not typed, therefore I added a few asserts. Generics would work, but would also look ugly, e.g.

return CompareResult<NodeDiff>.Same;

So if you want to have more typesafety I would introduce concrete results like

NodeCompareResult and AttrCompareResult.

SebastianStehle commented 1 year ago

@egil Ready for your review. I think it should build. There is only a timeout with nuget. I have experienced this often recently, but I cannot retry the build.

egil commented 1 year ago

It's a reasonably big PR, so I must set aside time outside of work/family time to go through it. Hopefully, I can get it done in a few days.

SebastianStehle commented 1 year ago

Ofc. Thanks for the update

SebastianStehle commented 1 year ago

Btw: The changes are very repetitive.

  1. Add ToString() to all Diffs.
  2. New Diffs
  3. Return new diffs in comparers
  4. Adapt tests (I have to make 2 asserts, because IDiff does not implement Equals) 5.Unfortunately: I also changed 2 structs to records to simplify the code a little bit (should have made that in a custom PR).
egil commented 1 year ago

Notes. And thanks for your efforts with this. Ill get back to you ASAP.

SebastianStehle commented 1 year ago

Hi,

how is it going?

egil commented 1 year ago

Sorry, been hit with COVID so been completely out of commission the last few days. I'll take a look soon.

SebastianStehle commented 1 year ago

Oh, that sucks. I have gone through that. Get well soon.

egil commented 1 year ago

Have you changed your mind @SebastianStehle or did you close the PR by accident?

SebastianStehle commented 1 year ago

No, it was by accident :D

SebastianStehle commented 1 year ago

I have not made the diffs record, because of the Equals of AttributeComparisonSource. I was not sure, if equals is correct (especially IAttr) and if we can make it a record as well. If we make everything records, we can get rid of ToString()

egil commented 1 year ago

I have not made the diffs record, because of the Equals of AttributeComparisonSource. I was not sure, if equals is correct (especially IAttr) and if we can make it a record as well. If we make everything records, we can get rid of ToString()

If the tests doesn't break, then it should be ok.

SebastianStehle commented 1 year ago

I changed everything to records. There is more potential for simplifications, but I would not extend this PR even more, e.g.

  1. Convert all other structs to records.
  2. Get rid of IDiff and Introduce non generic DiffBase (then you have the guarantee that equals is always implemented properly).
  3. Get rid of IComparisonSource, it is not used.
egil commented 1 year ago

I have some additional changes. Can you give me push permissions to your branch so I can push them directly instead of having to write out each change as a comment/suggestion?

SebastianStehle commented 1 year ago

LGTM.

egil commented 1 year ago

Ahh no, did notice another thing I wanted to change. Instead of having a NodeClosingDiff types and NodeTypeDiff types, I think it is more correct to have an ElementDiff type with a Kind property, like we do with AttrDiff. Check the latest push. What do you think @SebastianStehle?

egil commented 1 year ago

What we lose is the node types are different. I do think that is actually obvious without having a specific diff type for that particular scenario.

SebastianStehle commented 1 year ago

What is the difference between element and node? I am not sure about the wording here.

I would also remove the Unspecified values for the enums, it is a little bit weird to have an unspecified diff

egil commented 1 year ago

What is the difference between element and node? I am not sure about the wording here.

A node is the base type for all types of content in a DOM, e.g. a comment or text, or element is a node. An element is a more specific thing.

I would also remove the Unspecified values for the enums, it is a little bit weird to have an unspecified diff

The "Unspecified" option is used in HtmlDifferenceEngine when a CompareResult is different, but no diff is returned. Then we new up a diff type and set its kind to unspecified.

egil commented 1 year ago

I also added a better default output, similar to what records will do, to ComparisonSource. That improves the output, e.g. the default options do not match up different types of nodes, but if we force it with this:

var control = "control";
var test = "<p>test</p>";

var diffs = DiffBuilder
    .Compare(control)
    .WithTest(test)
    .WithOptions(o => o.AddOneToOneNodeMatcher().AddElementComparer(false))
    .Build()
    .ToArray();

There will be a single diff in the diffs result, whose ToString call yields:

"{NodeDiff { Control = { Type = Text, Index = 0, Path = #text(0) }, Test = { Type = Element, Index = 0, Path = p(0) }, Target = Text, Result = Different }}"
SebastianStehle commented 1 year ago

Why have you not changed ComparisonSource to a record struct?

egil commented 1 year ago

Why have you not changed ComparisonSource to a record struct?

The Equals and GetHashCode do not align with what a record type does. We have a different way to do the equals methods than what records will do, so that's why they are not records.

egil commented 1 year ago

Anyway, if you do not have any more comments, ill merge this tomorrow and push a preview release out.

SebastianStehle commented 1 year ago

No, everything is fine

egil commented 1 year ago

Great. Thanks for your efforts with this.