jamietre / CsQuery

CsQuery is a complete CSS selector engine, HTML parser, and jQuery port for C# and .NET 4.
Other
1.15k stars 249 forks source link

Queries on clones share state #139

Closed joelverhagen closed 11 years ago

joelverhagen commented 11 years ago

I have tested this issue against the 1.3.5-beta5 code on NuGet as well as the HEAD revision.

When you use CQ.Clone() followed by a query, the two results seem to share state. To clarify what I mean, take a look at this code:

CQ a = "<br>";
CQ b = a.Clone();

a["br"].FirstElement().Name = "a";
b["br"].FirstElement().Name = "b";

Console.WriteLine(a["br"].FirstElement().Name);
Console.WriteLine(b["br"].FirstElement().Name);

I would expect the output to be:

a
b

Unfortunately, the output is:

b
b

Clone itself seems to be working fine. For example, if I simply access the first element without a query, the output is as expected:

CQ a = "<br>";
CQ b = a.Clone();

a.FirstElement().Name = "a";
b.FirstElement().Name = "b";

Console.WriteLine(a.FirstElement().Name);
Console.WriteLine(b.FirstElement().Name);

Is this expected behavior?

joelverhagen commented 11 years ago

I dug a little deeper and noticed that cloned CQ objects shared parents (via CQ.CsQueryParent) which causes queries to always result on selections on the original document. The relevant line is in Clone.cs.

I changed the line to

CQ csq = NewCqUnbound();

This solved that problem as well as leaving all unit tests as passing. What was the rational behind clones sharing parents?

jamietre commented 11 years ago

There are a couple things to note here but first and foremost, I can't reproduce the actual problem you are describing here, which is changes to a cloned object affecting the original object. This test passes for me:

[Test]
public void Issue139() {
        CQ a = "<br>";
        CQ b = a.Clone();

        a.FirstElement().Name = "a";
        b.FirstElement().Name = "b";

        Assert.AreEqual("a",a.FirstElement().Name);
        Assert.AreEqual("b",b.FirstElement().Name);
}

But the behavior of a cloned object's selections operating against the parent DOM is by design and expected. The reason has to do with the relationship between a CQ object and a single DOM. In jQuery, you never have to worry about what DOM a selection of the sort $('div') operates against. There's only one (at least per browser context). In CsQuery, there could be any number of them. When you create new CQ objects from a given Document context, it always inherits the context of its owner. The only time a new DOM is created is when using one of the CQ.Create overloads.

In jQuery, when you clone something, it exists outside the actual DOM (they are disconnected elements) and there's no notion of running a CSS selector against a cloned hierarchy. You can use Find, or course, and you can in CsQuery too.

But in CsQuery, because we can actually specify which DOM we run a selector against, since there's an arbitrary number of them, the point here was to ensure that the relationship between cloned elements and their DOM was the same as in jQuery. This connection to its source also ensures that methods like End() work properly - which they could not, if a clone had no knowledge of it's originator.

If you actually want to create a new DOM context from a set of elements, you could do this instead of cloning:

var newDom = CQ.Create(a);

CQ.Create always creates a completely new object with no relationship to the source elements. This actually just clones the source and adds them to a new document (CreateNewFragment does the work for Create(IEnumerable<IDomElement>):

    protected void CreateNewFragment(IEnumerable<IDomObject> elements)
    {
        Document = DomDocument.Create(elements.Clone(), HtmlParsingMode.Fragment);
        AddSelection(Document.ChildNodes);
    }
joelverhagen commented 11 years ago

This clarifies the issue. I was confused about the purpose CQ.Clone. CQ.Create was the precise functionality that I needed. Thanks.

Closed.