jamietre / CsQuery

CsQuery is a complete CSS selector engine, HTML parser, and jQuery port for C# and .NET 4.

Resetting the selector on a CQ #137

Closed mcintyre321 closed 11 years ago

mcintyre321 commented 11 years ago

Hi, we're using CsQuery to create a post-rendering transformation pipeline using a Response Filter e.g.

                var doc = new CsQuery.CQ(response);
                foreach (Action<CQ> transform in transforms)
                {
                    doc = transform(doc);
                }                    
                output.Write(doc.Render());

where an example transform might be something like:

                cq => cq.Find(".actions-panel > a").AddClass("button");

The problem we are having is that when the CQ is passed into a second transform, it now represents the set of elements from the first transform's .Find, rather than being an empty selector representing the whole document. What's the best way to reset to the root/empty CQ? Is it this:

                foreach (var transform in filterContext.HttpContext.Items.DocTransforms())
                {
                    doc = transform(doc);
                    doc = new CQ(doc.Document);
                }
jamietre commented 11 years ago

I think the most straightforward solution is to just not re-assign doc to the value returned by the transform in each iteration.

            // I think you must be using Func<CQ,CQ> signature 
            // and not Action<CQ> in order to be getting a return from the transform

            foreach (Func<CQ,CQ> transform in transforms)
            {
                transform(doc);
            }    

Each CQ object which is spawned from a given root object refers to the same stateful Document, which can be mutated by an action and is analogous to document in a browser, and has a distinct Selection, which is analogous to a jQuery object. Methods generally alter the DOM, or alter the selection, or both. But any method which causes the selection to change will return a new CQ object.

So since your only goal here is transformation, and you don't care about the result of a selector outside of the individual transform you're doing, I don't see any reason to chain the output of each transformation. Just keep running each successive transform against the original object which will ensure that it starts with a clean selection.
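As a rough sketch of that approach, the pipeline could hold its transforms as Action<CQ> and never reassign the root object. The class and method names here (TransformPipeline, Apply) are made up for illustration and are not part of CsQuery; only CQ.Create, Find, AddClass and Render are library calls.

```csharp
using System;
using System.Collections.Generic;
using CsQuery;

// Hypothetical helper, not part of CsQuery.
public static class TransformPipeline
{
    public static string Apply(string html, IEnumerable<Action<CQ>> transforms)
    {
        CQ doc = CQ.Create(html);

        foreach (Action<CQ> transform in transforms)
        {
            // Whatever CQ the transform's selector calls return is discarded,
            // so doc keeps its original selection while the shared Document
            // still receives every mutation.
            transform(doc);
        }

        return doc.Render();
    }
}
```

A transform such as cq => cq.Find(".actions-panel > a").AddClass("button") can be stored as an Action<CQ>, because a lambda whose body is a method call is also valid where no return value is expected.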

At the same time, you could avoid this problem entirely, and probably get better performance, by using the default Select method instead of Find. Select works against the entire DOM no matter what; Find only queries within the descendants of the current selection. So if your intent is to query the whole document, just use Select.

cq => cq.Select(".actions-panel > a").AddClass("button");

or shorthand

cq => cq[".actions-panel > a"].AddClass("button");
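
To make the difference concrete, here is a small sketch that runs both methods from an already narrowed selection. The class name SelectVersusFind and its Count method are invented for this example; the behaviour assumed for Select is the one described above, namely that it always queries the whole document.

```csharp
using CsQuery;

// Hypothetical demo class, not part of CsQuery.
public static class SelectVersusFind
{
    // Returns { links found via Find, links found via Select }.
    public static int[] Count(string html)
    {
        CQ doc = CQ.Create(html);

        // A new CQ with a narrowed selection, backed by the same Document.
        CQ panel = doc.Select(".actions-panel");

        // Find searches only beneath the current selection.
        int viaFind = panel.Find("a").Length;

        // Select ignores the current selection and queries the whole document.
        int viaSelect = panel.Select("a").Length;

        return new[] { viaFind, viaSelect };
    }
}
```

With markup that has one link inside .actions-panel and another outside it, the two counts should differ, which is the same effect that caused the second transform in the question to see a narrowed set.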
mcintyre321 commented 11 years ago

Of course! Whoops, should have realised the original CQ would remain unchanged, thanks!