jamietre / CsQuery

CsQuery is a complete CSS selector engine, HTML parser, and jQuery port for C# and .NET 4.

Parent.InnerText Also Returns Child.InnerText? #168

Closed: wjchristenson2 closed this issue 9 years ago

wjchristenson2 commented 9 years ago

The following occurs w/ CsQuery 1.3.5-beta5 (Prerelease).

This did not occur in earlier versions of CsQuery. A node's InnerText property is now returning the InnerText for child nodes as well. See below.

li.InnerText is returning "Stock #:2575" instead of "2575".

CsQuery:

string stock = string.Empty;

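// string.Contains has no StringComparison overload on .NET 4, so use IndexOf for a case-insensitive match.
var label = myCQ["label"].FirstOrDefault(i => i.InnerText.IndexOf("Stock #:", StringComparison.OrdinalIgnoreCase) >= 0);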
if (label != null)
{
    var li = label.ParentNode;
    stock = (li == null || string.IsNullOrWhiteSpace(li.InnerText) ? string.Empty : li.InnerText);
}

return stock;

Dom:

<li><label>Stock #:</label>2575</li>

Is this the new intended behavior? The difference between .innerText and .textContent has always been a mess across browsers. If it is intended, can we get a list of which "text-based" tags have their values included when the .InnerText string is built?

jamietre commented 9 years ago

This was actually a deliberate change; see the change log here (near the bottom of "1.4 (prerelease)"):

https://github.com/jamietre/CsQuery/tree/master/source

InnerText isn't actually a real W3C property, but I thought it was useful because of the way it handles whitespace. The fact that it did not return the text of child nodes was always a bug. If you want the value of just a single text node, then the right way is to access the NodeValue property of that text node.
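For anyone landing here with the same markup, the NodeValue approach might look roughly like the sketch below. It is not taken from the CsQuery docs: the class name StockExample and the helper DirectText are made up for illustration, and it assumes the CsQuery package is referenced. It gathers only the text nodes that are direct children of the label's parent, so text inside the label itself is skipped.

```csharp
using System;
using System.Linq;
using CsQuery;

// Hypothetical helper class, named here only for illustration.
public static class StockExample
{
    // Returns the text sitting directly inside the parent of the matching <label>,
    // ignoring text that belongs to child elements such as the <label> itself.
    public static string DirectText(string html, string labelText)
    {
        CQ dom = CQ.Create(html);

        // Case-insensitive search for the label, as in the original post.
        IDomObject label = dom["label"].FirstOrDefault(
            i => i.InnerText.IndexOf(labelText, StringComparison.OrdinalIgnoreCase) >= 0);

        if (label == null || label.ParentNode == null)
        {
            return string.Empty;
        }

        // Keep only the text nodes that are direct children of the parent element.
        var parts = label.ParentNode.ChildNodes
            .Where(n => n.NodeType == NodeType.TEXT_NODE)
            .Select(n => n.NodeValue);

        return string.Concat(parts).Trim();
    }

    public static void Main()
    {
        Console.WriteLine(DirectText("<li><label>Stock #:</label>2575</li>", "Stock #:"));
    }
}
```

If the value is always the node right after the label, reading label.NextSibling.NodeValue would be a shorter alternative, but filtering on NodeType also copes with whitespace-only or multiple text nodes.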

wjchristenson2 commented 9 years ago

Thanks for the clarification. I will mark as closed.